Java: Why would iterating over a List be faster than indexing through it?

Reading the Java documentation for the List ADT, it says:

The List interface provides four methods for positional (indexed) access to list elements. Lists (like Java arrays) are zero based. Note that these operations may execute in time proportional to the index value for some implementations (the LinkedList class, for example). Thus, iterating over the elements in a list is typically preferable to indexing through it if the caller does not know the implementation.

What exactly does this mean? I don't understand the conclusion that is drawn.


In a linked list, each element has a pointer to the next element:

head -> item1 -> item2 -> item3 -> etc.

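To access item3, you can see clearly that you have to walk from the head through every node until you reach item3, since you cannot jump to it directly.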

So if I want to print the value of each element and I write this:

for(int i = 0; i < 4; i++) {
    System.out.println(list.get(i));
}

what happens is this:

head -> print head
head -> item1 -> print item1
head -> item1 -> item2 -> print item2
head -> item1 -> item2 -> item3 -> print item3

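This is horribly inefficient, because every indexed access restarts from the beginning of the list and walks through every item before the one you asked for. It means your complexity is effectively O(N^2) just to traverse the list.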

If instead I write this:

for(String s: list) {
    System.out.println(s);
}

then what happens is this:

head -> print head -> item1 -> print item1 -> item2 -> print item2 etc.

all in a single traversal, which is O(N).

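Now, going to the other implementation of List, which is ArrayList: that one is backed by a simple array. In that case both of the above traversals are equivalent, since an array is contiguous and so allows random jumps to arbitrary positions.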
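To make the difference concrete, here is a minimal sketch of a toy singly linked list that counts how many links it follows. The class `HopCountingList` and its helper methods are hypothetical names invented for this illustration; this is not how `java.util.LinkedList` is implemented (that class is doubly linked and can start from either end), but the growth pattern of the two loops above is the same idea.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Toy singly linked list (illustration only) that counts followed links. */
public class HopCountingList implements Iterable<String> {
    private static final class Node {
        final String value;
        Node next;
        Node(String value) { this.value = value; }
    }

    private Node head;
    private Node tail;
    private int size;
    long hops; // number of "next" links followed so far

    public void add(String value) {
        Node node = new Node(value);
        if (head == null) { head = node; } else { tail.next = node; }
        tail = node;
        size++;
    }

    public int size() { return size; }

    /** Indexed access: every call restarts from the head. */
    public String get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        Node current = head;
        for (int i = 0; i < index; i++) {
            current = current.next;
            hops++;
        }
        return current.value;
    }

    /** Iterator: remembers its position, so each step follows one link. */
    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private Node current = head;
            @Override public boolean hasNext() { return current != null; }
            @Override public String next() {
                if (current == null) throw new NoSuchElementException();
                String value = current.value;
                current = current.next;
                hops++;
                return value;
            }
        };
    }

    private static HopCountingList filled(int n) {
        HopCountingList list = new HopCountingList();
        for (int i = 0; i < n; i++) list.add("item" + i);
        return list;
    }

    /** Links followed when visiting all n elements with get(i). */
    public static long hopsWithIndexing(int n) {
        HopCountingList list = filled(n);
        for (int i = 0; i < list.size(); i++) list.get(i);
        return list.hops;
    }

    /** Links followed when visiting all n elements with the iterator. */
    public static long hopsWithIterator(int n) {
        HopCountingList list = filled(n);
        for (String s : list) { /* visit s */ }
        return list.hops;
    }

    public static void main(String[] args) {
        System.out.println(hopsWithIndexing(1000)); // 0 + 1 + ... + 999 = 499500
        System.out.println(hopsWithIterator(1000)); // one step per element = 1000
    }
}
```

The indexed loop follows N(N-1)/2 links in this sketch, while the iterator follows N, which is the O(N^2) versus O(N) gap described above.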


The answer is implied here:

Note that these operations may execute in time proportional to the index value for some implementations (the LinkedList class, for example)

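A linked list doesn't have an inherent index; calling .get(x) requires the list implementation to find the first entry and call .next() roughly x times (O(n), or linear-time, access), whereas an array-backed list can simply index into backingarray[x] in O(1), or constant, time.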

If you look at the JavaDoc for LinkedList, you'll see the comment

All of the operations perform as could be expected for a doubly-linked list. Operations that index into the list will traverse the list from the beginning or the end, whichever is closer to the specified index.

whereas the JavaDoc for ArrayList has the corresponding

Resizable-array implementation of the List interface. Implements all optional list operations, and permits all elements, including null. In addition to implementing the List interface, this class provides methods to manipulate the size of the array that is used internally to store the list. (This class is roughly equivalent to Vector, except that it is unsynchronized.)

The size, isEmpty, get, set, iterator, and listIterator operations run in constant time. The add operation runs in amortized constant time, that is, adding n elements requires O(n) time. All of the other operations run in linear time (roughly speaking). The constant factor is low compared to that for the LinkedList implementation.
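When the caller does not know which implementation it has been handed, one possible approach is to check for the `java.util.RandomAccess` marker interface, which `ArrayList` implements and `LinkedList` does not. The sketch below is only an illustration; the class name `AccessCheck` and its methods are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

public class AccessCheck {
    /** True if the list advertises fast get(i) via the RandomAccess marker. */
    public static boolean hasFastIndexing(List<?> list) {
        return list instanceof RandomAccess;
    }

    /** Sums a list, indexing only when the list says indexing is cheap. */
    public static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            // Array-backed lists: get(i) is constant time.
            for (int i = 0, n = list.size(); i < n; i++) {
                total += list.get(i);
            }
        } else {
            // Anything else (e.g. LinkedList): let the iterator keep its place.
            for (int value : list) {
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> arrayList = new ArrayList<>(Arrays.asList(1, 2, 3));
        List<Integer> linkedList = new LinkedList<>(Arrays.asList(1, 2, 3));
        System.out.println(hasFastIndexing(arrayList));  // true
        System.out.println(hasFastIndexing(linkedList)); // false
        System.out.println(sum(arrayList) + " " + sum(linkedList)); // 6 6
    }
}
```

In everyday code the plain enhanced for loop is usually enough; the explicit check mainly matters for library code that has to index into lists it does not control.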

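A related question titled "Big-O Summary for Java Collections Framework" has an answer pointing to this resource, "Java Collections JDK6", which you might find helpful.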


Iterating over a list with an offset for the lookup, such as i, is analogous to Shlemiel the painter's algorithm.

Shlemiel gets a job as a street painter, painting the dotted lines
down the middle of the road. On the first day he takes a can of paint
out to the road and finishes 300 yards of the road. "That's pretty
good!" says his boss, "you're a fast worker!" and pays him a kopeck.

The next day Shlemiel only gets 150 yards done. "Well, that's not
nearly as good as yesterday, but you're still a fast worker. 150 yards
is respectable," and pays him a kopeck.

The next day Shlemiel paints 30 yards of the road. "Only 30!" shouts
his boss. "That's unacceptable! On the first day you did ten times
that much work! What's going on?"

"I can't help it," says Shlemiel. "Every day I get farther and farther
away from the paint can!"

Source.

This little story may make it easier to understand what is going on internally and why it is so inefficient.


While the accepted answer is most certainly correct, may I point out a minor flaw? Quoting Tudor:

Now, going to the other implementation of List which is ArrayList,
that one is backed by a simple array. In that case both of the above
traversals are equivalent, since an array is contiguous so it allows
random jumps to arbitrary positions.

This is not completely true. The truth is that

With an ArrayList, a hand-written counted loop is about 3x faster

Source: Designing for Performance, Google's Android documentation

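Note that the hand-written loop refers to indexed iteration. I suspect the difference comes from the iterator that is used with enhanced for loops: it introduces a minor performance penalty for a structure backed by a contiguous array. I also suspect the same may hold for the Vector class.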

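My rule is: use the enhanced for loop whenever possible, and if you really care about performance, use indexed iteration only for ArrayLists or Vectors. In most cases you can even ignore this, since the compiler may be optimizing it in the background.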

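I merely want to point out that, in the context of Android development, the two traversals of an ArrayList are not necessarily equivalent. Just food for thought.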
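For reference, here is a small sketch of the loop shapes being compared on an ArrayList. An enhanced for loop over a List goes through the list's Iterator, which is what the third method spells out by hand. The class and method names are hypothetical, and the sketch only shows that the loops compute the same result; it makes no claim about their relative speed on any particular runtime.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ArrayListLoops {
    /** Hand-written counted loop: indexes directly into the list. */
    public static long countedLoop(List<Integer> list) {
        long total = 0;
        for (int i = 0, n = list.size(); i < n; i++) {
            total += list.get(i);
        }
        return total;
    }

    /** Enhanced for loop: obtains an Iterator from the list behind the scenes. */
    public static long enhancedFor(List<Integer> list) {
        long total = 0;
        for (int value : list) {
            total += value;
        }
        return total;
    }

    /** Roughly what the enhanced for loop does, written out explicitly. */
    public static long explicitIterator(List<Integer> list) {
        long total = 0;
        for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
            total += it.next();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        System.out.println(countedLoop(list));      // 10
        System.out.println(enhancedFor(list));      // 10
        System.out.println(explicitIterator(list)); // 10
    }
}
```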


To find the i-th element of a LinkedList, the implementation goes through all the elements up to the i-th.

So

for(int i = 0; i < list.size(); i++ ) {
    Object something = list.get(i); //Slow for LinkedList
}
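If the loop body needs the position as well as the element, one option (sketched here under the assumption that this is why the index was wanted) is a `ListIterator`, whose `nextIndex()` reports the position without restarting the walk. The class name `IndexedTraversal` and the method `label` are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

public class IndexedTraversal {
    /** Pairs each element with its index in one pass, without calling get(i). */
    public static List<String> label(List<String> list) {
        List<String> labelled = new ArrayList<>();
        ListIterator<String> it = list.listIterator();
        while (it.hasNext()) {
            int index = it.nextIndex(); // position of the element next() will return
            String value = it.next();
            labelled.add(index + ": " + value);
        }
        return labelled;
    }

    public static void main(String[] args) {
        List<String> list = new LinkedList<>(Arrays.asList("a", "b", "c"));
        System.out.println(label(list)); // [0: a, 1: b, 2: c]
    }
}
```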