C#: LINQ Count vs. IList Count

Suppose I have the following IEnumerable list from some repository:

IEnumerable<SomeObject> items = _someRepo.GetAll();

Which is faster?

items.Count(); // Using Linq on the IEnumerable interface.

or

List<SomeObject> temp = items.ToList<SomeObject>(); // Copy into a new List (not a cast)
temp.Count(); // Do a count on the list

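Is the LINQ Count() faster or slower than converting the IEnumerable to a List and then calling Count()?

Update: Improved the question slightly to reflect a more realistic scenario.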

Related discussion

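Calling Count() directly is the better choice.

Enumerable.Count has some performance improvements built in that let it return without enumerating the entire collection: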

public static int Count<TSource>(this IEnumerable<TSource> source) {
    if (source == null) throw Error.ArgumentNull("source");
    ICollection<TSource> collectionoft = source as ICollection<TSource>;
    if (collectionoft != null) return collectionoft.Count;
    ICollection collection = source as ICollection;
    if (collection != null) return collection.Count;
    int count = 0;
    using (IEnumerator<TSource> e = source.GetEnumerator()) {
        checked {
            while (e.MoveNext()) count++;
        }
    }
    return count;
}
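
To see this fast path for yourself, here is a minimal sketch. `TrackingCollection<T>` and `CountFastPathDemo` are hypothetical names made up for this illustration; the wrapper simply records how often `GetEnumerator()` is called, so you can check whether `Count()` walked the sequence.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper: an ICollection<T> that records how often it is enumerated.
public sealed class TrackingCollection<T> : ICollection<T>
{
    private readonly List<T> _inner;

    public TrackingCollection(IEnumerable<T> items) { _inner = new List<T>(items); }

    // Number of times GetEnumerator() has been called on this instance.
    public int EnumerationCount { get; private set; }

    public int Count { get { return _inner.Count; } }
    public bool IsReadOnly { get { return true; } }
    public void Add(T item) { throw new NotSupportedException(); }
    public void Clear() { throw new NotSupportedException(); }
    public bool Remove(T item) { throw new NotSupportedException(); }
    public bool Contains(T item) { return _inner.Contains(item); }
    public void CopyTo(T[] array, int arrayIndex) { _inner.CopyTo(array, arrayIndex); }

    public IEnumerator<T> GetEnumerator()
    {
        EnumerationCount++;
        return _inner.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class CountFastPathDemo
{
    // Returns how many enumerations were triggered by a single Count() call.
    public static int EnumerationsCausedByCount()
    {
        var tracked = new TrackingCollection<int>(new[] { 1, 2, 3 });
        IEnumerable<int> items = tracked;   // only the IEnumerable<T> view is visible here
        int n = items.Count();              // resolves to Enumerable.Count
        if (n != 3) throw new InvalidOperationException("unexpected count");
        return tracked.EnumerationCount;
    }
}
```

If the runtime follows the logic shown above, `CountFastPathDemo.EnumerationsCausedByCount()` should report zero enumerations, because the `ICollection<T>` check answers the call before any enumerator is created.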

ToList() uses a similar optimization, baked into the List<T>(IEnumerable<T> collection) constructor:

public List(IEnumerable<T> collection) {
    if (collection == null)
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.collection);
    Contract.EndContractBlock();

    ICollection<T> c = collection as ICollection<T>;
    if (c != null) {
        int count = c.Count;
        if (count == 0)
        {
            _items = _emptyArray;
        }
        else {
            _items = new T[count];
            c.CopyTo(_items, 0);
            _size = count;
        }
    }
    else {
        _size = 0;
        _items = _emptyArray;
        // This enumerable could be empty. Let Add allocate a new array, if needed.
        // Note it will also go to _defaultCapacity first, not 1, then 2, etc.

        using (IEnumerator<T> en = collection.GetEnumerator()) {
            while (en.MoveNext()) {
                Add(en.Current);
            }
        }
    }
}

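But as you can see, it only checks for the generic ICollection<T>. So if your collection implements the non-generic ICollection but not its generic version, calling Count() directly will be a lot faster.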
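
As a rough illustration of that difference, the sketch below uses a made-up type, `LegacyBag<T>`, that implements `IEnumerable<T>` and the non-generic `ICollection` but deliberately not `ICollection<T>`. The type and the `NonGenericCollectionDemo` helper are assumptions for this example only. The behavior described follows the reference source quoted above and may differ on other runtime versions.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: IEnumerable<T> plus the NON-generic ICollection only.
public sealed class LegacyBag<T> : IEnumerable<T>, ICollection
{
    private readonly T[] _items;

    public LegacyBag(params T[] items) { _items = items; }

    // Number of times GetEnumerator() has been called on this instance.
    public int EnumerationCount { get; private set; }

    // Non-generic ICollection members.
    public int Count { get { return _items.Length; } }
    public bool IsSynchronized { get { return false; } }
    public object SyncRoot { get { return this; } }
    public void CopyTo(Array array, int index) { _items.CopyTo(array, index); }

    public IEnumerator<T> GetEnumerator()
    {
        EnumerationCount++;
        return ((IEnumerable<T>)_items).GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class NonGenericCollectionDemo
{
    // Enumerations triggered by calling Count() directly.
    public static int EnumerationsAfterCount()
    {
        var bag = new LegacyBag<int>(1, 2, 3);
        IEnumerable<int> items = bag;
        items.Count();                 // answered by the non-generic ICollection check
        return bag.EnumerationCount;
    }

    // Enumerations triggered by ToList() followed by reading Count.
    public static int EnumerationsAfterToList()
    {
        var bag = new LegacyBag<int>(1, 2, 3);
        IEnumerable<int> items = bag;
        int n = items.ToList().Count;  // List<T> only checks ICollection<T>, so it enumerates
        if (n != 3) throw new InvalidOperationException("unexpected count");
        return bag.EnumerationCount;
    }
}
```

With the logic shown above, the direct `Count()` call should not enumerate the bag at all, while the `ToList()` route has to walk every element once.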

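Skipping the ToList() call also saves you the allocation of a new List<T> instance. That is not overly expensive, but it is always better to avoid unnecessary allocations when you can.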

Related discussion

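A very simple LINQPad test shows that calling IEnumerable.Count() is faster than creating a list and getting its count, not to mention more memory efficient (as noted in other answers). It is also faster when revisiting a collection that has already been enumerated.

Calling Count() on the IEnumerable averaged ~4 ticks for me, versus ~10K ticks for creating a new list just to get the count.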

void Main()
{
    IEnumerable<string> ienumerable = GetStrings();
    var test1 = new Stopwatch();
    test1.Start();
    var count1 = ienumerable.Count();
    test1.Stop();
    test1.ElapsedTicks.Dump();

    var test2 = new Stopwatch();
    test2.Start();
    var count2 = ienumerable.ToList().Count;
    test2.Stop();
    test2.ElapsedTicks.Dump();

    var test3 = new Stopwatch();
    test3.Start();
    var count3 = ienumerable.Count();
    test3.Stop();
    test3.ElapsedTicks.Dump();
}

public IEnumerable<string> GetStrings()
{
    var testString = "test";
    var strings = new List<string>();
    for (int i = 0; i < 500000; i++)
    {
        strings.Add(testString);
    }

    return strings;
}

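In the second case, you pay for building a new collection from the existing one (under the hood, every element has to be copied over), and only then read the Count property. So the Enumerable optimization wins and returns the count faster.

In the third test run, the average dropped to ~2 ticks, because Count() returns the collection's existing count right away through the checks shown below.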

ICollection<TSource> collectionoft = source as ICollection<TSource>;
if (collectionoft != null) return collectionoft.Count;
ICollection collection = source as ICollection;
if (collection != null) return collection.Count;

However, the real cost here is not CPU cycles but memory consumption. That is what you should be more concerned about.

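Finally, a word of warning: take care not to call Count() while you are enumerating a collection. Doing so re-enumerates the collection, which can cause conflicts. If you need the count while iterating, the right approach is to create a new list with .ToList(), iterate over that list, and refer to its Count property.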
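
A small sketch of that advice, assuming a lazy source. `SnapshotDemo`, `LazyNumbers`, and the `SourceRuns` counter are hypothetical names used only to make the re-enumeration visible; a real source might be a database query or a file read.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SnapshotDemo
{
    // Counts how many times the lazy source below has started running.
    public static int SourceRuns;

    // Hypothetical lazy source: the iterator body runs again on every enumeration.
    public static IEnumerable<int> LazyNumbers()
    {
        SourceRuns++;
        for (int i = 1; i <= 3; i++) yield return i;
    }

    // Recommended: take one snapshot, iterate it, and read the Count property.
    public static List<string> DescribeWithSnapshot()
    {
        List<int> snapshot = LazyNumbers().ToList(); // the only enumeration of the source
        var lines = new List<string>();
        foreach (int n in snapshot)
        {
            lines.Add(n + " of " + snapshot.Count);  // property read, no re-enumeration
        }
        return lines;
    }

    // For contrast: Count() inside the loop restarts the lazy source on every pass.
    public static List<string> DescribeWithRepeatedCount()
    {
        IEnumerable<int> items = LazyNumbers();
        var lines = new List<string>();
        foreach (int n in items)
        {
            lines.Add(n + " of " + items.Count());   // full re-enumeration each time
        }
        return lines;
    }
}
```

Both methods produce the same lines, but the snapshot version runs the source once, while the second version runs it once for the loop and once more for each element. With a source whose contents can change between runs, the second version could also report inconsistent counts.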

Either version requires (in the general case) fully iterating your IEnumerable.

In some cases, the backing type provides a mechanism for determining the count directly, which gives O(1) performance. See @Marcin's answer for details.
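
If you want to know up front whether the count is cheap, one option is to run the same interface checks yourself. The helper below is a hypothetical sketch (`CheapCount.TryGetCheapCount` is not a framework API). Newer runtimes (.NET 6 and later) ship `Enumerable.TryGetNonEnumeratedCount` for a similar purpose, which is likely the better choice where it is available.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class CheapCount
{
    // Hypothetical helper: returns true and sets count only when the count
    // can be read without enumerating the source.
    public static bool TryGetCheapCount<T>(IEnumerable<T> source, out int count)
    {
        if (source == null) throw new ArgumentNullException("source");

        ICollection<T> generic = source as ICollection<T>;
        if (generic != null) { count = generic.Count; return true; }

        ICollection nonGeneric = source as ICollection;
        if (nonGeneric != null) { count = nonGeneric.Count; return true; }

        count = 0;
        return false; // the caller would have to enumerate to find out
    }

    // Hypothetical lazy sequence used to exercise the fallback branch.
    public static IEnumerable<int> LazyNumbers()
    {
        for (int i = 0; i < 5; i++) yield return i;
    }
}
```

A `List<T>` or an array should take the first branch, while an iterator such as `CheapCount.LazyNumbers()` falls through to `false`, which signals that counting it would cost a full pass.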

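The version that calls ToList() will have some additional CPU overhead, although it is very small and probably hard to measure. It will also allocate memory that would not otherwise be allocated. If your count is high, that is the bigger concern.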