Performance surprise with “as” and nullable types
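I'm just revising chapter 4 of C# in Depth, which deals with nullable types, and I'm adding a section about using the "as" operator, which allows you to write: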
    object o = ...;
    int? x = o as int?;
    if (x.HasValue)
    {
        ... // Use x.Value in here
    }
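I thought this was really neat, and that it could improve performance over the C# 1 equivalent of using "is" followed by a cast. After all, this way we only need to ask for a dynamic type check once, followed by a simple value check.
This appears not to be the case, however. I've included a sample test app below, which basically sums all the integers within an object array, but the array contains a lot of null references and string references as well as boxed integers. The benchmark measures the code you'd have to use in C# 1, the code using the "as" operator, and, just for kicks, a LINQ solution. To my astonishment, the C# 1 code is 20 times faster in this case, and even the LINQ code (which I'd have expected to be slower, given the iterators involved) beats the "as" code.
For nullable types, [...]
Results: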
Cast: 10000000 : 121
As: 10000000 : 2211
LINQ: 10000000 : 2143
Code:
    using System;
    using System.Diagnostics;
    using System.Linq;

    class Test
    {
        const int Size = 30000000;

        static void Main()
        {
            object[] values = new object[Size];
            for (int i = 0; i < Size - 2; i += 3)
            {
                values[i] = null;
                values[i+1] = "";
                values[i+2] = 1;
            }

            FindSumWithCast(values);
            FindSumWithAs(values);
            FindSumWithLinq(values);
        }

        static void FindSumWithCast(object[] values)
        {
            Stopwatch sw = Stopwatch.StartNew();
            int sum = 0;
            foreach (object o in values)
            {
                if (o is int)
                {
                    int x = (int) o;
                    sum += x;
                }
            }
            sw.Stop();
            Console.WriteLine("Cast: {0} : {1}", sum,
                              (long) sw.ElapsedMilliseconds);
        }

        static void FindSumWithAs(object[] values)
        {
            Stopwatch sw = Stopwatch.StartNew();
            int sum = 0;
            foreach (object o in values)
            {
                int? x = o as int?;
                if (x.HasValue)
                {
                    sum += x.Value;
                }
            }
            sw.Stop();
            Console.WriteLine("As: {0} : {1}", sum,
                              (long) sw.ElapsedMilliseconds);
        }

        static void FindSumWithLinq(object[] values)
        {
            Stopwatch sw = Stopwatch.StartNew();
            int sum = values.OfType<int>().Sum();
            sw.Stop();
            Console.WriteLine("LINQ: {0} : {1}", sum,
                              (long) sw.ElapsedMilliseconds);
        }
    }
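Clearly the machine code the JIT compiler can generate for the first case is much more efficient. One rule that really helps there is that an object can only be unboxed to a variable that has the same type as the boxed value. That allows the JIT compiler to generate very efficient code, since no value conversions have to be considered.
The is operator test is easy: just check that the object isn't null and is of the expected type, which takes only a few machine code instructions. The cast is also easy: the JIT compiler knows the location of the value bits in the object and uses them directly. No copying or conversion occurs, all machine code is inline, and it takes only about a dozen instructions. This needed to be really efficient back in .NET 1.0, when boxing was common.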
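The same-type unboxing rule described above can be seen in a few lines. This is only an illustrative sketch: the class and method names (`UnboxRuleSketch`, `CanUnboxToLong`) are made up for the example and do not come from the thread.

```csharp
using System;

static class UnboxRuleSketch
{
    // Hypothetical helper: returns true if 'boxed' can be unboxed straight to long.
    public static bool CanUnboxToLong(object boxed)
    {
        try
        {
            // Unboxing does not convert: a boxed int is not a boxed long,
            // so this line throws InvalidCastException for a boxed int.
            long value = (long)boxed;
            return true;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }

    static void Main()
    {
        object boxedInt = 1;    // boxed System.Int32
        object boxedLong = 1L;  // boxed System.Int64

        Console.WriteLine(CanUnboxToLong(boxedInt));   // False
        Console.WriteLine(CanUnboxToLong(boxedLong));  // True

        // Unbox to the exact type first, then convert the value.
        Console.WriteLine((long)(int)boxedInt);        // 1
    }
}
```

Because the runtime never has to consider a conversion here, the unbox for `(int)o` can stay a type check plus a direct read of the value.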
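Casting to int? takes a lot more work. The value representation of the boxed integer [...]
The Linq OfType() extension method also uses the is operator and the cast. This is, however, a cast to a generic type. The JIT compiler generates a call to a helper function, JIT_Unbox(), that can perform a cast to an arbitrary value type. I don't have a great explanation for why it is as slow as the cast to Nullable<int>, given that less work ought to be necessary. I suspect that ngen.exe might cause trouble here.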
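To make the "is operator plus cast to a generic type" shape concrete, here is a minimal sketch of an OfType-style iterator. `FilterByType` is a hypothetical stand-in written for this example, not the framework's actual source; it is compared against the real `Enumerable.OfType<T>()` only to show that the results agree.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

static class OfTypeSketch
{
    // Hypothetical stand-in for Enumerable.OfType<T>: an iterator that applies
    // the "is" test and then casts to the generic type parameter T.
    public static IEnumerable<T> FilterByType<T>(IEnumerable source)
    {
        foreach (object item in source)
        {
            if (item is T)
            {
                // For a value type T this is an unbox to a generic type.
                yield return (T)item;
            }
        }
    }

    static void Main()
    {
        object[] values = { null, "", 1, null, "x", 2 };

        Console.WriteLine(FilterByType<int>(values).Sum());  // 3
        Console.WriteLine(values.OfType<int>().Sum());       // 3
    }
}
```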
It seems to me that [...]
    [...]
to
    [...]
which also significantly slows down execution. The only difference I can see is that
    isinst [mscorlib]System.Int32
becomes
    isinst valuetype [mscorlib]System.Nullable`1<int32>
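This originally started out as a comment on Hans Passant's excellent answer, but it got too long, so I want to add a few bits here:
First, the C# [...]
Here is [...]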
Format: isinst typeTok
typeTok is a metadata token (a typeref, typedef or typespec), indicating the desired class. If typeTok is a non-nullable value type or a generic parameter type it is interpreted as "boxed" typeTok.
If typeTok is a nullable type, Nullable<T>, it is interpreted as "boxed" T.
Most importantly:
If the actual type (not the verifier tracked type) of obj is verifier-assignable-to the type typeTok then isinst succeeds and obj (as result) is returned unchanged while verification tracks its type as typeTok. Unlike coercions (§1.6) and conversions (§3.27), isinst never changes the actual type of an object and preserves object identity (see Partition I).
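So, in this case, the performance killer is not [...]
Why is that?
To back up Hans' findings with some information from the standard, here goes:
(ECMA 335 Partition III, 4.33):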
When applied to the boxed form of a value type, the unbox.any instruction extracts the value contained within obj (of type O). (It is equivalent to unbox followed by ldobj.) When applied to a reference type, the unbox.any instruction has the same effect as castclass typeTok.
(ECMA 335 Partition III, 4.32):
Typically, unbox simply computes the address of the value type that is already present inside of the boxed object. This approach is not possible when unboxing nullable value types. Because Nullable<T> values are converted to boxed Ts during the box operation, an implementation often must manufacture a new Nullable<T> on the heap and compute the address to the newly allocated object.
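The boxing behaviour the standard describes (a Nullable<T> is boxed as a plain boxed T, or as a null reference) can be observed from C#. A small sketch, with `NullableBoxingSketch` and `DescribeBox` being names invented for this example:

```csharp
using System;

static class NullableBoxingSketch
{
    // Hypothetical helper: boxes a nullable int and reports the runtime type
    // of the box, or "null" when the nullable had no value.
    public static string DescribeBox(int? value)
    {
        object boxed = value; // box: becomes a boxed int or a null reference
        return boxed == null ? "null" : boxed.GetType().FullName;
    }

    static void Main()
    {
        Console.WriteLine(DescribeBox(5));     // System.Int32
        Console.WriteLine(DescribeBox(null));  // null

        object o = 5;
        bool matches = (o is int?);            // a boxed int satisfies the int? test
        int? unboxed = o as int?;              // a fresh Nullable<int> has to be built here
        Console.WriteLine(matches);            // True
        Console.WriteLine(unboxed.HasValue);   // True
    }
}
```

Since no boxed Nullable<int> ever exists on the heap, unboxing to `int?` cannot simply point into the existing box, which is consistent with the extra work the quoted passage mentions.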
Interestingly, via [...]
Gotta love [...]. Another fun one is that even though the JIT spots (and removes) [...]
    using System;
    using System.Diagnostics;
    static class Program {
        static void Main() {
            // JIT
            TestUnrestricted<int>(1,5);
            TestUnrestricted<string>("abc",5);
            TestUnrestricted<int?>(1,5);
            TestNullable<int>(1, 5);

            const int LOOP = 100000000;
            Console.WriteLine(TestUnrestricted<int>(1, LOOP));
            Console.WriteLine(TestUnrestricted<string>("abc", LOOP));
            Console.WriteLine(TestUnrestricted<int?>(1, LOOP));
            Console.WriteLine(TestNullable<int>(1, LOOP));

        }
        static long TestUnrestricted<T>(T x, int loop) {
            Stopwatch watch = Stopwatch.StartNew();
            int count = 0;
            for (int i = 0; i < loop; i++) {
                if (x != null) count++;
            }
            watch.Stop();
            return watch.ElapsedMilliseconds;
        }
        static long TestNullable<T>(T? x, int loop) where T : struct {
            Stopwatch watch = Stopwatch.StartNew();
            int count = 0;
            for (int i = 0; i < loop; i++) {
                if (x != null) count++;
            }
            watch.Stop();
            return watch.ElapsedMilliseconds;
        }
    }
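This is the result of FindSumWithAsAndHas above (image): http://www.freeimagehosting.net/uploads/9e3c0bfb75.png
This is the result of FindSumWithCast (image): http://www.freeimagehosting.net/uploads/ce8a5a3934.png
Findings:
- Using as, it first tests whether the object is an instance of Int32; under the hood it uses isinst Int32 (which is similar to the hand-written code if (o is int)). Using as also unconditionally unboxes the object. And calling a property is a real performance killer (it is still a function under the hood), IL_[...]
- Using cast, you first test whether the object is an int with if (o is int); under the hood this uses isinst Int32. If it is an instance of int, then you can safely unbox the value, IL_[...]
Simply put, this is the pseudocode of the as approach: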
    int? x;

    (x.HasValue, x.Value) = (o isinst Int32, o unbox Int32)

    if (x.HasValue)
        sum += x.Value;
And this is the pseudocode of the cast approach:
    if (o isinst Int32)
        sum += (o unbox Int32)
So the cast ([...]
Further profiling:
    using System;
    using System.Diagnostics;

    class Program
    {
        const int Size = 30000000;

        static void Main(string[] args)
        {
            object[] values = new object[Size];
            for (int i = 0; i < Size - 2; i += 3)
            {
                values[i] = null;
                values[i + 1] = "";
                values[i + 2] = 1;
            }

            FindSumWithIsThenCast(values);

            FindSumWithAsThenHasThenValue(values);
            FindSumWithAsThenHasThenCast(values);

            FindSumWithManualAs(values);
            FindSumWithAsThenManualHasThenValue(values);

            Console.ReadLine();
        }

        static void FindSumWithIsThenCast(object[] values)
        {
            Stopwatch sw = Stopwatch.StartNew();
            int sum = 0;
            foreach (object o in values)
            {
                if (o is int)
                {
                    int x = (int)o;
                    sum += x;
                }
            }
            sw.Stop();
            Console.WriteLine("Is then Cast: {0} : {1}", sum,
                              (long)sw.ElapsedMilliseconds);
        }

        static void FindSumWithAsThenHasThenValue(object[] values)
        {
            Stopwatch sw = Stopwatch.StartNew();
            int sum = 0;
            foreach (object o in values)
            {
                int? x = o as int?;

                if (x.HasValue)
                {
                    sum += x.Value;
                }
            }
            sw.Stop();
            Console.WriteLine("As then Has then Value: {0} : {1}", sum,
                              (long)sw.ElapsedMilliseconds);
        }

        static void FindSumWithAsThenHasThenCast(object[] values)
        {
            Stopwatch sw = Stopwatch.StartNew();
            int sum = 0;
            foreach (object o in values)
            {
                int? x = o as int?;

                if (x.HasValue)
                {
                    sum += (int)o;
                }
            }
            sw.Stop();
            Console.WriteLine("As then Has then Cast: {0} : {1}", sum,
                              (long)sw.ElapsedMilliseconds);
        }

        static void FindSumWithManualAs(object[] values)
        {
            Stopwatch sw = Stopwatch.StartNew();
            int sum = 0;
            foreach (object o in values)
            {
                bool hasValue = o is int;
                int x = hasValue ? (int)o : 0;

                if (hasValue)
                {
                    sum += x;
                }
            }
            sw.Stop();
            Console.WriteLine("Manual As: {0} : {1}", sum,
                              (long)sw.ElapsedMilliseconds);
        }

        static void FindSumWithAsThenManualHasThenValue(object[] values)
        {
            Stopwatch sw = Stopwatch.StartNew();
            int sum = 0;
            foreach (object o in values)
            {
                int? x = o as int?;

                if (o is int)
                {
                    sum += x.Value;
                }
            }
            sw.Stop();
            Console.WriteLine("As then Manual Has then Value: {0} : {1}", sum,
                              (long)sw.ElapsedMilliseconds);
        }
    }
Output:
    Is then Cast: 10000000 : 303
    As then Has then Value: 10000000 : 3524
    As then Has then Cast: 10000000 : 3272
    Manual As: 10000000 : 395
    As then Manual Has then Value: 10000000 : 3282
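What can we infer from these figures?
- First, the is-then-cast approach is significantly faster than the as approach: 303 vs 3524.
- Second, .Value is marginally slower than casting: 3524 vs 3272.
- Third, .HasValue is marginally slower than using a manual has (i.e. using is): 3524 vs 3282.
- Fourth, in an apples-to-apples comparison between the simulated as and the real as approach (i.e. assigning the simulated HasValue and converting the simulated Value both happen together), the simulated as is still significantly faster than the real as: 395 vs 3524.
- Lastly, based on the first and fourth conclusions, there is something wrong with the as implementation ^^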
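To keep this answer up to date, it's worth mentioning that most of the discussion on this page is now moot with C# 7.1 and .NET 4.7, which support a slim syntax that also produces the best IL code.
The OP's original example...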
    object o = ...;
    int? x = o as int?;
    if (x.HasValue)
    {
        // ...use x.Value in here
    }
becomes simply...
    [...]
I have found that one common use for the new syntax is when you are writing a .NET value type (i.e. [...] in C# [...]
    [...]
Appendix: the IL for the first two example functions shown above in this answer is given here, respectively:
    // static void test1(Object o, ref int y)
    // {
    //     int? x = o as int?;
    //     if (x.HasValue)
    //         y = x.Value;
    // }

    [0] valuetype [mscorlib]Nullable`1<int32> x
            ldarg.0
            isinst [mscorlib]Nullable`1<int32>
            unbox.any [mscorlib]Nullable`1<int32>
            stloc.0
            ldloca.s x
            call instance bool [mscorlib]Nullable`1<int32>::get_HasValue()
            brfalse.s L_001e
            ldarg.1
            ldloca.s x
            call instance !0 [mscorlib]Nullable`1<int32>::get_Value()
            stind.i4
    L_001e: ret
    // static void test2(Object o, ref int y)
    // {
    //     if (o is int x)
    //         y = x;
    // }

    [0] int32 x,
    [1] object obj2
            ldarg.0
            stloc.1
            ldloc.1
            isinst int32
            ldnull
            cgt.un
            dup
            brtrue.s L_0011
            ldc.i4.0
            br.s L_0017
    L_0011: ldloc.1
            unbox.any int32
    L_0017: stloc.0
            brfalse.s L_001d
            ldarg.1
            ldloc.0
            stind.i4
    L_001d: ret
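For further testing which substantiates my remark about the performance of the new C# 7 syntax surpassing the previously available options, see here (in particular, example "D").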
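As a rough illustration of the C# 7 type pattern shown in the test2 comment above (`if (o is int x)`), the summing loop from the question could be written as follows. This is a sketch under the assumption of a C# 7 or later compiler; `PatternMatchSketch` and `SumWithPattern` are names made up for this example, and no timing claims are attached to it.

```csharp
using System;

static class PatternMatchSketch
{
    // Hypothetical helper: sums the boxed ints in 'values' using the
    // C# 7 "is" type pattern.
    public static int SumWithPattern(object[] values)
    {
        int sum = 0;
        foreach (object o in values)
        {
            // Type test and unbox in one expression; x is only assigned
            // when o really is a boxed int.
            if (o is int x)
            {
                sum += x;
            }
        }
        return sum;
    }

    static void Main()
    {
        object[] values = { null, "", 1, null, "", 1, null, "", 1 };
        Console.WriteLine(SumWithPattern(values)); // 3
    }
}
```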
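I tried the exact type checking construct [...]
However, [...]
In my eyes the typeof construct is the fastest way to do exact type checking, as it uses the RuntimeTypeHandle. Since the exact types in this case don't match with nullable, my guess is, [...]
And honestly: your [...]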
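The exact expressions this answer used are cut off above, so the following is only a guess at the general shape of an exact type check with typeof and GetType(). `ExactTypeCheckSketch` and `SumWithTypeof` are hypothetical names for this example.

```csharp
using System;

static class ExactTypeCheckSketch
{
    // Hypothetical helper: sums boxed ints using an exact runtime type
    // comparison instead of the "is" operator.
    public static int SumWithTypeof(object[] values)
    {
        int sum = 0;
        foreach (object item in values)
        {
            // GetType() needs a non-null reference, hence the extra null check.
            if (item != null && item.GetType() == typeof(int))
            {
                sum += (int)item;
            }
        }
        return sum;
    }

    static void Main()
    {
        int? nullable = 7;
        object[] values = { null, "", 1, nullable };

        // The boxed int? counts too, because it is stored as a boxed int.
        Console.WriteLine(SumWithTypeof(values));                 // 8

        // A boxed value never reports Nullable<int> as its runtime type.
        Console.WriteLine(values[3].GetType() == typeof(int?));   // False
    }
}
```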
I don't have time to try it, but you may want to write:
    foreach (object o in values)
    {
        int? x = o as int?;
as
    int? x;
    foreach (object o in values)
    {
        x = o as int?;
You are creating a new object each time, which won't completely explain the problem, but may contribute.
    using System;
    using System.Diagnostics;
    using System.Linq;

    class Test
    {
        const int Size = 30000000;

        static void Main()
        {
            object[] values = new object[Size];
            for (int i = 0; i < Size - 2; i += 3)
            {
                values[i] = null;
                values[i + 1] = "";
                values[i + 2] = 1;
            }

            FindSumWithCast(values);
            FindSumWithAsAndHas(values);
            FindSumWithAsAndIs(values);

            FindSumWithIsThenAs(values);
            FindSumWithIsThenConvert(values);

            FindSumWithLinq(values);

            Console.ReadLine();
        }

        static void FindSumWithCast(object[] values)
        {
            Stopwatch sw = Stopwatch.StartNew();
            int sum = 0;
            foreach (object o in values)
            {
                if (o is int)
                {
                    int x = (int)o;
                    sum += x;
                }
            }
            sw.Stop();
            Console.WriteLine("Cast: {0} : {1}", sum,
                              (long)sw.ElapsedMilliseconds);
        }

        static void FindSumWithAsAndHas(object[] values)
        {
            Stopwatch sw = Stopwatch.StartNew();
            int sum = 0;
            foreach (object o in values)
            {
                int? x = o as int?;
                if (x.HasValue)
                {
                    sum += x.Value;
                }
            }
            sw.Stop();
            Console.WriteLine("As and Has: {0} : {1}", sum,
                              (long)sw.ElapsedMilliseconds);
        }

        static void FindSumWithAsAndIs(object[] values)
        {
            Stopwatch sw = Stopwatch.StartNew();
            int sum = 0;
            foreach (object o in values)
            {
                int? x = o as int?;
                if (o is int)
                {
                    sum += x.Value;
                }
            }
            sw.Stop();
            Console.WriteLine("As and Is: {0} : {1}", sum,
                              (long)sw.ElapsedMilliseconds);
        }

        static void FindSumWithIsThenAs(object[] values)
        {
            // Apple-to-apple comparison with Cast routine above.
            // Using the similar steps in Cast routine above,
            // the AS here cannot be slower than Linq.

            Stopwatch sw = Stopwatch.StartNew();
            int sum = 0;
            foreach (object o in values)
            {
                if (o is int)
                {
                    int? x = o as int?;
                    sum += x.Value;
                }
            }
            sw.Stop();
            Console.WriteLine("Is then As: {0} : {1}", sum,
                              (long)sw.ElapsedMilliseconds);
        }

        static void FindSumWithIsThenConvert(object[] values)
        {
            Stopwatch sw = Stopwatch.StartNew();
            int sum = 0;
            foreach (object o in values)
            {
                if (o is int)
                {
                    int x = Convert.ToInt32(o);
                    sum += x;
                }
            }
            sw.Stop();
            Console.WriteLine("Is then Convert: {0} : {1}", sum,
                              (long)sw.ElapsedMilliseconds);
        }

        static void FindSumWithLinq(object[] values)
        {
            Stopwatch sw = Stopwatch.StartNew();
            int sum = values.OfType<int>().Sum();
            sw.Stop();
            Console.WriteLine("LINQ: {0} : {1}", sum,
                              (long)sw.ElapsedMilliseconds);
        }
    }
Output:
    [...]
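[Edit: 2010-06-19]
Note: the previous test was done inside VS, in the Debug configuration, using VS2009 on a Core i7 (company development machine).
The following was done on my own machine, a Core 2 Duo, using VS2010: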
    Inside VS, Configuration: Debug

    Cast: 10000000 : 309
    As and Has: 10000000 : 3322
    As and Is: 10000000 : 3249
    Is then As: 10000000 : 1926
    Is then Convert: 10000000 : 410
    LINQ: 10000000 : 2018

    Outside VS, Configuration: Debug

    Cast: 10000000 : 303
    As and Has: 10000000 : 3314
    As and Is: 10000000 : 3230
    Is then As: 10000000 : 1942
    Is then Convert: 10000000 : 418
    LINQ: 10000000 : 1944

    Inside VS, Configuration: Release

    Cast: 10000000 : 305
    As and Has: 10000000 : 3327
    As and Is: 10000000 : 3265
    Is then As: 10000000 : 1942
    Is then Convert: 10000000 : 414
    LINQ: 10000000 : 1932

    Outside VS, Configuration: Release

    Cast: 10000000 : 301
    As and Has: 10000000 : 3274
    As and Is: 10000000 : 3240
    Is then As: 10000000 : 1904
    Is then Convert: 10000000 : 414
    LINQ: 10000000 : 1936