Why are Swift iterators slower than array building?
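This is somewhat related to this question, where it was assumed that using generators (iterators) to traverse a nested array would be optimal for iterating through the elements, as long as you don't need to store the result, while repeated array concatenation would be best if you just want to flatten the array.
However, I decided to do some testing, and implemented this function (which flattens an array containing […]
Incredibly, it is even about 5-70% slower than a Python implementation of the same program, and the gap gets worse with smaller input. Swift was built with […]
Here are the three test cases: 1. small input, mixed; 2. large input, […]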
Swift

    let array1: [Any] = [Array(1...100), Array(101...105), 106,
                         Array(107...111), 112, 113, 114, Array(115...125)]
    let array2: [Any] = Array(repeating: Array(1...5), count: 2000)
    let array3: [Any] = Array(repeating: 31, count: 10000)
Python

    A1 = [list(range(1, 101)), list(range(101, 106)), 106,
          list(range(107, 112)), 112, 113, 114, list(range(115, 126))]
    # Wrapped in a list so that A2 holds 2000 nested lists, mirroring array2 in the Swift version
    A2 = [list(range(1, 6))] * 2000
    A3 = [31] * 10000
The generator and the array builder:
Swift

    func chain(_ segments: [Any]) -> AnyIterator<Int> {
        var i = 0
        var j = 0
        return AnyIterator<Int> {
            while i < segments.count {
                switch segments[i] {
                case let e as Int:
                    i += 1
                    return e
                case let E as [Int]:
                    if j < E.count {
                        let val = E[j]
                        j += 1
                        return val
                    }
                    j = 0
                    i += 1
                default:
                    return nil
                }
            }
            return nil
        }
    }

    func flatten_array(_ segments: [Any]) -> [Int] {
        var result = [Int]()
        for segment in segments {
            switch segment {
            case let segment as Int:
                result.append(segment)
            case let segment as [Int]:
                result.append(contentsOf: segment)
            default:
                break
            }
        }
        return result
    }
Python

    def chain(L):
        for i in L:
            if type(i) is int:
                yield i
            elif type(i) is list:
                yield from i


    def flatten_list(L):
        result = []
        for i in L:
            if type(i) is int:
                result.append(i)
            elif type(i) is list:
                result.extend(i)
        return result
And the benchmark results (100000 cycles on the first test case, 1000 on the others):
Swift

    test case 1 (small mixed input)
    Filling an array                         : 0.068221092224121094 s
    Filling an array, and looping through it : 0.074559926986694336 s
    Looping through a generator              : 1.5902719497680664 s *
    Materializing the generator to an array  : 1.759943962097168 s *

    test case 2 (large input, [Int]s)
    Filling an array                         : 0.20634698867797852 s
    Filling an array, and looping through it : 0.21031379699707031 s
    Looping through a generator              : 1.3505551815032959 s *
    Materializing the generator to an array  : 1.4733860492706299 s *

    test case 3 (large input, Ints)
    Filling an array                         : 0.27392101287841797 s
    Filling an array, and looping through it : 0.27670192718505859 s
    Looping through a generator              : 0.85304021835327148 s
    Materializing the generator to an array  : 1.0027849674224854 s *
Python

    test case 1 (small mixed input)
    Filling an array                         : 0.1622014045715332 s
    Filling an array, and looping through it : 0.4312894344329834 s
    Looping through a generator              : 0.6839139461517334 s
    Materializing the generator to an array  : 0.5300459861755371 s

    test case 2 (large input, [int]s)
    Filling an array                         : 1.029205083847046 s
    Filling an array, and looping through it : 1.2195289134979248 s
    Looping through a generator              : 1.0876803398132324 s
    Materializing the generator to an array  : 0.8958714008331299 s

    test case 3 (large input, ints)
    Filling an array                         : 1.0181667804718018 s
    Filling an array, and looping through it : 1.244570255279541 s
    Looping through a generator              : 1.1220412254333496 s
    Materializing the generator to an array  : 0.9486079216003418 s
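Clearly, Swift is very good at building arrays. But why are its generators so slow, in some cases even slower than Python's? (See the […] in the table
If this is truly inherent to the language, it has some interesting implications. For example, common sense (to me, as a Python programmer) says that if we are trying to synthesize an immutable object (like a string), we should first feed the source to a generating function to unfold it, and then hand the output to a […] that works on a single shallow sequence
Is building an entire new array and then iterating through it really faster than lazily iterating over the original array? Why?
(Possibly related JavaScript question)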
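For comparison, here is a minimal sketch of the two strategies written with standard-library calls only. It assumes the input can be typed as `[[Int]]` rather than `[Any]`, which does not hold for the mixed test case, and the names `nested`, `stored` and `lazySum` are purely illustrative. `flatMap` builds a new array, while `joined()` returns a lazy view that walks the inner arrays on demand.

```swift
// Assumption: a homogeneous nested array, unlike the [Any] inputs in the tests.
let nested: [[Int]] = Array(repeating: Array(1...5), count: 3)

// Stored form: allocates a new [Int] holding every element.
let stored: [Int] = nested.flatMap { $0 }

// Lazy form: joined() yields elements one by one without building an array.
var lazySum = 0
for value in nested.joined() {
    lazySum += value
}

let storedSum = stored.reduce(0, +)
print(stored.count, storedSum, lazySum)
```

The sketch only shows that both forms visit the same elements; which one is faster still has to be measured.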
Edit: the test code is as follows:
Swift

    import Foundation // needed for Date

    func time(test_array: [Any], cycles: Int = 1000000) -> (array_iterate: Double,
                                                            array_store: Double,
                                                            generate_iterate: Double,
                                                            generate_store: Double) {
        func start() -> Double { return Date().timeIntervalSince1970 }
        func lap(_ t0: Double) -> Double {
            return Date().timeIntervalSince1970 - t0
        }

        var t0 = start()
        for _ in 0..<cycles {
            for e in flatten_array(test_array) { e + 1 }
        }
        let ΔE1 = lap(t0)

        t0 = start()
        for _ in 0..<cycles {
            let array: [Int] = flatten_array(test_array)
        }
        let ΔE2 = lap(t0)

        t0 = start()
        for _ in 0..<cycles {
            let G = chain(test_array)
            while let g = G.next() { g + 1 }
        }
        let ΔG1 = lap(t0)

        t0 = start()
        for _ in 0..<cycles {
            let array: [Int] = Array(chain(test_array))
        }
        let ΔG2 = lap(t0)

        return (ΔE1, ΔE2, ΔG1, ΔG2)
    }

    print(time(test_array: array1, cycles: 100000))
    print(time(test_array: array2, cycles: 1000))
    print(time(test_array: array3, cycles: 1000))
Python

    from time import time  # wall-clock timer used by the laps below


    def time_f(test_array, cycles = 1000000):
        lap = lambda t0: time() - t0

        t0 = time()
        for _ in range(cycles):
            for e in flatten_list(test_array):
                e + 1
        ΔE1 = lap(t0)

        t0 = time()
        for _ in range(cycles):
            array = flatten_list(test_array)
        ΔE2 = lap(t0)

        t0 = time()
        for _ in range(cycles):
            for g in chain(test_array):
                g + 1
        ΔG1 = lap(t0)

        t0 = time()
        for _ in range(cycles):
            array = list(chain(test_array))
        ΔG2 = lap(t0)

        return ΔE1, ΔE2, ΔG1, ΔG2


    print(time_f(A1, cycles=100000))
    print(time_f(A2, cycles=1000))
    print(time_f(A3, cycles=1000))
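One thing worth ruling out in the Swift harness is that a loop body such as `e + 1`, whose result is never used, may be simplified by the optimizer. The sketch below is one possible guard against that, not the harness that produced the numbers above: a hypothetical helper `measure` (a name invented for this example) adds each cycle's result to a checksum and reads a monotonic clock through `DispatchTime`.

```swift
import Dispatch

// Hypothetical helper: runs `body` `cycles` times and returns the elapsed
// seconds plus a checksum built from every result, so the work stays observable.
func measure(cycles: Int, _ body: () -> Int) -> (seconds: Double, checksum: Int) {
    var checksum = 0
    let start = DispatchTime.now().uptimeNanoseconds
    for _ in 0..<cycles {
        checksum = checksum &+ body() // wrapping add, so overflow cannot trap
    }
    let end = DispatchTime.now().uptimeNanoseconds
    return (Double(end - start) / 1e9, checksum)
}

// Illustrative workload: sum a small array on every cycle.
let data = Array(1...100)
let result = measure(cycles: 1000) {
    var sum = 0
    for value in data { sum += value }
    return sum
}
print(result.seconds, result.checksum)
```

Printing the checksum at the end gives the compiler a reason to keep the loop, which makes the measured times easier to trust.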
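You asked "why are its [Swift] generators so slow, even slower than Python ones in some cases?"
My answer is that I do not think they are nearly as slow as your results might suggest. In particular, I will try to demonstrate that looping through an iterator should be faster than constructing an array for all of your test cases.
In earlier work (see the related blog post at http://lemire.me/blog/2016/09/22/swift-versus-java-the-bitset-performance-test/), I found that Swift iterators were about half as fast as their Java equivalents when working over a bitset class. That is not great, but Java is very efficient in this respect. Meanwhile, Go does worse. I submit to you that Swift iterators are probably not ideally efficient, but they are probably within a factor of two of what is possible with raw C code. The performance gap probably has to do with insufficient function inlining in Swift.
I see that you are using […]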

    public struct FastFlattenIterator: IteratorProtocol {
        let segments: [Any]
        var i = 0 // top-level index
        var j = 0 // second-level index
        var jmax = 0 // essentially, this is currentarray.count, but we buffer it
        var currentarray: [Int]! // quick reference to the int array being flattened

        init(_ segments: [Any]) {
            self.segments = segments
        }

        public mutating func next() -> Int? {
            if j > 0 { // we handle the case where we iterate within an array separately
                let val = currentarray[j]
                j += 1
                if j == jmax {
                    j = 0
                    i += 1
                }
                return val
            }
            while i < segments.count {
                switch segments[i] {
                case let e as Int: // found an integer value
                    i += 1
                    return e
                case let E as [Int]: // first encounter with an array
                    jmax = E.count
                    currentarray = E
                    if jmax > 1 {
                        j = 1 // more elements remain, resume in the fast path
                        return E[0]
                    }
                    i += 1 // empty or single-element array: nothing to resume
                    if jmax == 1 {
                        return E[0] // avoids reading index 1 of a one-element array
                    }
                default:
                    return nil
                }
            }
            return nil
        }
    }
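To show how an iterator struct of this kind plugs into `for`-`in` loops and `Array(...)`, here is a minimal sketch. `SimpleFlattenIterator` is a hypothetical, shortened stand-in written for this example; it is not the struct above and has not been benchmarked. It also conforms to `Sequence`, which the standard library allows for any type that serves as its own iterator.

```swift
// Hypothetical stand-in: a concrete struct instead of a closure wrapped in AnyIterator.
struct SimpleFlattenIterator: IteratorProtocol, Sequence {
    let segments: [Any]
    var i = 0              // index into segments
    var inner: [Int] = []  // inner array currently being traversed
    var j = 0              // index into inner

    init(_ segments: [Any]) {
        self.segments = segments
    }

    mutating func next() -> Int? {
        // Fast path: keep draining the current inner array without any casting.
        if j < inner.count {
            let value = inner[j]
            j += 1
            return value
        }
        // Slow path: look for the next Int or non-empty [Int] segment.
        while i < segments.count {
            let segment = segments[i]
            i += 1
            if let e = segment as? Int {
                return e
            }
            if let array = segment as? [Int] {
                inner = array
                j = 0
                if !inner.isEmpty {
                    j = 1
                    return inner[0]
                }
            } else {
                return nil // unsupported element type: stop iterating
            }
        }
        return nil
    }
}

let mixed: [Any] = [[1, 2, 3], 4, [Int](), [5], 6]
var total = 0
for value in SimpleFlattenIterator(mixed) {
    total += value
}
let flattened = Array(SimpleFlattenIterator(mixed))
print(total, flattened)
```

Because the iterator is a value type with a statically known `next()`, the compiler has a better chance of specializing and inlining the loop than it does with a type-erased wrapper.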
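With the FastFlattenIterator struct, I get the following numbers. For each test case, the first four approaches are taken from your code sample, whereas the last two (fast iterator) are built using the new struct. Note that "Looping through a fast iterator" is always the fastest.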

    test case 1 (small mixed input)
    Filling an array                         : 0.0073099999999999997 ms
    Filling an array, and looping through it : 0.0069870000000000002 ms
    Looping through a generator              : 0.18385799999999999 ms
    Materializing the generator to an array  : 0.18745700000000001 ms
    Looping through a fast iterator          : 0.005372 ms
    Materializing the fast iterator          : 0.015883999999999999 ms

    test case 2 (large input, [Int]s)
    Filling an array                         : 2.125931 ms
    Filling an array, and looping through it : 2.1169820000000001 ms
    Looping through a generator              : 15.064767 ms
    Materializing the generator to an array  : 15.45152 ms
    Looping through a fast iterator          : 1.572919 ms
    Materializing the fast iterator          : 1.964912 ms

    test case 3 (large input, Ints)
    Filling an array                         : 2.9140269999999999 ms
    Filling an array, and looping through it : 2.9064290000000002 ms
    Looping through a generator              : 9.8297640000000008 ms
    Materializing the generator to an array  : 9.8297640000000008 ms
    Looping through a fast iterator          : 1.978038 ms
    Materializing the fast iterator          : 2.2565339999999998 ms
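To put the table in perspective, the short sketch below divides the "Looping through a generator" time by the "Looping through a fast iterator" time for each test case. The figures are copied from the table (rounded to six decimals); the variable names are illustrative only.

```swift
// (generator loop, fast iterator loop) in milliseconds, test cases 1 to 3.
let timings: [(generator: Double, fast: Double)] = [
    (0.183858, 0.005372),
    (15.064767, 1.572919),
    (9.829764, 1.978038),
]

// Speedup of the struct-based iterator over the AnyIterator-based generator.
let speedups = timings.map { $0.generator / $0.fast }

for (index, ratio) in speedups.enumerated() {
    print("test case \(index + 1): about \(Int(ratio.rounded()))x faster")
}
```

On these numbers the gain is largest for the small mixed input and smallest for the input made of plain Ints.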
You can find my full code sample on GitHub: https://github.com/lemire/code-used-on-daniel-lemire-s-blog/tree/master/extra/swift/iterators