Fastest way to list all primes below N
这是我能想到的最好的算法。
1 2 3 4 5 6 7 8 9 10 11 | def get_primes(n): numbers = set(range(n, 1, -1)) primes = [] while numbers: p = numbers.pop() primes.append(p) numbers.difference_update(set(range(p*2, n+1, p))) return primes >>> timeit.Timer(stmt='get_primes.get_primes(1000000)', setup='import get_primes').timeit(1) 1.1499958793645562 |
能再快一点吗?
此代码有一个缺陷:由于
1 2 3 4 5 6 7 | >>> sum(get_primes(2000000)) 142913828922L #That's the correct sum of all numbers below 2 million >>> 529 in get_primes(1000) False >>> 529 in get_primes(530) True |
警告:由于硬件或python的版本。
下面是一个比较多个实现的脚本:
- Ambi ou筛网平原,
- ReffyPrimes,
- RWHY引物1,
- RWH引物2,
- 西维法特金
- 荧光素类,
- 孙达姆3号
- 西维耶WELY30
- Ambi筛(需要numpy)
- primesfrom3to(需要numpy)
- primesfrom2to(需要numpy)
非常感谢斯蒂芬把筛子轮30带到我的视线中。罗伯特·威廉·汉克斯(Robert William Hanks)获得了PrimesFrom2to、PrimesFrom3to、Rwh-Primes、Rwh-Primes1和Rwh-Primes2的荣誉。
在用psyco测试的普通python方法中,n=1000000,RWh-Primes1测试最快。
1 2 3 4 5 6 7 8 9 10 11 12 | +---------------------+-------+ | Method | ms | +---------------------+-------+ | rwh_primes1 | 43.0 | | sieveOfAtkin | 46.4 | | rwh_primes | 57.4 | | sieve_wheel_30 | 63.0 | | rwh_primes2 | 67.8 | | sieveOfEratosthenes | 147.0 | | ambi_sieve_plain | 152.0 | | sundaram3 | 194.0 | +---------------------+-------+ |
在没有psyco的情况下测试的普通python方法中,n=1000000,RWh-Primes2是最快的。
1 2 3 4 5 6 7 8 9 10 11 12 | +---------------------+-------+ | Method | ms | +---------------------+-------+ | rwh_primes2 | 68.1 | | rwh_primes1 | 93.7 | | rwh_primes | 94.6 | | sieve_wheel_30 | 97.4 | | sieveOfEratosthenes | 178.0 | | ambi_sieve_plain | 286.0 | | sieveOfAtkin | 314.0 | | sundaram3 | 416.0 | +---------------------+-------+ |
在所有测试方法中,允许numpy,n=1000000,PrimesFrom2to是测试最快的。
1 2 3 4 5 6 7 | +---------------------+-------+ | Method | ms | +---------------------+-------+ | primesfrom2to | 15.9 | | primesfrom3to | 18.4 | | ambi_sieve | 29.3 | +---------------------+-------+ |
使用以下命令测量计时:
1 | python -mtimeit -s"import primes""primes.{method}(1000000)" |
用每个方法名替换
PY:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 | #!/usr/bin/env python import psyco; psyco.full() from math import sqrt, ceil import numpy as np def rwh_primes(n): # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188 """ Returns a list of primes < n""" sieve = [True] * n for i in xrange(3,int(n**0.5)+1,2): if sieve[i]: sieve[i*i::2*i]=[False]*((n-i*i-1)/(2*i)+1) return [2] + [i for i in xrange(3,n,2) if sieve[i]] def rwh_primes1(n): # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188 """ Returns a list of primes < n""" sieve = [True] * (n/2) for i in xrange(3,int(n**0.5)+1,2): if sieve[i/2]: sieve[i*i/2::i] = [False] * ((n-i*i-1)/(2*i)+1) return [2] + [2*i+1 for i in xrange(1,n/2) if sieve[i]] def rwh_primes2(n): # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188 """ Input n>=6, Returns a list of primes, 2 <= p < n""" correction = (n%6>1) n = {0:n,1:n-1,2:n+4,3:n+3,4:n+2,5:n+1}[n%6] sieve = [True] * (n/3) sieve[0] = False for i in xrange(int(n**0.5)/3+1): if sieve[i]: k=3*i+1|1 sieve[ ((k*k)/3) ::2*k]=[False]*((n/6-(k*k)/6-1)/k+1) sieve[(k*k+4*k-2*k*(i&1))/3::2*k]=[False]*((n/6-(k*k+4*k-2*k*(i&1))/6-1)/k+1) return [2,3] + [3*i+1|1 for i in xrange(1,n/3-correction) if sieve[i]] def sieve_wheel_30(N): # http://zerovolt.com/?p=88 ''' Returns a list of primes <= N using wheel criterion 2*3*5 = 30 Copyright 2009 by zerovolt.com This code is free for non-commercial purposes, in which case you can just leave this comment as a credit for my work. If you need this code for commercial purposes, please contact me by sending an email to: info [at] zerovolt [dot] com.''' __smallp = ( 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599, 601, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683, 691, 701, 709, 719, 727, 733, 739, 743, 751, 757, 761, 769, 773, 787, 797, 809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887, 907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997) wheel = (2, 3, 5) const = 30 if N < 2: return [] if N <= const: pos = 0 while __smallp[pos] <= N: pos += 1 return list(__smallp[:pos]) # make the offsets list offsets = (7, 11, 13, 17, 19, 23, 29, 1) # prepare the list p = [2, 3, 5] dim = 2 + N // const tk1 = [True] * dim tk7 = [True] * dim tk11 = [True] * dim tk13 = [True] * dim tk17 = [True] * dim tk19 = [True] * dim tk23 = [True] * dim tk29 = [True] * dim tk1[0] = False # help dictionary d # d[a , b] = c ==> if I want to find the smallest useful multiple of (30*pos)+a # on tkc, then I need the index given by the product of [(30*pos)+a][(30*pos)+b] # in general. If b < a, I need [(30*pos)+a][(30*(pos+1))+b] d = {} for x in offsets: for y in offsets: res = (x*y) % const if res in offsets: d[(x, res)] = y # another help dictionary: gives tkx calling tmptk[x] tmptk = {1:tk1, 7:tk7, 11:tk11, 13:tk13, 17:tk17, 19:tk19, 23:tk23, 29:tk29} pos, prime, lastadded, stop = 0, 0, 0, int(ceil(sqrt(N))) # inner functions definition def del_mult(tk, start, step): for k in xrange(start, len(tk), step): tk[k] = False # end of inner functions definition cpos = const * pos while prime < stop: # 30k + 7 if tk7[pos]: prime = cpos + 7 p.append(prime) lastadded = 7 for off in offsets: tmp = d[(7, off)] start = (pos + prime) if off == 7 else (prime * (const * (pos + 1 if tmp < 7 else 0) + tmp) )//const del_mult(tmptk[off], start, prime) # 30k + 11 if tk11[pos]: prime = cpos + 11 p.append(prime) lastadded = 11 for off in offsets: tmp = d[(11, off)] start = (pos + prime) if off == 11 else (prime * (const * (pos + 1 if tmp < 11 else 0) + tmp) )//const del_mult(tmptk[off], start, prime) # 30k + 13 if tk13[pos]: prime = cpos + 13 p.append(prime) lastadded = 13 for off in offsets: tmp = d[(13, off)] start = (pos + prime) if off == 13 else (prime * (const * (pos + 1 if tmp < 13 else 0) + tmp) )//const del_mult(tmptk[off], start, prime) # 30k + 17 if tk17[pos]: prime = cpos + 17 p.append(prime) lastadded = 17 for off in offsets: tmp = d[(17, off)] start = (pos + prime) if off == 17 else (prime * (const * (pos + 1 if tmp < 17 else 0) + tmp) )//const del_mult(tmptk[off], start, prime) # 30k + 19 if tk19[pos]: prime = cpos + 19 p.append(prime) lastadded = 19 for off in offsets: tmp = d[(19, off)] start = (pos + prime) if off == 19 else (prime * (const * (pos + 1 if tmp < 19 else 0) + tmp) )//const del_mult(tmptk[off], start, prime) # 30k + 23 if tk23[pos]: prime = cpos + 23 p.append(prime) lastadded = 23 for off in offsets: tmp = d[(23, off)] start = (pos + prime) if off == 23 else (prime * (const * (pos + 1 if tmp < 23 else 0) + tmp) )//const del_mult(tmptk[off], start, prime) # 30k + 29 if tk29[pos]: prime = cpos + 29 p.append(prime) lastadded = 29 for off in offsets: tmp = d[(29, off)] start = (pos + prime) if off == 29 else (prime * (const * (pos + 1 if tmp < 29 else 0) + tmp) )//const del_mult(tmptk[off], start, prime) # now we go back to top tk1, so we need to increase pos by 1 pos += 1 cpos = const * pos # 30k + 1 if tk1[pos]: prime = cpos + 1 p.append(prime) lastadded = 1 for off in offsets: tmp = d[(1, off)] start = (pos + prime) if off == 1 else (prime * (const * pos + tmp) )//const del_mult(tmptk[off], start, prime) # time to add remaining primes # if lastadded == 1, remove last element and start adding them from tk1 # this way we don't need an"if" within the last while if lastadded == 1: p.pop() # now complete for every other possible prime while pos < len(tk1): cpos = const * pos if tk1[pos]: p.append(cpos + 1) if tk7[pos]: p.append(cpos + 7) if tk11[pos]: p.append(cpos + 11) if tk13[pos]: p.append(cpos + 13) if tk17[pos]: p.append(cpos + 17) if tk19[pos]: p.append(cpos + 19) if tk23[pos]: p.append(cpos + 23) if tk29[pos]: p.append(cpos + 29) pos += 1 # remove exceeding if present pos = len(p) - 1 while p[pos] > N: pos -= 1 if pos < len(p) - 1: del p[pos+1:] # return p list return p def sieveOfEratosthenes(n): """sieveOfEratosthenes(n): return the list of the primes < n.""" # Code from: <[email protected]>, Nov 30 2006 # http://groups.google.com/group/comp.lang.python/msg/f1f10ced88c68c2d if n <= 2: return [] sieve = range(3, n, 2) top = len(sieve) for si in sieve: if si: bottom = (si*si - 3) // 2 if bottom >= top: break sieve[bottom::si] = [0] * -((bottom - top) // si) return [2] + [el for el in sieve if el] def sieveOfAtkin(end): """sieveOfAtkin(end): return a list of all the prime numbers <end using the Sieve of Atkin.""" # Code by Steve Krenzel, <[email protected]>, improved # Code: https://web.archive.org/web/20080324064651/http://krenzel.info/?p=83 # Info: http://en.wikipedia.org/wiki/Sieve_of_Atkin assert end > 0 lng = ((end-1) // 2) sieve = [False] * (lng + 1) x_max, x2, xd = int(sqrt((end-1)/4.0)), 0, 4 for xd in xrange(4, 8*x_max + 2, 8): x2 += xd y_max = int(sqrt(end-x2)) n, n_diff = x2 + y_max*y_max, (y_max << 1) - 1 if not (n & 1): n -= n_diff n_diff -= 2 for d in xrange((n_diff - 1) << 1, -1, -8): m = n % 12 if m == 1 or m == 5: m = n >> 1 sieve[m] = not sieve[m] n -= d x_max, x2, xd = int(sqrt((end-1) / 3.0)), 0, 3 for xd in xrange(3, 6 * x_max + 2, 6): x2 += xd y_max = int(sqrt(end-x2)) n, n_diff = x2 + y_max*y_max, (y_max << 1) - 1 if not(n & 1): n -= n_diff n_diff -= 2 for d in xrange((n_diff - 1) << 1, -1, -8): if n % 12 == 7: m = n >> 1 sieve[m] = not sieve[m] n -= d x_max, y_min, x2, xd = int((2 + sqrt(4-8*(1-end)))/4), -1, 0, 3 for x in xrange(1, x_max + 1): x2 += xd xd += 6 if x2 >= end: y_min = (((int(ceil(sqrt(x2 - end))) - 1) << 1) - 2) << 1 n, n_diff = ((x*x + x) << 1) - 1, (((x-1) << 1) - 2) << 1 for d in xrange(n_diff, y_min, -8): if n % 12 == 11: m = n >> 1 sieve[m] = not sieve[m] n += d primes = [2, 3] if end <= 3: return primes[:max(0,end-2)] for n in xrange(5 >> 1, (int(sqrt(end))+1) >> 1): if sieve[n]: primes.append((n << 1) + 1) aux = (n << 1) + 1 aux *= aux for k in xrange(aux, end, 2 * aux): sieve[k >> 1] = False s = int(sqrt(end)) + 1 if s % 2 == 0: s += 1 primes.extend([i for i in xrange(s, end, 2) if sieve[i >> 1]]) return primes def ambi_sieve_plain(n): s = range(3, n, 2) for m in xrange(3, int(n**0.5)+1, 2): if s[(m-3)/2]: for t in xrange((m*m-3)/2,(n>>1)-1,m): s[t]=0 return [2]+[t for t in s if t>0] def sundaram3(max_n): # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/2073279#2073279 numbers = range(3, max_n+1, 2) half = (max_n)//2 initial = 4 for step in xrange(3, max_n+1, 2): for i in xrange(initial, half, step): numbers[i-1] = 0 initial += 2*(step+1) if initial > half: return [2] + filter(None, numbers) ################################################################################ # Using Numpy: def ambi_sieve(n): # http://tommih.blogspot.com/2009/04/fast-prime-number-generator.html s = np.arange(3, n, 2) for m in xrange(3, int(n ** 0.5)+1, 2): if s[(m-3)/2]: s[(m*m-3)/2::m]=0 return np.r_[2, s[s>0]] def primesfrom3to(n): # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188 """ Returns a array of primes, p < n""" assert n>=2 sieve = np.ones(n/2, dtype=np.bool) for i in xrange(3,int(n**0.5)+1,2): if sieve[i/2]: sieve[i*i/2::i] = False return np.r_[2, 2*np.nonzero(sieve)[0][1::]+1] def primesfrom2to(n): # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188 """ Input n>=6, Returns a array of primes, 2 <= p < n""" sieve = np.ones(n/3 + (n%6==2), dtype=np.bool) sieve[0] = False for i in xrange(int(n**0.5)/3+1): if sieve[i]: k=3*i+1|1 sieve[ ((k*k)/3) ::2*k] = False sieve[(k*k+4*k-2*k*(i&1))/3::2*k] = False return np.r_[2,3,((3*np.nonzero(sieve)[0]+1)|1)] if __name__=='__main__': import itertools import sys def test(f1,f2,num): print('Testing {f1} and {f2} return same results'.format( f1=f1.func_name, f2=f2.func_name)) if not all([a==b for a,b in itertools.izip_longest(f1(num),f2(num))]): sys.exit("Error: %s(%s) != %s(%s)"%(f1.func_name,num,f2.func_name,num)) n=1000000 test(sieveOfAtkin,sieveOfEratosthenes,n) test(sieveOfAtkin,ambi_sieve,n) test(sieveOfAtkin,ambi_sieve_plain,n) test(sieveOfAtkin,sundaram3,n) test(sieveOfAtkin,sieve_wheel_30,n) test(sieveOfAtkin,primesfrom3to,n) test(sieveOfAtkin,primesfrom2to,n) test(sieveOfAtkin,rwh_primes,n) test(sieveOfAtkin,rwh_primes1,n) test(sieveOfAtkin,rwh_primes2,n) |
运行所有实现产生相同结果的脚本测试。
更快更内存的纯Python代码:
1 2 3 4 5 6 7 | def primes(n): """ Returns a list of primes < n""" sieve = [True] * n for i in range(3,int(n**0.5)+1,2): if sieve[i]: sieve[i*i::2*i]=[False]*((n-i*i-1)//(2*i)+1) return [2] + [i for i in range(3,n,2) if sieve[i]] |
或者从半筛子开始
1 2 3 4 5 6 7 | def primes1(n): """ Returns a list of primes < n""" sieve = [True] * (n//2) for i in range(3,int(n**0.5)+1,2): if sieve[i//2]: sieve[i*i//2::i] = [False] * ((n-i*i-1)//(2*i)+1) return [2] + [2*i+1 for i in range(1,n//2) if sieve[i]] |
更快更内存的numpy代码:
1 2 3 4 5 6 7 8 | import numpy def primesfrom3to(n): """ Returns a array of primes, 3 <= p < n""" sieve = numpy.ones(n//2, dtype=numpy.bool) for i in range(3,int(n**0.5)+1,2): if sieve[i//2]: sieve[i*i//2::i] = False return 2*numpy.nonzero(sieve)[0][1::]+1 |
从三分之一筛子开始的更快的变化:
1 2 3 4 5 6 7 8 9 10 | import numpy def primesfrom2to(n): """ Input n>=6, Returns a array of primes, 2 <= p < n""" sieve = numpy.ones(n//3 + (n%6==2), dtype=numpy.bool) for i in range(1,int(n**0.5)//3+1): if sieve[i]: k=3*i+1|1 sieve[ k*k//3 ::2*k] = False sieve[k*(k-2*(i&1)+4)//3::2*k] = False return numpy.r_[2,3,((3*numpy.nonzero(sieve)[0][1:]+1)|1)] |
上述代码的纯Python版本(难以编码)为:
1 2 3 4 5 6 7 8 9 10 | def primes2(n): """ Input n>=6, Returns a list of primes, 2 <= p < n""" n, correction = n-n%6+6, 2-(n%6>1) sieve = [True] * (n//3) for i in range(1,int(n**0.5)//3+1): if sieve[i]: k=3*i+1|1 sieve[ k*k//3 ::2*k] = [False] * ((n//6-k*k//6-1)//k+1) sieve[k*(k-2*(i&1)+4)//3::2*k] = [False] * ((n//6-k*(k-2*(i&1)+4)//6-1)//k+1) return [2,3] + [3*i+1|1 for i in range(1,n//3-correction) if sieve[i]] |
不幸的是,纯python不采用更简单和更快的麻木方式来执行分配,并且在循环内部调用
我个人认为numpy(它被广泛使用)不是Python标准库的一部分是很遗憾的,而且Python开发人员似乎完全忽视了语法和速度方面的改进。
这里有一个来自python食谱的非常好的示例——该URL上建议的最快版本是:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 | import itertools def erat2( ): D = { } yield 2 for q in itertools.islice(itertools.count(3), 0, None, 2): p = D.pop(q, None) if p is None: D[q*q] = q yield q else: x = p + q while x in D or not (x&1): x += p D[x] = p |
这样就可以
1 2 | def get_primes_erat(n): return list(itertools.takewhile(lambda p: p<n, erat2())) |
在shell提示下(我喜欢这样做)使用pri.py中的代码进行测量,我观察到:
1 2 3 4 | $ python2.5 -mtimeit -s'import pri' 'pri.get_primes(1000000)' 10 loops, best of 3: 1.69 sec per loop $ python2.5 -mtimeit -s'import pri' 'pri.get_primes_erat(1000000)' 10 loops, best of 3: 673 msec per loop |
所以看起来食谱解决方案的速度是原来的两倍多。
我想我用圣达拉姆的筛子打破了纯Python的记录:
1 2 3 4 5 6 7 8 9 10 11 12 | def sundaram3(max_n): numbers = range(3, max_n+1, 2) half = (max_n)//2 initial = 4 for step in xrange(3, max_n+1, 2): for i in xrange(initial, half, step): numbers[i-1] = 0 initial += 2*(step+1) if initial > half: return [2] + filter(None, numbers) |
比较:
1 2 3 4 5 6 7 8 | C:\USERS>python -m timeit -n10 -s"import get_primes""get_primes.get_primes_erat(1000000)" 10 loops, best of 3: 710 msec per loop C:\USERS>python -m timeit -n10 -s"import get_primes""get_primes.daniel_sieve_2(1000000)" 10 loops, best of 3: 435 msec per loop C:\USERS>python -m timeit -n10 -s"import get_primes""get_primes.sundaram3(1000000)" 10 loops, best of 3: 327 msec per loop |
该算法速度快,但存在严重缺陷:
1 2 3 4 5 6 7 8 9 10 11 | >>> sorted(get_primes(530)) [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 527, 529] >>> 17*31 527 >>> 23*23 529 |
您假设
对于具有足够大n的真正最快的解决方案,将下载预先计算的素数列表,将其存储为元组,并执行以下操作:
1 2 3 | for pos,i in enumerate(primes): if i > N: print primes[:pos] |
如果
总是跳出框框思考。
如果你不想重新发明轮子,你可以安装符号数学库sympy(是的,它与python 3兼容)。
1 | pip install sympy |
使用素数函数
1 2 | from sympy import sieve primes = list(sieve.primerange(1, 10**6)) |
如果您接受ITertools而不是numpy,这里是针对python 3的rwh-primes2的一个改编版本,它在我的机器上运行的速度是原来的两倍。唯一实质性的改变是使用bytearray代替布尔值的列表,使用compress代替列表理解来构建最终的列表。(如果我能的话,我会加上这个评论,就像摩尔宁松一样。)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 | import itertools izip = itertools.zip_longest chain = itertools.chain.from_iterable compress = itertools.compress def rwh_primes2_python3(n): """ Input n>=6, Returns a list of primes, 2 <= p < n""" zero = bytearray([False]) size = n//3 + (n % 6 == 2) sieve = bytearray([True]) * size sieve[0] = False for i in range(int(n**0.5)//3+1): if sieve[i]: k=3*i+1|1 start = (k*k+4*k-2*k*(i&1))//3 sieve[(k*k)//3::2*k]=zero*((size - (k*k)//3 - 1) // (2 * k) + 1) sieve[ start ::2*k]=zero*((size - start - 1) // (2 * k) + 1) ans = [2,3] poss = chain(izip(*[range(i, n, 6) for i in (1,5)])) ans.extend(compress(poss, sieve)) return ans |
比较:
1 2 3 4 | >>> timeit.timeit('primes.rwh_primes2(10**6)', setup='import primes', number=1) 0.0652179726976101 >>> timeit.timeit('primes.rwh_primes2_python3(10**6)', setup='import primes', number=1) 0.03267321276325674 |
和
1 2 3 4 | >>> timeit.timeit('primes.rwh_primes2(10**8)', setup='import primes', number=1) 6.394284538007014 >>> timeit.timeit('primes.rwh_primes2_python3(10**8)', setup='import primes', number=1) 3.833829450302801 |
编写自己的主要查找代码很有指导意义,但是手头上有一个快速可靠的库也很有用。我在C++库PrimeScript周围写了一个包装器,命名为PrimeSpyPython。
试试看1〔0〕。
1 2 | import primesieve primes = primesieve.generate_primes(10**8) |
我很想看看速度对比。
这里有两个最快函数之一的更新(纯python 3.6)版本,
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | from itertools import compress def rwh_primes1v1(n): """ Returns a list of primes < n for n > 2""" sieve = bytearray([True]) * (n//2) for i in range(3,int(n**0.5)+1,2): if sieve[i//2]: sieve[i*i//2::i] = bytearray((n-i*i-1)//(2*i)+1) return [2,*compress(range(3,n,2), sieve[1:])] def rwh_primes1v2(n): """ Returns a list of primes < n for n > 2""" sieve = bytearray([True]) * (n//2+1) for i in range(1,int(n**0.5)//2+1): if sieve[i]: sieve[2*i*(i+1)::2*i+1] = bytearray((n//2-2*i*(i+1))//(2*i+1)+1) return [2,*compress(range(3,n,2), sieve[1:])] |
基于n<9080191的米勒-拉宾初等性检验的确定性实现
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 | import sys import random def miller_rabin_pass(a, n): d = n - 1 s = 0 while d % 2 == 0: d >>= 1 s += 1 a_to_power = pow(a, d, n) if a_to_power == 1: return True for i in xrange(s-1): if a_to_power == n - 1: return True a_to_power = (a_to_power * a_to_power) % n return a_to_power == n - 1 def miller_rabin(n): for a in [2, 3, 37, 73]: if not miller_rabin_pass(a, n): return False return True n = int(sys.argv[1]) primes = [2] for p in range(3,n,2): if miller_rabin(p): primes.append(p) print len(primes) |
根据维基百科上的文章(http://en.wikipedia.org/wiki/miller–rabin_primarity_test),测试n<9080191的a=2、3、37和73足以决定n是否是复合的。
我改编了原始Miller-Rabin测试的概率实现的源代码:http://en.literateprograms.org/miller-rabin-primary-u-test-uu(python)
如果你控制了n,列出所有素数的最快方法就是预先计算它们。说真的。预计算是一种被忽视的优化方法。
下面是我在python中通常用来生成素数的代码:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | $ python -mtimeit -s'import sieve' 'sieve.sieve(1000000)' 10 loops, best of 3: 445 msec per loop $ cat sieve.py from math import sqrt def sieve(size): prime=[True]*size rng=xrange limit=int(sqrt(size)) for i in rng(3,limit+1,+2): if prime[i]: prime[i*i::+i]=[False]*len(prime[i*i::+i]) return [2]+[i for i in rng(3,size,+2) if prime[i]] if __name__=='__main__': print sieve(100) |
它不能与这里发布的更快的解决方案竞争,但至少它是纯Python。
感谢您发布此问题。我今天真的学到了很多。
这些都是经过编写和测试的。所以没有必要重新发明轮子。
1 | python -m timeit -r10 -s"from sympy import sieve""primes = list(sieve.primerange(1, 10**6))" |
给我们创纪录的12.2毫秒!
1 | 10 loops, best of 10: 12.2 msec per loop |
如果速度不够快,可以尝试pypy:
1 | pypy -m timeit -r10 -s"from sympy import sieve""primes = list(sieve.primerange(1, 10**6))" |
结果是:
1 | 10 loops, best of 10: 2.03 msec per loop |
247张以上选票的答案列出了15.9毫秒的最佳解决方案。比较一下!!!!
我测试了一些unutbu的函数,我用几亿的数字计算它。
胜利者是使用numpy库的函数,
注意:进行内存利用率测试也很有趣:)
样例代码
在我的Github存储库上完成代码
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 | #!/usr/bin/env python import lib import timeit import sys import math import datetime import prettyplotlib as ppl import numpy as np import matplotlib.pyplot as plt from prettyplotlib import brewer2mpl primenumbers_gen = [ 'sieveOfEratosthenes', 'ambi_sieve', 'ambi_sieve_plain', 'sundaram3', 'sieve_wheel_30', 'primesfrom3to', 'primesfrom2to', 'rwh_primes', 'rwh_primes1', 'rwh_primes2', ] def human_format(num): # https://stackoverflow.com/questions/579310/formatting-long-numbers-as-strings-in-python?answertab=active#tab-top magnitude = 0 while abs(num) >= 1000: magnitude += 1 num /= 1000.0 # add more suffixes if you need them return '%.2f%s' % (num, ['', 'K', 'M', 'G', 'T', 'P'][magnitude]) if __name__=='__main__': # Vars n = 10000000 # number itereration generator nbcol = 5 # For decompose prime number generator nb_benchloop = 3 # Eliminate false positive value during the test (bench average time) datetimeformat = '%Y-%m-%d %H:%M:%S.%f' config = 'from __main__ import n; import lib' primenumbers_gen = { 'sieveOfEratosthenes': {'color': 'b'}, 'ambi_sieve': {'color': 'b'}, 'ambi_sieve_plain': {'color': 'b'}, 'sundaram3': {'color': 'b'}, 'sieve_wheel_30': {'color': 'b'}, # # # 'primesfrom2to': {'color': 'b'}, 'primesfrom3to': {'color': 'b'}, # 'rwh_primes': {'color': 'b'}, # 'rwh_primes1': {'color': 'b'}, 'rwh_primes2': {'color': 'b'}, } # Get n in command line if len(sys.argv)>1: n = int(sys.argv[1]) step = int(math.ceil(n / float(nbcol))) nbs = np.array([i * step for i in range(1, int(nbcol) + 1)]) set2 = brewer2mpl.get_map('Paired', 'qualitative', 12).mpl_colors print datetime.datetime.now().strftime(datetimeformat) print("Compute prime number to %(n)s" % locals()) print("") results = dict() for pgen in primenumbers_gen: results[pgen] = dict() benchtimes = list() for n in nbs: t = timeit.Timer("lib.%(pgen)s(n)" % locals(), setup=config) execute_times = t.repeat(repeat=nb_benchloop,number=1) benchtime = np.mean(execute_times) benchtimes.append(benchtime) results[pgen] = {'benchtimes':np.array(benchtimes)} fig, ax = plt.subplots(1) plt.ylabel('Computation time (in second)') plt.xlabel('Numbers computed') i = 0 for pgen in primenumbers_gen: bench = results[pgen]['benchtimes'] avgs = np.divide(bench,nbs) avg = np.average(bench, weights=nbs) # Compute linear regression A = np.vstack([nbs, np.ones(len(nbs))]).T a, b = np.linalg.lstsq(A, nbs*avgs)[0] # Plot i += 1 #label="%(pgen)s" % locals() #ppl.plot(nbs, nbs*avgs, label=label, lw=1, linestyle='--', color=set2[i % 12]) label="%(pgen)s avg" % locals() ppl.plot(nbs, a * nbs + b, label=label, lw=2, color=set2[i % 12]) print datetime.datetime.now().strftime(datetimeformat) ppl.legend(ax, loc='upper left', ncol=4) # Change x axis label ax.get_xaxis().get_major_formatter().set_scientific(False) fig.canvas.draw() labels = [human_format(int(item.get_text())) for item in ax.get_xticklabels()] ax.set_xticklabels(labels) ax = plt.gca() plt.show() |
对于Python 3
1 2 3 4 5 6 7 8 9 10 11 | def rwh_primes2(n): correction = (n%6>1) n = {0:n,1:n-1,2:n+4,3:n+3,4:n+2,5:n+1}[n%6] sieve = [True] * (n//3) sieve[0] = False for i in range(int(n**0.5)//3+1): if sieve[i]: k=3*i+1|1 sieve[ ((k*k)//3) ::2*k]=[False]*((n//6-(k*k)//6-1)//k+1) sieve[(k*k+4*k-2*k*(i&1))//3::2*k]=[False]*((n//6-(k*k+4*k-2*k*(i&1))//6-1)//k+1) return [2,3] + [3*i+1|1 for i in range(1,n//3-correction) if sieve[i]] |
使用numpy的半筛网稍微不同的实现:
http://rebrained.com/?P=458
1 2 3 4 5 6 7 8 | import math import numpy def prime6(upto): primes=numpy.arange(3,upto+1,2) isprime=numpy.ones((upto-1)/2,dtype=bool) for factor in primes[:int(math.sqrt(upto))]: if isprime[(factor-2)/2]: isprime[(factor*3-2)/2:(upto-1)/2:factor]=0 return numpy.insert(primes[isprime],0,2) |
有人能把这个和其他时间比较吗?在我的机器上,它似乎相当于另一个麻木的半筛子。
对于速度最快的代码,numpy解决方案是最好的。不过,出于纯粹的学术原因,我发布了纯Python版本,比上面发布的食谱版本快了不到50%。因为我在内存中列出了整个列表,所以您需要足够的空间来容纳所有内容,但它的伸缩性似乎相当好。
1 2 3 4 5 6 7 8 9 10 11 12 13 | def daniel_sieve_2(maxNumber): """ Given a number, returns all numbers less than or equal to that number which are prime. """ allNumbers = range(3, maxNumber+1, 2) for mIndex, number in enumerate(xrange(3, maxNumber+1, 2)): if allNumbers[mIndex] == 0: continue # now set all multiples to 0 for index in xrange(mIndex+number, (maxNumber-3)/2+1, number): allNumbers[index] = 0 return [2] + filter(lambda n: n!=0, allNumbers) |
结果是:
1 2 3 4 5 6 7 8 | >>>mine = timeit.Timer("daniel_sieve_2(1000000)", ... "from sieves import daniel_sieve_2") >>>prev = timeit.Timer("get_primes_erat(1000000)", ... "from sieves import get_primes_erat") >>>print"Mine: {0:0.4f} ms".format(min(mine.repeat(3, 1))*1000) Mine: 428.9446 ms >>>print"Previous Best {0:0.4f} ms".format(min(prev.repeat(3, 1))*1000) Previous Best 621.3581 ms |
第一次使用python,所以我在这里使用的一些方法看起来有点麻烦。我只是直接把我的C++代码转换成Python,这就是我所拥有的(虽然在Python中有点慢)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 | #!/usr/bin/env python import time def GetPrimes(n): Sieve = [1 for x in xrange(n)] Done = False w = 3 while not Done: for q in xrange (3, n, 2): Prod = w*q if Prod < n: Sieve[Prod] = 0 else: break if w > (n/2): Done = True w += 2 return Sieve start = time.clock() d = 10000000 Primes = GetPrimes(d) count = 1 #This is for 2 for x in xrange (3, d, 2): if Primes[x]: count+=1 elapsed = (time.clock() - start) print" Found", count,"primes in", elapsed,"seconds! " |
pythonw Primes.py
Found 664579 primes in 12.799119 seconds!
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 | #!/usr/bin/env python import time def GetPrimes2(n): Sieve = [1 for x in xrange(n)] for q in xrange (3, n, 2): k = q for y in xrange(k*3, n, k*2): Sieve[y] = 0 return Sieve start = time.clock() d = 10000000 Primes = GetPrimes2(d) count = 1 #This is for 2 for x in xrange (3, d, 2): if Primes[x]: count+=1 elapsed = (time.clock() - start) print" Found", count,"primes in", elapsed,"seconds! " |
pythonw Primes2.py
Found 664579 primes in 10.230172 seconds!
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 | #!/usr/bin/env python import time def GetPrimes3(n): Sieve = [1 for x in xrange(n)] for q in xrange (3, n, 2): k = q for y in xrange(k*k, n, k << 1): Sieve[y] = 0 return Sieve start = time.clock() d = 10000000 Primes = GetPrimes3(d) count = 1 #This is for 2 for x in xrange (3, d, 2): if Primes[x]: count+=1 elapsed = (time.clock() - start) print" Found", count,"primes in", elapsed,"seconds! " |
python Primes2.py
Found 664579 primes in 7.113776 seconds!
我知道比赛已经结束几年了。…
不过,这是我对纯Python基本筛选的建议,因为在向前处理筛选时,使用适当的步骤省略了2、3和5的倍数。尽管如此,n<10^9比@robert william hanks superior solutions rwh_primes2和rwh_primes1慢。通过使用1.5*10^8以上的ctypes.c_-ushort筛网阵列,它可以在某种程度上适应内存限制。
10 ^ 6
$python-mtimeit-s"导入primesheisepedcomp""primesheisepedcomp.primesheiseq(1000000)"。10个回路,最好是3个:每个回路46.7毫秒
to compare:$ python -mtimeit -s"import primeSieveSpeedComp"
"primeSieveSpeedComp.rwh_primes1(1000000)" 10 loops, best of 3: 43.2
msec per loop
to compare: $ python -m timeit -s"import primeSieveSpeedComp"
"primeSieveSpeedComp.rwh_primes2(1000000)" 10 loops, best of 3: 34.5
msec per loop
10 ^ 7
$python-mtimeit-s"导入primesheisepedcomp""primesheisepedcomp.primesheiseq(10000000)"。10个循环,每个循环最好3:530毫秒
to compare:$ python -mtimeit -s"import primeSieveSpeedComp"
"primeSieveSpeedComp.rwh_primes1(10000000)" 10 loops, best of 3: 494
msec per loop
to compare: $ python -m timeit -s"import primeSieveSpeedComp"
"primeSieveSpeedComp.rwh_primes2(10000000)" 10 loops, best of 3: 375
msec per loop
10 ^ 8
$python-mtimeit-s"导入primesheisepedcomp""primesheisepedcomp.primesheiseq(100000000)"。10圈,最好是每圈3:5.55秒
to compare: $ python -mtimeit -s"import primeSieveSpeedComp"
"primeSieveSpeedComp.rwh_primes1(100000000)" 10 loops, best of 3: 5.33
sec per loop
to compare: $ python -m timeit -s"import primeSieveSpeedComp"
"primeSieveSpeedComp.rwh_primes2(100000000)" 10 loops, best of 3: 3.95
sec per loop
10 ^ 9
$python-mtimeit-s"导入primesheisepedcomp""primesheisepedcomp.primesheiseq(100000000)"。10圈,最好是3圈:每圈61.2秒
to compare: $ python -mtimeit -n 3 -s"import primeSieveSpeedComp"
"primeSieveSpeedComp.rwh_primes1(1000000000)" 3 loops, best of 3: 97.8
sec per loopto compare: $ python -m timeit -s"import primeSieveSpeedComp"
"primeSieveSpeedComp.rwh_primes2(1000000000)" 10 loops, best of 3:
41.9 sec per loop
您可以将下面的代码复制到Ubuntus primeshespeedcomp中,以查看此测试。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 | def primeSieveSeq(MAX_Int): if MAX_Int > 5*10**8: import ctypes int16Array = ctypes.c_ushort * (MAX_Int >> 1) sieve = int16Array() #print 'uses ctypes"unsigned short int Array"' else: sieve = (MAX_Int >> 1) * [False] #print 'uses python list() of long long int' if MAX_Int < 10**8: sieve[4::3] = [True]*((MAX_Int - 8)/6+1) sieve[12::5] = [True]*((MAX_Int - 24)/10+1) r = [2, 3, 5] n = 0 for i in xrange(int(MAX_Int**0.5)/30+1): n += 3 if not sieve[n]: n2 = (n << 1) + 1 r.append(n2) n2q = (n2**2) >> 1 sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1) n += 2 if not sieve[n]: n2 = (n << 1) + 1 r.append(n2) n2q = (n2**2) >> 1 sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1) n += 1 if not sieve[n]: n2 = (n << 1) + 1 r.append(n2) n2q = (n2**2) >> 1 sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1) n += 2 if not sieve[n]: n2 = (n << 1) + 1 r.append(n2) n2q = (n2**2) >> 1 sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1) n += 1 if not sieve[n]: n2 = (n << 1) + 1 r.append(n2) n2q = (n2**2) >> 1 sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1) n += 2 if not sieve[n]: n2 = (n << 1) + 1 r.append(n2) n2q = (n2**2) >> 1 sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1) n += 3 if not sieve[n]: n2 = (n << 1) + 1 r.append(n2) n2q = (n2**2) >> 1 sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1) n += 1 if not sieve[n]: n2 = (n << 1) + 1 r.append(n2) n2q = (n2**2) >> 1 sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1) if MAX_Int < 10**8: return [2, 3, 5]+[(p << 1) + 1 for p in [n for n in xrange(3, MAX_Int >> 1) if not sieve[n]]] n = n >> 1 try: for i in xrange((MAX_Int-2*n)/30 + 1): n += 3 if not sieve[n]: r.append((n << 1) + 1) n += 2 if not sieve[n]: r.append((n << 1) + 1) n += 1 if not sieve[n]: r.append((n << 1) + 1) n += 2 if not sieve[n]: r.append((n << 1) + 1) n += 1 if not sieve[n]: r.append((n << 1) + 1) n += 2 if not sieve[n]: r.append((n << 1) + 1) n += 3 if not sieve[n]: r.append((n << 1) + 1) n += 1 if not sieve[n]: r.append((n << 1) + 1) except: pass return r |
我猜所有方法中最快的就是在代码中硬编码素数。
所以,为什么不写一个缓慢的脚本来生成另一个源文件,其中包含所有数字,然后在运行实际程序时导入该源文件呢?
当然,只有在编译时知道n的上限,这才有效,但是(几乎)所有项目Euler问题都是这样。
nbsp;
PS:我可能是错的,虽然iff用硬连线素数解析源代码比最初计算它们慢,但据我所知,python运行于编译的
很抱歉打扰您,但是erat2()在算法中有一个严重的缺陷。
在搜索下一个组合时,我们只需要测试奇数。q,p都是奇数;那么q+p是偶数,不需要测试,但q+2*p总是奇数。这就消除了while循环条件下的"if-even"测试,并节省了大约30%的运行时间。
当我们这样做的时候:不是优雅的"d.pop(q,none)"get and delete方法,而是使用"if q in d:p=d[q],del d[q]",速度是原来的两倍!至少在我的机器上(P3-1GHz)。所以我建议这个聪明的算法的实现:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 | def erat3( ): from itertools import islice, count # q is the running integer that's checked for primeness. # yield 2 and no other even number thereafter yield 2 D = {} # no need to mark D[4] as we will test odd numbers only for q in islice(count(3),0,None,2): if q in D: # is composite p = D[q] del D[q] # q is composite. p=D[q] is the first prime that # divides it. Since we've reached q, we no longer # need it in the map, but we'll mark the next # multiple of its witnesses to prepare for larger # numbers. x = q + p+p # next odd(!) multiple while x in D: # skip composites x += p+p D[x] = p else: # is prime # q is a new prime. # Yield it and mark its first multiple that isn't # already marked in previous iterations. D[q*q] = q yield q |
我可能会迟到,但必须为此添加我自己的代码。它在空间中使用大约n/2,因为我们不需要存储偶数,而且我还使用了bitarray python模块,进一步大大减少了内存消耗,使所有素数的计算都达到1000000000。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 | from bitarray import bitarray def primes_to(n): size = n//2 sieve = bitarray(size) sieve.setall(1) limit = int(n**0.5) for i in range(1,limit): if sieve[i]: val = 2*i+1 sieve[(i+i*val)::val] = 0 return [2] + [2*i+1 for i, v in enumerate(sieve) if v and i > 0] python -m timeit -n10 -s"import euler""euler.primes_to(1000000000)" 10 loops, best of 3: 46.5 sec per loop |
这是在64位2.4GHz Mac OSX 10.8.3上运行的。
这是一种数量庞大的埃拉托森筛子,它既具有很好的复杂性(低于对长度n的数组进行排序)又具有矢量化。与@unutbu相比,这个速度与46个微秒的软件包一样快,可以找到100万以下的所有素数。
1 2 3 4 5 6 7 8 | import numpy as np def generate_primes(n): is_prime = np.ones(n+1,dtype=bool) is_prime[0:2] = False for i in range(int(n**0.5)+1): if is_prime[i]: is_prime[i*2::i]=False return np.where(is_prime)[0] |
计时:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 | import time for i in range(2,10): timer =time.time() generate_primes(10**i) print('n = 10^',i,' time =', round(time.time()-timer,6)) >> n = 10^ 2 time = 5.6e-05 >> n = 10^ 3 time = 6.4e-05 >> n = 10^ 4 time = 0.000114 >> n = 10^ 5 time = 0.000593 >> n = 10^ 6 time = 0.00467 >> n = 10^ 7 time = 0.177758 >> n = 10^ 8 time = 1.701312 >> n = 10^ 9 time = 19.322478 |
随着时间的推移,我收集了几个质数筛。我电脑上最快的是:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 | from time import time # 175 ms for all the primes up to the value 10**6 def primes_sieve(limit): a = [True] * limit a[0] = a[1] = False #a[2] = True for n in xrange(4, limit, 2): a[n] = False root_limit = int(limit**.5)+1 for i in xrange(3,root_limit): if a[i]: for n in xrange(i*i, limit, 2*i): a[n] = False return a LIMIT = 10**6 s=time() primes = primes_sieve(LIMIT) print time()-s |
我对这个问题反应缓慢,但这似乎是一个有趣的练习。我用的是numpy,这可能是作弊,我怀疑这个方法是最快的,但应该很清楚。它只筛选一个引用其索引的布尔数组,并从所有真值的索引中提取素数。不需要模件。
1 2 3 4 5 6 7 8 9 | import numpy as np def ajs_primes3a(upto): mat = np.ones((upto), dtype=bool) mat[0] = False mat[1] = False mat[4::2] = False for idx in range(3, int(upto ** 0.5)+1, 2): mat[idx*2::idx] = False return np.where(mat == True)[0] |
下面是一种使用Python的列表理解生成素数(但不是最有效的)的有趣技术:
1 2 | noprimes = [j for i in range(2, 8) for j in range(i*2, 50, i)] primes = [x for x in range(2, 50) if x not in noprimes] |
你可以在这里找到例子和一些解释
到目前为止,我尝试过的最快的方法是基于python cookbook
1 2 3 4 5 6 7 8 9 10 11 12 13 14 | import itertools as it def erat2a( ): D = { } yield 2 for q in it.islice(it.count(3), 0, None, 2): p = D.pop(q, None) if p is None: D[q*q] = q yield q else: x = q + 2*p while x in D: x += 2*p D[x] = p |
有关加速的解释,请参阅此答案。
一般来说,如果需要快速的数字计算,python不是最佳选择。现在有很多更快(更复杂)的算法。例如,在我的计算机上,我得到了2.2秒的代码,而Mathematica得到了0.088005。
首先:你需要布景吗?
这是一个使用存储列表查找素数的优雅而简单的解决方案。从4个变量开始,你只需要测试奇数素数的除数,你只需要测试作为素数测试的数字的一半(测试9,11,13是否被17除无意义)。它测试以前存储的素数作为除数。`
1 2 3 4 5 6 7 8 9 | # Program to calculate Primes primes = [1,3,5,7] for n in range(9,100000,2): for x in range(1,(len(primes)/2)): if n % primes[x] == 0: break else: primes.append(n) print primes |
这就是你可以和别人比较的方法。
1 2 3 4 5 | # You have to list primes upto n nums = xrange(2, n) for i in range(2, 10): nums = filter(lambda s: s==i or s%i, nums) print nums |
这么简单…