Multithreading module slower than multiprocessing in python
I noticed that the threading module takes longer than multiprocessing to handle the same task.
```python
import time
import threading
import multiprocessing

def func():
    result = 0
    for i in xrange(10**8):
        result += i

num_jobs = 10

# 1. measure how long 10 consecutive func() executions take
t = time.time()
for _ in xrange(num_jobs):
    func()
print ("10 consistent func executions takes: {:.2f} seconds ".format(time.time() - t))

# 2. threading module, 10 jobs
jobs = []
for _ in xrange(num_jobs):
    jobs.append(threading.Thread(target=func, args=()))
t = time.time()
for job in jobs:
    job.start()
for job in jobs:
    job.join()
print ("10 func executions in parallel (threading module) takes: {:.2f} seconds ".format(time.time() - t))

# 3. multiprocessing module, 10 jobs
jobs = []
for _ in xrange(num_jobs):
    jobs.append(multiprocessing.Process(target=func, args=()))
t = time.time()
for job in jobs:
    job.start()
for job in jobs:
    job.join()
print ("10 func executions in parallel (multiprocessing module) takes: {:.2f} seconds ".format(time.time() - t))
```
Results:

```
10 consistent func executions takes: 25.66 seconds
10 func executions in parallel (threading module) takes: 46.00 seconds
10 func executions in parallel (multiprocessing module) takes: 7.92 seconds
```
1) Why is the implementation using the multiprocessing module faster than the one using the threading module?
2) Why do consecutive func executions take less time than running them with the threading module?
Both of your questions can be answered by this excerpt from the documentation (the Python glossary entry for the global interpreter lock):
> The mechanism used by the CPython interpreter to assure that only one thread executes Python bytecode at a time. This simplifies the CPython implementation by making the object model (including critical built-in types such as dict) implicitly safe against concurrent access. Locking the entire interpreter makes it easier for the interpreter to be multi-threaded, **at the expense of much of the parallelism afforded by multi-processor machines.**
Bold emphasis mine. You end up spending a lot of time switching between threads just to make sure they all run to completion. It's like cutting one rope into small pieces and then tying the pieces back together again, one strand at a time.
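To make the GIL's role concrete, here is a minimal sketch (not from the original post) showing that the lock only serializes CPU-bound bytecode: during blocking I/O such as `time.sleep()`, a thread releases the GIL, so I/O-bound threads really do overlap.

```python
import time
import threading

def io_task():
    # time.sleep() releases the GIL while the thread is blocked,
    # so other threads can run in the meantime.
    time.sleep(1)

# sequential: ~5 seconds for 5 one-second sleeps
t = time.time()
for _ in range(5):
    io_task()
serial = time.time() - t

# threaded: the 5 sleeps overlap, so this finishes in ~1 second
threads = [threading.Thread(target=io_task) for _ in range(5)]
t = time.time()
for th in threads:
    th.start()
for th in threads:
    th.join()
threaded = time.time() - t

print("serial: {:.2f}s, threaded: {:.2f}s".format(serial, threaded))
```

This is the flip side of your benchmark: `func()` is pure bytecode and never releases the GIL, so its threads can only take turns.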
The result with the multiprocessing module is as expected, since each process executes in its own address space. The processes are independent of one another, apart from being in the same process group and having the same parent process.