多线程模块比python中的多处理慢

Multithreading module slower than multiprocessing in python

本问题已经有最佳答案,请猛点这里访问。

我发现线程模块对同一任务的处理时间比多处理要长。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
import time
import threading
import multiprocessing

def func():
    result = 0
    for i in xrange(10**8):
        result += i

num_jobs = 10


# 1. measure, how long it takes 10 consistent func() executions
t = time.time()
for _ in xrange(num_jobs):
    func()
print ("10 consistent func executions takes:
{:.2f} seconds
"
.format(time.time() - t))        

# 2. threading module, 10 jobs
jobs = []
for _ in xrange(num_jobs):
    jobs.append(threading.Thread(target = func, args=()))
t = time.time()
for job in jobs:
    job.start()
for job in jobs:
    job.join()      
print ("10 func executions in parallel (threading module) takes:
{:.2f} seconds
"
.format(time.time() - t))        

# 3. multiprocessing module, 10 jobs
jobs = []
for _ in xrange(num_jobs):
    jobs.append(multiprocessing.Process(target = func, args=()))
t = time.time()
for job in jobs:
    job.start()
for job in jobs:
    job.join()
print ("10  func executions in parallel (multiprocessing module) takes:
{:.2f} seconds
"
.format(time.time() - t))

结果:

10 consistent func executions takes:
25.66 seconds

10 func executions in parallel (threading module) takes:
46.00 seconds

10 func executions in parallel (multiprocessing module) takes:
7.92 seconds

1)为什么使用多处理模块的实现比使用线程模块更好?

2)为什么一致的func执行比使用线程模块花费更少的时间?


您的两个问题都可以通过以下文档的摘录来回答:

The mechanism used by the CPython interpreter to assure that only one thread executes Python bytecode at a time. This simplifies the CPython implementation by making the object model (including critical built-in types such as dict) implicitly safe against concurrent access. Locking the entire interpreter makes it easier for the interpreter to be multi-threaded, at the expense of much of the parallelism afforded by multi-processor machines.

大胆强调我的。您最终会花费大量时间在线程之间切换,以确保它们都运行到完成。就像把一根绳子切成小块,然后再把它们绑在一起,一次一根绳子。

由于每个进程都在自己的地址空间中执行,所以多处理模块的结果与预期的一样。这些进程彼此独立,除了在同一个进程组中并且具有相同的父进程之外。