Python: passing output from a long-running process to another script

I am trying to run two Python scripts. script1 calls script2, and script2 is a long-running process that should pass its output back to script1 in real time.

Here is script 1:

from script2 import Test2

model_info = Test2()
info = model_info.test2_run()

print(info)

Here is script 2:

class Test2:
    def __init__(self):
        print("running")

    def test2_run(self):
        a = 100000
        for line in range(a):
            return line

How can I get script2 to keep feeding line back to script1?

Related discussion

Below are four different ways to achieve this. Assume you have two Python scripts, producer.py and consumer.py, as shown here.

producer.py

import multiprocessing
import threading

range_limit = 3


class LineProducer(object):
    def __init__(self, msg=''):
        print(self.prepare_line('Initializing %s%s' % (self.__class__.__name__, msg)))

    def prepare_line(self, line):
        # Prefix each line with the PID and thread ID that produced it.
        return '%d - %d : %s' % (multiprocessing.current_process().pid,
                                 threading.current_thread().ident, line)

    def run(self):
        # Plain generator: a line is produced only when the consumer asks for it.
        for line in range(range_limit):
            yield self.prepare_line(line)


class MultiThreadedLineProducer(LineProducer):
    def produce(self, q):
        for line in range(range_limit):
            q.put(self.prepare_line(line))

        q.put(None)  # sentinel: no more lines

    def run(self):
        q = multiprocessing.Queue()
        threading.Thread(target=self.produce, args=(q,)).start()

        while True:
            line = q.get(True)

            if line is None:
                break

            yield line


class MultiProcessedLineProducer(LineProducer):
    def produce(self, q):
        for line in range(range_limit):
            q.put(self.prepare_line(line))

        q.put(None)  # sentinel: no more lines

    def run(self):
        q = multiprocessing.Queue()
        multiprocessing.Process(target=self.produce, args=(q,)).start()

        while True:
            line = q.get(True)

            if line is None:
                break

            yield line


if __name__ == '__main__':
    for line in LineProducer(' inside a separate process').run():
        print(line)

consumer.py

import sys
from subprocess import Popen, PIPE, STDOUT
from producer import LineProducer, MultiThreadedLineProducer, MultiProcessedLineProducer

# The guard keeps the multiprocessing example safe on platforms that start
# child processes by spawning a fresh interpreter instead of forking.
if __name__ == '__main__':
    # using normal yield
    for line in LineProducer().run():
        sys.stdout.write(line + '\n')

    # using yield with multithreading
    for line in MultiThreadedLineProducer().run():
        sys.stdout.write(line + '\n')

    # using yield with multiprocessing
    for line in MultiProcessedLineProducer().run():
        sys.stdout.write(line + '\n')

    # using normal yield in a child process
    sys.stdout.flush()
    child = Popen([sys.executable, 'producer.py'], bufsize=0, shell=False,
                  stdout=PIPE, stderr=STDOUT)
    for line in child.stdout:
        # The pipe yields bytes, so decode before writing to the text stream.
        sys.stdout.write(line.decode())
    child.wait()

Now, if you run python consumer.py, it produces output similar to the following:

8834 - 140419442169664 : Initializing LineProducer
8834 - 140419442169664 : 0
8834 - 140419442169664 : 1
8834 - 140419442169664 : 2
8834 - 140419442169664 : Initializing MultiThreadedLineProducer
8834 - 140419409151744 : 0
8834 - 140419409151744 : 1
8834 - 140419409151744 : 2
8834 - 140419442169664 : Initializing MultiProcessedLineProducer
8837 - 140419442169664 : 0
8837 - 140419442169664 : 1
8837 - 140419442169664 : 2
8839 - 140280258066240 : Initializing LineProducer inside a separate process
8839 - 140280258066240 : 0
8839 - 140280258066240 : 1
8839 - 140280258066240 : 2

The output format is PID - ThreadID : Message, where PID is the process ID and ThreadID is the identifier of the thread that generated the Message.

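In the first group of output, every line has the same PID and thread ID. Here supply is driven by demand: run() is a generator, so a line is produced only when the consumer asks for the next one.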
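
To make the demand-driven behaviour concrete, here is a minimal sketch that applies the same idea to the Test2 class from the question: replacing return with yield turns test2_run into a generator. The limit parameter and the produced list are illustrative additions that are not part of the original scripts; they exist only to show how many values were actually generated.

```python
from itertools import islice


class Test2:
    def __init__(self, limit=100000):
        self.limit = limit    # hypothetical parameter, for illustration only
        self.produced = []    # records every value that was actually generated

    def test2_run(self):
        for line in range(self.limit):
            self.produced.append(line)
            yield line        # hand one value back, then pause until asked again


model_info = Test2()

# Ask for only the first three values of the long-running loop.
first_three = list(islice(model_info.test2_run(), 3))

print(first_three)
print(len(model_info.produced))
```

In script1, a plain for loop over model_info.test2_run() would receive each value as soon as it is produced, and nothing beyond what the loop asks for is computed.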

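In the second group the PID stays the same, but the produced lines carry a different thread ID, because the producer and the consumer run in different threads. Lines are produced regardless of consumer demand. A Python thread is a native thread (a pthread, for example), but because of the global interpreter lock in CPython, no two threads can execute Python bytecode at the same time, so you do not get true parallelism.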
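
The following sketch illustrates the "regardless of demand" point under simplified assumptions: it uses the standard queue.Queue rather than the multiprocessing.Queue from producer.py, and the produce helper is a hypothetical stand-in for the producer method. The worker thread finishes all of its work before the main thread reads a single item.

```python
import queue
import threading


def produce(q, count):
    # Runs in the worker thread and tags each item with that thread's ID.
    for line in range(count):
        q.put((threading.current_thread().ident, line))


q = queue.Queue()  # unbounded, so put() never blocks
worker = threading.Thread(target=produce, args=(q, 3))
worker.start()
worker.join()      # the producer has finished before anything is consumed

queued_before_reading = q.qsize()
items = [q.get() for _ in range(queued_before_reading)]
producer_ids = {ident for ident, _ in items}

print(queued_before_reading)
print(threading.current_thread().ident in producer_ids)
```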

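In the third group the PID is different, which means the producer runs in a separate process that was forked from the current one. This gives true parallelism and can make effective use of multiple CPU cores. As with multithreading, lines are generated regardless of consumer demand.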

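Both the multithreaded and the multiprocess variants use a queue to communicate between threads or processes. You can limit line production by specifying a maximum number of items when creating the queue. Lines are then produced until the queue is full, and production resumes as lines are consumed from the queue.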
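
A minimal sketch of such a bounded queue follows. It assumes the standard queue.Queue with a maxsize argument for the threaded case (multiprocessing.Queue accepts a maxsize argument as well); the names SENTINEL, produce, and consume are illustrative and do not come from producer.py.

```python
import queue
import threading

SENTINEL = None  # marks the end of production


def produce(q, count):
    for line in range(count):
        q.put(line)   # blocks while the queue already holds maxsize items
    q.put(SENTINEL)


def consume(maxsize=2, count=5):
    q = queue.Queue(maxsize=maxsize)
    threading.Thread(target=produce, args=(q, count), daemon=True).start()
    while True:
        line = q.get()  # each get() frees a slot, letting the producer resume
        if line is SENTINEL:
            break
        yield line


received = list(consume())
print(received)

# The limit itself can be observed without threads: a non-blocking put on a
# full queue raises queue.Full.
full_q = queue.Queue(maxsize=2)
full_q.put(0)
full_q.put(1)
try:
    full_q.put_nowait(2)
    hit_limit = False
except queue.Full:
    hit_limit = True
print(hit_limit)
```

With maxsize=2 the producer can never be more than two lines ahead of the consumer, which bounds memory use when the producer is much faster than the consumer.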

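In the last group, a process is created with the fork/exec mechanism and its image is replaced with the specified executable. The output is the same as in the first group, but with a different PID and thread ID. The difference from the third method is that you cannot use a queue to communicate between the processes and have to rely on pipes or other IPC mechanisms. Also, producer.py must be runnable as a Python script. In this case, lines are produced independently of consumer demand.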
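
One practical detail with the pipe-based approach is that a Python child usually block-buffers its stdout when it is connected to a pipe, so lines may arrive in bursts rather than in real time. The sketch below is one way to read the child's output line by line, assuming the interpreter's -u (unbuffered) flag is acceptable. CHILD_CODE is a hypothetical inline stand-in for producer.py so that the example is self-contained, and stream_lines is an illustrative helper name.

```python
import subprocess
import sys

# Hypothetical stand-in for producer.py, passed inline with -c.
CHILD_CODE = "for i in range(3): print('line %d' % i)"


def stream_lines(code):
    proc = subprocess.Popen(
        [sys.executable, '-u', '-c', code],  # -u: do not buffer the child's stdout
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,             # decode bytes to str
    )
    with proc.stdout:
        for line in proc.stdout:             # yields each line as it arrives
            yield line.rstrip('\n')
    proc.wait()


lines = list(stream_lines(CHILD_CODE))
print(lines)
```

To run a real script instead, the command list would become [sys.executable, '-u', 'producer.py'].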