subprocess: Stop reading process output in Python without hanging?

I have a Python program for Linux that looks roughly like this:

import os
import time

process = os.popen("top").readlines()

time.sleep(1)

os.popen("killall top")

print(process)

The program hangs at this line:

process = os.popen("top").readlines()

This happens with tools that keep updating their output, such as "top".

My best attempt:

import os
import time
import subprocess

process = subprocess.Popen('top')

time.sleep(2)

os.popen("killall top")

print(process)

It worked better than the first one (top got killed), but it printed:

<subprocess.Popen object at 0x97a50cc>

Second attempt:

import os
import time
import subprocess

process = subprocess.Popen('top').readlines()

time.sleep(2)

os.popen("killall top")

print(process)

Same as the first one: it hung because of "readlines()".

The output should look like this:

top - 05:31:15 up 12:12, 5 users, load average: 0.25, 0.14, 0.11
Tasks: 174 total, 2 running, 172 sleeping, 0 stopped, 0 zombie
Cpu(s): 9.3%us, 3.8%sy, 0.1%ni, 85.9%id, 0.9%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1992828k total, 1849456k used, 143372k free, 233048k buffers
Swap: 4602876k total, 0k used, 4602876k free, 1122780k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31735 Barakat 20 0 246m 52m 20m S 19.4 2.7 13:54.91 totem
1907 root 20 0 91264 45m 15m S 1.9 2.3 38:54.14 Xorg
2138 Barakat 20 0 17356 5368 4284 S 1.9 0.3 3:00.15 at-spi-registry
2164 Barakat 9 -11 164m 7372 6252 S 1.9 0.4 2:54.58 pulseaudio
2394 Barakat 20 0 27212 9792 8256 S 1.9 0.5 6:01.48 multiload-apple
6498 Barakat 20 0 56364 30m 18m S 1.9 1.6 0:03.38 pyshell
1 root 20 0 2880 1416 1208 S 0.0 0.1 0:02.02 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.02 kthreadd
3 root RT 0 0 0 0 S 0.0 0.0 0:00.12 migration/0
4 root 20 0 0 0 0 S 0.0 0.0 0:02.07 ksoftirqd/0
5 root RT 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/0
9 root 20 0 0 0 0 S 0.0 0.0 0:01.43 events/0
11 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuset
12 root 20 0 0 0 0 S 0.0 0.0 0:00.02 khelper
13 root 20 0 0 0 0 S 0.0 0.0 0:00.00 netns
14 root 20 0 0 0 0 S 0.0 0.0 0:00.00 async/mgr
15 root 20 0 0 0 0 S 0.0 0.0 0:00.00 pm

and be saved in the variable "process". Any ideas, guys? I'm really stuck now.
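For reference, the attempts above fail for two separate reasons. readlines() returns only at end-of-file, which a continuously updating tool never reaches, and Popen without stdout=subprocess.PIPE lets the child write straight to the terminal, so there is nothing for Python to read. Below is a minimal Python 3 sketch of reading a bounded number of lines and then stopping the child. The helper name read_first_lines and the line count are illustrative assumptions, not part of the original post, and the call still blocks if the child writes fewer lines than requested without exiting.

```python
import itertools
import subprocess


def read_first_lines(cmd, count):
    """Return the first `count` lines of `cmd`'s stdout, then stop the child."""
    # stdout=PIPE is what makes the output readable from Python;
    # universal_newlines=True yields str lines instead of bytes
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                               universal_newlines=True)
    try:
        # islice stops after `count` lines instead of waiting for EOF
        return list(itertools.islice(process.stdout, count))
    finally:
        process.terminate()
        process.stdout.close()
        process.wait()  # reap the child so it does not linger as a zombie


if __name__ == "__main__":
    # assumption: procps `top`, whose -b flag selects batch (plain text) mode
    print("".join(read_first_lines(["top", "-b"], 10)), end="")
```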

Related discussion

#!/usr/bin/env python3
"""Start process; wait 2 seconds; kill the process; print all process output."""
import subprocess
import tempfile
import time

def main():
    # open a temporary file (it is deleted automatically when it is closed);
    # `Popen` requires `f.fileno()`, so `SpooledTemporaryFile` adds nothing here
    f = tempfile.TemporaryFile(mode="w+")  # text mode, so f.read() returns str

    # start process, redirect stdout
    p = subprocess.Popen(["top"], stdout=f)

    # wait 2 seconds
    time.sleep(2)

    # kill process
    # NOTE: if it doesn't kill the process then `p.wait()` blocks forever
    p.terminate()
    p.wait()  # wait for the process to terminate, otherwise the output is garbled

    # print saved output
    f.seek(0)  # rewind to the beginning of the file
    print(f.read(), end="")
    f.close()

if __name__ == "__main__":
    main()
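On Python 3.3 and later, the same "run for a while, then collect everything" idea can also be written without a temporary file, using the timeout parameter of Popen.communicate(). This is a sketch of that alternative rather than the answer's own method; the helper name run_for is an assumption made for illustration.

```python
import subprocess


def run_for(cmd, seconds):
    """Run `cmd` for at most `seconds`; return whatever it wrote to stdout (bytes)."""
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        output, _ = process.communicate(timeout=seconds)
    except subprocess.TimeoutExpired:
        process.kill()  # SIGKILL on POSIX; the child cannot ignore it
        # retrying communicate() after a timeout does not lose any output
        output, _ = process.communicate()
    return output


if __name__ == "__main__":
    # assumption: procps `top`, whose -b flag selects batch (plain text) mode
    print(run_for(["top", "-b"], 2).decode(errors="replace"), end="")
```

Unlike the temporary-file version, all output is held in memory, so this suits commands whose output over the timeout window is modest.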

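Tail-like solution that prints only a portion of the output

You could read the process output in another thread and save the required number of last lines in a queue: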

import collections
import subprocess
import time
import threading

def read_output(process, append):
    for line in iter(process.stdout.readline, ""):
        append(line)

def main():
    # start process, redirect stdout (text mode, so lines are str)
    process = subprocess.Popen(["top"], stdout=subprocess.PIPE, close_fds=True,
                               universal_newlines=True)
    try:
        # save last `number_of_lines` lines of the process output
        number_of_lines = 200
        q = collections.deque(maxlen=number_of_lines)  # atomic .append()
        t = threading.Thread(target=read_output, args=(process, q.append))
        t.daemon = True
        t.start()

        # wait 2 seconds
        time.sleep(2)
    finally:
        process.terminate()  # NOTE: it doesn't ensure the process termination

    # print saved lines
    print(''.join(q))

if __name__ == "__main__":
    main()

This variant requires q.append() to be an atomic operation. Otherwise the output may be corrupted.

signal.alarm() solution

You could use signal.alarm() to call process.terminate() after a specified timeout instead of reading in another thread. It might not interact well with the subprocess module, though. Based on @Alex Martelli's answer:

import collections
import signal
import subprocess

class Alarm(Exception):
    pass

def alarm_handler(signum, frame):
    raise Alarm

def main():
    # start process, redirect stdout (text mode, so lines are str)
    process = subprocess.Popen(["top"], stdout=subprocess.PIPE, close_fds=True,
                               universal_newlines=True)

    # set signal handler
    signal.signal(signal.SIGALRM, alarm_handler)
    signal.alarm(2)  # produce SIGALRM in 2 seconds

    try:
        # save last `number_of_lines` lines of the process output
        number_of_lines = 200
        q = collections.deque(maxlen=number_of_lines)
        for line in iter(process.stdout.readline, ""):
            q.append(line)
        signal.alarm(0)  # cancel alarm
    except Alarm:
        process.terminate()
    finally:
        # print saved lines
        print(''.join(q))

if __name__ == "__main__":
    main()

This approach works only on *nix systems. It may block if process.stdout.readline() doesn't return.

threading.Timer solution

import collections
import subprocess
import threading

def main():
    # start process, redirect stdout (text mode, so lines are str)
    process = subprocess.Popen(["top"], stdout=subprocess.PIPE, close_fds=True,
                               universal_newlines=True)

    # terminate process in timeout seconds
    timeout = 2  # seconds
    timer = threading.Timer(timeout, process.terminate)
    timer.start()

    # save last `number_of_lines` lines of the process output
    number_of_lines = 200
    q = collections.deque(process.stdout, maxlen=number_of_lines)
    timer.cancel()

    # print saved lines
    print(''.join(q), end="")

if __name__ == "__main__":
    main()

This approach should also work on Windows. Here I've used process.stdout as an iterable; this may introduce additional output buffering, and you can switch to the iter(process.stdout.readline, "") approach if that is undesirable. If the process doesn't terminate on process.terminate(), the script hangs.
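The hang mentioned above, where the child does not exit on process.terminate(), can be bounded by escalating to process.kill() after a grace period. A minimal Python 3 sketch follows; the helper name stop and the default grace period are assumptions for illustration, not part of the answer.

```python
import subprocess


def stop(process, grace=1.0):
    """Ask `process` to exit; force-kill it if it is still alive after `grace` seconds."""
    process.terminate()  # SIGTERM on POSIX; the child may catch or ignore it
    try:
        return process.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        process.kill()  # SIGKILL on POSIX; cannot be caught or ignored
        return process.wait()
```

It could be passed to the timer in place of process.terminate, for example threading.Timer(timeout, stop, args=(process,)).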

No threads, no signals solution

import collections
import subprocess
import sys
import time

def main():
    args = sys.argv[1:]
    if not args:
        args = ['top']

    # start process, redirect stdout (text mode, so lines are str)
    process = subprocess.Popen(args, stdout=subprocess.PIPE, close_fds=True,
                               universal_newlines=True)

    # save last `number_of_lines` lines of the process output
    number_of_lines = 200
    q = collections.deque(maxlen=number_of_lines)

    timeout = 2  # seconds
    now = start = time.time()
    while (now - start) < timeout:
        line = process.stdout.readline()
        if not line:
            break
        q.append(line)
        now = time.time()
    else:  # on timeout
        process.terminate()

    # print saved lines
    print(''.join(q), end="")

if __name__ == "__main__":
    main()

This variant uses neither threads nor signals, but it produces garbled output in the terminal. It will block if process.stdout.readline() blocks.
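The blocking readline() noted above can be avoided on POSIX systems by waiting on the pipe with select.select() and reading with os.read(), which returns whatever is available instead of waiting for a full line. This is a sketch under those assumptions (select() does not work on pipes on Windows); the helper name read_for is illustrative.

```python
import os
import select
import subprocess
import time


def read_for(cmd, timeout):
    """Collect `cmd`'s stdout for at most `timeout` seconds without blocking on a read."""
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    fd = process.stdout.fileno()
    chunks = []
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            process.terminate()  # timed out: stop the child
            break
        # wait until the pipe has data, but never past the deadline
        readable, _, _ = select.select([fd], [], [], remaining)
        if not readable:
            continue  # select timed out; the next pass hits the deadline check
        data = os.read(fd, 4096)  # returns at once with whatever is available
        if not data:
            break  # EOF: the child closed its stdout
        chunks.append(data)
    process.stdout.close()
    process.wait()
    return b"".join(chunks)


if __name__ == "__main__":
    # assumption: procps `top`, whose -b flag selects batch (plain text) mode
    print(read_for(["top", "-b"], 2).decode(errors="replace"), end="")
```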

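Related discussion

Instead of using "top", I suggest using "ps", which gives you the same information, but only once instead of once per second.

You'll also need to pass some flags to ps; I tend to use "ps aux".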
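Because ps exits on its own, its output can be captured with a single blocking call and no timeout handling. A minimal Python 3 sketch; the helper name snapshot and its default argument are assumptions for illustration.

```python
import subprocess


def snapshot(cmd=("ps", "aux")):
    """Run a command that exits on its own and return its stdout as a list of lines."""
    # check_output waits for the command to finish and raises
    # CalledProcessError if it exits with a non-zero status
    output = subprocess.check_output(list(cmd), universal_newlines=True)
    return output.splitlines()


if __name__ == "__main__":
    for line in snapshot()[:10]:
        print(line)
```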

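Related discussion

In fact, if you fill the output buffer, you will get some answer. So one solution is to fill the buffer with a large amount of garbage output (~6000 characters with bufsize=1).

Say that, instead of top, you have a Python script that writes to sys.stdout: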

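import sys

GARBAGE = '.\n'
sys.stdout.write(valuable_output)  # valuable_output: the real data, defined elsewhere
sys.stdout.write(GARBAGE*3000)  # padding that pushes the data through the buffer

On the launcher side, instead of a simple process.readline():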

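GARBAGE = '.\n'
line = process.readline()
while line == GARBAGE:  # skip the padding lines
    line = process.readline()

Sure, it's a bit dirty, since the amount of padding needed depends on the subprocess implementation, but it works fine and is very simple. Setting anything other than bufsize=1 makes things worse.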
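When the child is a Python script you control, an explicit flush is an alternative to padding the buffer: the data goes through the pipe as soon as flush() is called. This sketch swaps in that technique and does not apply to programs you cannot modify, such as top. The child script, CHILD_CODE, and first_line are hypothetical names made up for illustration.

```python
import subprocess
import sys

# Hypothetical child: writes one valuable line, flushes, then keeps running.
CHILD_CODE = """
import sys, time
sys.stdout.write("valuable output\\n")
sys.stdout.flush()  # push the line through the pipe immediately
time.sleep(30)
"""


def first_line():
    """Start the child and return its first line without waiting for it to exit."""
    process = subprocess.Popen([sys.executable, "-c", CHILD_CODE],
                               stdout=subprocess.PIPE, universal_newlines=True)
    try:
        return process.stdout.readline()
    finally:
        process.kill()
        process.stdout.close()
        process.wait()


if __name__ == "__main__":
    print(first_line(), end="")
```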

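(J.F. Sebastian, your code works great; I think it's better than my solution =))

I solved it another way.

Instead of writing the output directly to the terminal, I redirect it into a file, "tmp_file":

top >> tmp_file

Then I use the tool "cat" to read the top output back as the value of process:

cat tmp_file

It did what I wanted it to do. This is the final code: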

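import os
import subprocess
import time

subprocess.Popen("top >> tmp_file", shell=True)

time.sleep(1)

os.popen("killall top")

process = os.popen("cat tmp_file").read()

os.popen("rm tmp_file")

print(process)

# Something is better than nothing =)

Thank you all very much for your help.

Related discussion

What I would do, rather than this approach, is examine the program you are trying to get information from and determine the ultimate source of that information. It may be an API call or a device node. Then write some Python that gets it from the same source. That eliminates the problems and overhead of "scraping" "cooked" data.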
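On Linux, for instance, top takes its figures from the /proc filesystem, so a script can read the same files directly. A minimal sketch, assuming a Linux system with /proc mounted; the helper names parse_meminfo and read_meminfo are illustrative.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: int}; most values are in kB."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "name: value" shape
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def read_meminfo(path="/proc/meminfo"):
    """Read memory statistics straight from the kernel (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())


if __name__ == "__main__":
    print("load average:", os.getloadavg())
    mem = read_meminfo()
    print("MemTotal: %d kB, MemFree: %d kB" % (mem["MemTotal"], mem["MemFree"]))
```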