Python: Constantly print Subprocess output while process is running

To launch programs from my Python scripts, I'm using the following method:

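import subprocess

def execute(command):
    # Run the command through the shell and merge stderr into stdout.
    process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    # communicate() blocks until the process exits, then returns all output at once.
    output = process.communicate()[0]
    exitCode = process.returncode

    if exitCode == 0:
        return output
    else:
        # ProcessException is the asker's own exception class (not shown here).
        raise ProcessException(command, exitCode, output)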

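So when I launch a process like Process.execute("mvn clean install"), my program waits until the process has finished, and only then do I get the complete output of the program. This is annoying if I'm running a process that takes a while to finish.

Can I make my program write the process output line by line, by polling the process output in a loop before it finishes?

**[Edit] Sorry, I didn't search very well before posting this question. Threading is actually the key. Found an example here that shows how to do it:** Python Subprocess.Popen from a thread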


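You can use iter to process lines as soon as the command outputs them: lines = iter(fd.readline, ""). Here is a full example showing a typical use case (thanks to @jfs for helping out):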

from __future__ import print_function  # Only Python 2.x
import subprocess

def execute(cmd):
    popen = subprocess.Popen(cmd, stdout=subprocess.PIPE, universal_newlines=True)
    # readline returns "" at EOF, which stops the iterator.
    for stdout_line in iter(popen.stdout.readline, ""):
        yield stdout_line
    popen.stdout.close()
    return_code = popen.wait()
    if return_code:
        raise subprocess.CalledProcessError(return_code, cmd)

# Example
for path in execute(["locate", "a"]):
    print(path, end="")

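Related discussion

I tried this code (with a program that takes a significant time to run) and can confirm that it outputs lines as they are received, rather than waiting for execution to complete. This is the best answer in my opinion.
Note: in Python 3, you could use for line in popen.stdout: print(line.decode(), end=''). To support both Python 2 and 3, use a bytes literal: b'', otherwise lines_iterator never ends on Python 3.
The problem with this approach is that if the process pauses for a bit without writing anything to stdout, there is no more input to read. You will need a loop to check whether the process has finished. I tried this using subprocess32 on Python 2.7.
That is wrong. The loop ends when the subprocess dies (on EOF). There is no need to check whether the process is alive.
Do not use PIPE unless you read from the pipe while the process is running, otherwise the child process may hang (the edit that introduced stderr=PIPE is wrong). Reading from multiple pipes requires more complex code (threads, asyncio).
It should work. To polish it, you could add bufsize=1 (it may improve performance on Python 2), close the popen.stdout pipe explicitly (without waiting for garbage collection to take care of it), and raise subprocess.CalledProcessError (as check_call() and check_output() do). The print statement differs between Python 2 and Python 3: you could use the softspace hack print line, (note the comma) to avoid doubling all the newlines as your code does, and pass universal_newlines=True on Python 3 to get text instead of bytes (see the related answer).
Thanks, @j.f.sebastian, all your suggestions look good now!
Awesome snippet.
@tokland, I think I was wrong. It looked as if it ended without reading all of stdout, but there was actually output on stderr and the whole process ended with an error, which is why EOF was reached.
However, it does not work well if it calls a Python script as the subprocess and that script uses time.sleep(). The iteration over lines blocks.
@binzhang: could you upload the script you are running somewhere so it can be checked?
@tokland child_process.py: import time; import sys; while True: print 'hello_world'; time.sleep(1); sys.stdout.flush(). In the calling process above: for path in execute(["python", "child_thread.py"]): print(path, end="").
@tokland Without sys.stdout.flush(), the iteration over lines blocks. Without both sys.stdout.flush() and time.sleep, the iteration does not block.
@binzhang That is not a bug: stdout is buffered by default for Python scripts (and for many Unix tools too). Try execute(["python", "-u", "child_thread.py"]). More info: stackoverflow.com/questions/14258500/…
Yes, thank you, "-u" makes sense.
@binzhang: related: Python C program subprocess hangs at "for line in iter" (useful in the general case, for programs that have no -u but offer something like grep's --line-buffered).
Could you explain the second argument of print(), end=""? When I try to run this on Python 3.5 it is treated as a syntax error, and I don't know what it does.
I'm confused. This should work on 3.5. More info in the docs: docs.python.org/3/library/functions.html#print
When constructing the Popen you should also set stderr=subprocess.STDOUT to make sure no error messages are missed.
@tvt173 Sometimes you need to capture stderr, and sometimes showing it on the terminal is fine. Anyway, I'm not sure what the best way to read from both at the same time is, @j.f.sebastian?
@tokland, very slick code. After seeing your code I had to read a lot about "yield" and iterables, but I still can't quite put it all together. Could you explain in more detail how your code works? What was your intuition for using yield in this case?
This does not work with the scp command. I tried to send a file but got no output, although it seems to work with the tree command.
I guess that is because scp auto-detects whether there is a terminal. There is some info on SO, for example: stackoverflow.com/questions/3890809/…
@tokland Thanks for the reference. I have a working implementation using rsync: askubuntu.com/questions/44059/progress-bar-for-scp-command. With this I was able to get progress updates for the upload.
@tokland, I tried running this with a while loop in the subprocess, and it works well. However, if I put in time.sleep(1) so that the loop runs only once per second, nothing gets printed until the process completes. Any ideas?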
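
Several comments above turn on the sentinel passed to iter(): it has to match what readline() returns at EOF, which is "" for a text-mode pipe and b"" for a binary one. The following is a minimal sketch of that point, not part of the original answer; the CHILD command and the read_lines helper are made up for illustration, and a Python 3 interpreter is assumed.

```python
import subprocess
import sys

# Hypothetical child command: prints two lines and exits.
CHILD = [sys.executable, "-c", "print('one'); print('two')"]

def read_lines(text_mode):
    """Collect the child's lines, using the sentinel that matches the pipe mode."""
    # Text pipes return "" at EOF, binary pipes return b"".
    sentinel = "" if text_mode else b""
    proc = subprocess.Popen(CHILD, stdout=subprocess.PIPE,
                            universal_newlines=text_mode)
    lines = list(iter(proc.stdout.readline, sentinel))
    proc.stdout.close()
    proc.wait()
    return lines

print(read_lines(text_mode=True))   # list of str
print(read_lines(text_mode=False))  # list of bytes
```

With the wrong sentinel (for example "" on a binary pipe) the iterator would never see a matching value, so the loop would keep receiving b"" forever once the child has exited.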

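OK, I managed to solve it without threads (any suggestions on why using threads would be better are appreciated) by using a snippet from this question: Intercepting stdout of a subprocess while it is running.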

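import subprocess
import sys

def execute(command):
    process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

    # Poll process for new output until finished.
    # Note: on Python 3 readline() returns bytes here, so the comparison
    # below would need b'' (or the Popen would need universal_newlines=True).
    while True:
        nextline = process.stdout.readline()
        if nextline == '' and process.poll() is not None:
            break
        sys.stdout.write(nextline)
        sys.stdout.flush()

    # The loop has already consumed stdout, so this mainly waits for exit.
    output = process.communicate()[0]
    exitCode = process.returncode

    if exitCode == 0:
        return output
    else:
        # ProcessException is the asker's own exception class (not shown here).
        raise ProcessException(command, exitCode, output)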

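Related discussion

Combining ifischer's and tokland's code works very well (I had to change print line, to sys.stdout.write(nextline); sys.stdout.flush(), otherwise it printed only every second line). Then again, this is in IPython's notebook interface, so maybe something else was going on. Either way, calling flush() explicitly works.
Sir, you are my lifesaver!! It is really strange that this kind of thing is not built into the library itself, because if I write a CLI app I want to show everything that is being processed in the loop immediately…
Can this solution be modified to constantly print both output and errors? If I change stderr=subprocess.STDOUT to stderr=subprocess.PIPE and then call process.stderr.readline() from within the loop, I seem to run into the very deadlock that the subprocess module documentation warns about.
I think what you are looking for is stdout=subprocess.PIPE, stderr=subprocess.STDOUT, which captures stderr, and I believe (but I have not tested it) that it also captures stdin.
Thanks for the part that waits for the exit code. I didn't know how to work that out.
@vitalyisaev: there is no need to poll the exit status inside the loop; you can call rc = process.wait() after the loop ends.
Hi all, one small doubt: we break when process.poll() is not None. Does that mean it becomes None every time it polls for output?
We are using stdout=subprocess.PIPE and stderr=subprocess.STDOUT. Can someone point me to a good reference on what happens with the different combinations, for example stdout=subprocess.STDOUT, stderr=subprocess.PIPE and so on?
execute("ls") runs forever, and keeps outputting ''. I ran this on Python 3 and needed to add str() here: nextline = str(process.stdout.readline()). Do you know why it doesn't want to exit?
@f1sher: readline is probably returning b"" rather than the "" used in the code above. Try if nextline == b"" and...
Thank you very much!
The problem is that it blocks while waiting for output, so you cannot do any other work (such as checking for a timeout) until the program writes another line to stdout/stderr. For a complete solution you could use a thread/queue, a la stackoverflow.com/questions/375427/…
How can this work without creating a scrollbar and repeated output? It seems to just mark and update the screen, but if I select a line, it starts scrolling the selected line over and over. If I let it run, it only "updates" the existing screen. Any insight would be greatly appreciated.
For programs that produce output extremely fast, this approach does not print all lines. If a program can produce output very quickly, for example printing several lines before the next call to readline, those lines are dropped.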
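
The question raised above about printing stdout and stderr separately without deadlocking can also be approached with asyncio, which one of the earlier comments mentions in passing. What follows is only a sketch under assumptions: Python 3.7+, and a made-up child command (CHILD_CODE); the pump and run_both names are illustrative and do not come from the thread.

```python
import asyncio
import sys

# Hypothetical child: writes one line to stdout and one to stderr.
CHILD_CODE = "import sys; print('out line'); print('err line', file=sys.stderr)"

async def pump(stream, label, sink):
    """Read one pipe line by line until EOF, recording (label, text) pairs."""
    while True:
        raw = await stream.readline()
        if not raw:  # b"" signals EOF
            break
        text = raw.decode().rstrip()
        print(label, text)
        sink.append((label, text))

async def run_both(argv):
    """Start the child and drain both pipes concurrently, then return its exit code."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    seen = []
    # Both pipes are read at the same time, so neither can fill up and stall the child.
    await asyncio.gather(pump(proc.stdout, "stdout", seen),
                         pump(proc.stderr, "stderr", seen))
    return await proc.wait(), seen

return_code, seen = asyncio.run(run_both([sys.executable, "-c", CHILD_CODE]))
```

Because each pipe has its own reader task, a blocked readline() on one stream does not stop the other from being drained, which is the situation that caused the deadlock described in the comment.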

To print a subprocess's output line by line as soon as its stdout buffer is flushed, in Python 3:

from subprocess import Popen, PIPE, CalledProcessError

# cmd is the command to run, for example a list of arguments.
with Popen(cmd, stdout=PIPE, bufsize=1, universal_newlines=True) as p:
    for line in p.stdout:
        print(line, end='')  # process line here

if p.returncode != 0:
    raise CalledProcessError(p.returncode, p.args)

Notice: you do not need p.poll(); the loop ends when EOF is reached. And you do not need iter(p.stdout.readline, '') either; the read-ahead bug is fixed in Python 3.

See also Python: read streaming input from subprocess.communicate().

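Related discussion

This solution worked for me. The accepted solution given above just printed blank lines for me.
I had to add sys.stdout.flush() to get the prints immediately.
@codename: you should not need sys.stdout.flush() in the parent: stdout is line-buffered if it is not redirected to a file/pipe, so printing line flushes the buffer automatically. You do not need sys.stdout.flush() in the child either; pass the -u command-line option instead.
@J.F.Sebastian Sorry, I should have mentioned that I am redirecting the output to a file.
@codename: if it is redirected to a file, why do you need sys.stdout.flush()? Are you monitoring the file with tail -f? Have you considered using check_call(cmd, stdout=file_object) instead?
@J.F.Sebastian Yes, I want to be able to tail the file. Also, I want to be able to dump the output the usual way with the > operator on the command line, without a hardcoded file name.
@codename: if you want to use >, then run python -u your-script.py > some-file. Note the -u option that I mentioned above (no need to use sys.stdout.flush()).
I have a few constraints. Not everyone who uses my script will have the python command aliased to a Python 3 binary, and I want to make it as simple as possible for them to use. Are there any major drawbacks of sys.stdout.flush() that I should be concerned about?
@codename: sprinkling your code with sys.stdout.flush() may hurt performance, and it is error-prone. I don't see how -u is related to what the python command is aliased to.
Yes, you are right about -u, but that means all users would have to add an argument when running it, right…
@codename: no, that is not what it means. I have answered the question as stated. How to satisfy additional requirements depends on the specific case. If you want unbuffered output in all cases, replace sys.stdout with an unbuffered object, or redirect it. To avoid modifying the code, you could create a shell script that sets the appropriate command-line arguments and environment variables for the python executable. As a quick and dirty hack, you could pass flush=True to the print() function.
For my case (a Python-based build script running under Jenkins) this happens to be the best answer on this page. But I think it is worth adding code at the end to get the return code: return_code = p.wait().
@mvidelgauz there is no need to call p.wait(); it is called on exit from the with block. Use p.returncode.
@J.F.Sebastian In my case p.returncode gave None, hence my comment :) (Python 3.6 on Windows, Python 3.4 on Ubuntu)
@mvidelgauz: it cannot be None outside the with statement unless an exception happened. Copy-paste the code as is. Code indentation is significant in Python.
@J.F.Sebastian, thanks, I didn't realize that 'p' exists outside the with; I had put it after the for but still inside the with.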
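
Two points from this exchange can be put together in one place: flush=True on print() when the parent's own stdout is redirected to a file, and reading p.returncode only after the with block has exited. The sketch below assumes Python 3; the stream_and_log helper and the child command are invented for illustration, and an in-memory io.StringIO stands in for a real log file.

```python
import io
import subprocess
import sys

def stream_and_log(cmd, log_file):
    """Echo each line as it arrives, copy it to log_file, and return the exit code."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                          bufsize=1, universal_newlines=True) as p:
        for line in p.stdout:
            # flush=True matters when our own stdout is a file or a pipe,
            # because it is then block-buffered rather than line-buffered.
            print(line, end="", flush=True)
            log_file.write(line)
    # Leaving the with block waits for the child, so returncode is set here.
    return p.returncode

log = io.StringIO()  # stand-in for an open log file
code = stream_and_log([sys.executable, "-c", "print('building'); print('done')"], log)
```

Reading p.returncode inside the with block, right after the for loop, is the mistake described in the last comments: at that point the child has closed its stdout but has not necessarily been waited for.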

@tokland

I tried your code and corrected it for Python 3.4 and Windows. dir.cmd is a simple dir command, saved as a cmd file.

import subprocess
c = "dir.cmd"

def execute(command):
    popen = subprocess.Popen(command, stdout=subprocess.PIPE, bufsize=1)
    # The pipe is binary here, so the EOF sentinel is b"".
    lines_iterator = iter(popen.stdout.readline, b"")
    while popen.poll() is None:
        for line in lines_iterator:
            nline = line.rstrip()
            print(nline.decode("latin"), end="\r\n", flush=True)  # yield line

execute(c)


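For anyone trying the answers to this question to get the stdout of a Python script, note that Python buffers its stdout, so it may take a while before the output shows up.

This can be rectified by adding the following after each stdout write in the target script:

sys.stdout.flush()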
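
When the target script cannot be edited, the comments earlier in this thread suggest the -u interpreter option instead; the PYTHONUNBUFFERED environment variable is the documented equivalent. The following sketch shows both from the parent side. It assumes the child is itself a Python program; CHILD_CODE and the stream_unbuffered helper are made up for the example.

```python
import os
import subprocess
import sys

# Hypothetical child body: prints without ever calling flush().
CHILD_CODE = (
    "import time\n"
    "for i in range(3):\n"
    "    print('tick', i)\n"
    "    time.sleep(0.1)\n"
)

def stream_unbuffered(code, use_flag=True):
    """Run a Python child unbuffered, either via -u or via PYTHONUNBUFFERED."""
    env = dict(os.environ)
    argv = [sys.executable]
    if use_flag:
        argv.append("-u")              # command-line switch
    else:
        env["PYTHONUNBUFFERED"] = "1"  # same effect through the environment
    argv += ["-c", code]

    lines = []
    with subprocess.Popen(argv, stdout=subprocess.PIPE,
                          universal_newlines=True, env=env) as p:
        for line in p.stdout:
            print(line, end="")        # arrives roughly every 0.1 s, not all at the end
            lines.append(line)
    return lines

stream_unbuffered(CHILD_CODE)
```

Neither option helps with non-Python children; those need their own mechanism, such as the --line-buffered style flags mentioned in the comments.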


In Python >= 3.5, using subprocess.run works for me:

import subprocess

cmd = 'echo foo; sleep 1; echo foo; sleep 2; echo foo'
subprocess.run(cmd, shell=True)

(Getting the output during execution also works without shell=True.) https://docs.python.org/3/library/subprocess.html#subprocess.run
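
The reason this streams is that run() does not capture anything unless asked to: the child inherits the parent's stdout and writes to it directly. A small sketch of the consequence, assuming Python 3.5+ and a made-up child command:

```python
import subprocess
import sys

# Hypothetical child command; its output goes straight to our terminal.
result = subprocess.run([sys.executable, "-c", "print('hello from child')"])

print(result.returncode)  # exit status of the child
print(result.stdout)      # None, because nothing was captured
```

So this approach fits when the output only needs to be displayed. If the parent also has to inspect or store the lines, one of the PIPE-based loops from the other answers is needed instead.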


In case someone wants to read from both stdout and stderr at the same time using threads, this is what I came up with:

# Written for Python 2 (Queue module, print statement).
import threading
import subprocess
import Queue
from time import sleep

class AsyncLineReader(threading.Thread):
    def __init__(self, fd, outputQueue):
        threading.Thread.__init__(self)

        assert isinstance(outputQueue, Queue.Queue)
        assert callable(fd.readline)

        self.fd = fd
        self.outputQueue = outputQueue

    def run(self):
        # Push every line onto the queue until readline returns '' (EOF).
        map(self.outputQueue.put, iter(self.fd.readline, ''))

    def eof(self):
        return not self.is_alive() and self.outputQueue.empty()

    @classmethod
    def getForFd(cls, fd, start=True):
        queue = Queue.Queue()
        reader = cls(fd, queue)

        if start:
            reader.start()

        return reader, queue

# command is the shell command string to run.
process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
(stdoutReader, stdoutQueue) = AsyncLineReader.getForFd(process.stdout)
(stderrReader, stderrQueue) = AsyncLineReader.getForFd(process.stderr)

# Keep checking queues until there is no more output.
while not stdoutReader.eof() or not stderrReader.eof():
    # Process all available lines from the stdout Queue.
    while not stdoutQueue.empty():
        line = stdoutQueue.get()
        print 'Received stdout: ' + repr(line)

        # Do stuff with stdout line.

    # Process all available lines from the stderr Queue.
    while not stderrQueue.empty():
        line = stderrQueue.get()
        print 'Received stderr: ' + repr(line)

        # Do stuff with stderr line.

    # Sleep for a short time to avoid excessive CPU use while waiting for data.
    sleep(0.05)

print "Waiting for async readers to finish..."
stdoutReader.join()
stderrReader.join()

# Close subprocess' file descriptors.
process.stdout.close()
process.stderr.close()

print "Waiting for process to exit..."
returnCode = process.wait()

if returnCode != 0:
    raise subprocess.CalledProcessError(returnCode, command)

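I just wanted to share this, since I ended up on this question trying to do something similar, but none of the answers solved my problem. Hopefully it helps someone!

Note that in my use case, an external process kills the process that we launched.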
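
The code above targets Python 2. For Python 3, the same thread-per-pipe idea might look like the sketch below. It is a reworking under assumptions rather than a port of the author's class: both readers share one queue, a None marker signals EOF for each stream, and the consumer blocks on the queue instead of sleeping. The names enqueue_lines, run_with_threads and CHILD_CODE are made up.

```python
import queue
import subprocess
import sys
import threading

# Hypothetical child: writes one line to each stream.
CHILD_CODE = "import sys; print('out line'); print('err line', file=sys.stderr)"

def enqueue_lines(stream, label, out_queue):
    """Reader thread body: forward each line, then a None marker at EOF."""
    for line in stream:
        out_queue.put((label, line))
    out_queue.put((label, None))

def run_with_threads(argv):
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            universal_newlines=True)
    q = queue.Queue()
    readers = [
        threading.Thread(target=enqueue_lines, args=(stream, label, q), daemon=True)
        for stream, label in ((proc.stdout, "stdout"), (proc.stderr, "stderr"))
    ]
    for t in readers:
        t.start()

    received = []
    open_streams = len(readers)
    while open_streams:
        label, line = q.get()          # blocks until some reader has data
        if line is None:
            open_streams -= 1          # that stream reached EOF
        else:
            print("Received %s: %r" % (label, line))
            received.append((label, line.rstrip("\n")))

    for t in readers:
        t.join()
    proc.stdout.close()
    proc.stderr.close()
    return proc.wait(), received

return_code, received = run_with_threads([sys.executable, "-c", CHILD_CODE])
```

Using a single queue keeps the consumer loop free of polling and sleep calls, at the cost of interleaving the two streams in arrival order.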


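This PoC constantly reads the output from a process, and the output can be accessed when needed. Only the last result is handed back and all other output is ignored, which prevents the pipe from filling up: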

# Written for Python 2 (Queue module, print statement).
import subprocess
import time
import threading
import Queue

class FlushPipe(object):
    def __init__(self):
        self.command = ['python', './print_date.py']
        self.process = None
        # LIFO queue: get() returns the most recently read line.
        self.process_output = Queue.LifoQueue(0)
        self.capture_output = threading.Thread(target=self.output_reader)

    def output_reader(self):
        for line in iter(self.process.stdout.readline, b''):
            self.process_output.put_nowait(line)

    def start_process(self):
        self.process = subprocess.Popen(self.command,
                                        stdout=subprocess.PIPE)
        self.capture_output.start()

    def get_output_for_processing(self):
        line = self.process_output.get()
        print ">>>" + line

if __name__ == "__main__":
    flush_pipe = FlushPipe()
    flush_pipe.start_process()

    now = time.time()
    while time.time() - now < 10:
        flush_pipe.get_output_for_processing()
        time.sleep(2.5)

    flush_pipe.capture_output.join(timeout=0.001)
    flush_pipe.process.kill()

print_date.py

#!/usr/bin/env python
import time

if __name__ == "__main__":
    while True:
        print str(time.time())
        time.sleep(0.01)

Output: you can clearly see that there is output only at ~2.5 s intervals, with nothing in between.

>>>1520535158.51
>>>1520535161.01
>>>1520535163.51
>>>1520535166.01
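
The LifoQueue above still stores every line it reads, so it grows for as long as the child runs. If only the newest line matters, a collections.deque with maxlen=1 bounds the memory instead. This is a Python 3 sketch of that variation, not the author's code; the LatestLine class and the child command are hypothetical, and the child is assumed to terminate on its own.

```python
import collections
import subprocess
import sys
import threading

class LatestLine:
    """Drain a child's stdout in a thread, remembering only the newest line."""

    def __init__(self, argv):
        # maxlen=1 makes each append discard the previous line.
        self._latest = collections.deque(maxlen=1)
        self._proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                                      universal_newlines=True)
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def _drain(self):
        for line in self._proc.stdout:
            self._latest.append(line.rstrip("\n"))

    def latest(self):
        """Return the most recent line, or None if nothing has arrived yet."""
        return self._latest[-1] if self._latest else None

    def close(self):
        """Wait for the child and the reader thread; return the exit code."""
        return_code = self._proc.wait()
        self._thread.join()
        self._proc.stdout.close()
        return return_code

watcher = LatestLine([sys.executable, "-c", "for i in range(100): print(i)"])
watcher.close()
print(">>>" + watcher.latest())
```

For a child that never exits, such as print_date.py above, close() would need to terminate the process first, as the original does with process.kill().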

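None of the answers here addressed all of my needs:

1. No threads for stdout (and no queues etc. either)

2. Non-blocking, since I need to check for other things going on

3. Use PIPE, since I need to do several things with the output, e.g. stream it, write it to a log file and return a string copy of it.

A little background: I am using a ThreadPoolExecutor to manage a pool of threads, each launching a subprocess, and running them concurrently (in Python 2.7, but this should work in newer 3.x as well). I don't want to use threads just for output gathering, since I want as many as possible available for other things (a pool of 20 processes would use 40 threads just to run: 1 for the process thread and 1 for stdout, and more if you want stderr, I guess).

I stripped out a lot of exception handling, so this is based on code that works in production. Hopefully I did not break it in the copy and paste. Also, feedback is very much welcome!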

import fcntl
import os
import subprocess
import sys
import time

# cmd is the command to run (for example a list of arguments).
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

# Make stdout non-blocking when using read/readline
proc_stdout = proc.stdout
fl = fcntl.fcntl(proc_stdout, fcntl.F_GETFL)
fcntl.fcntl(proc_stdout, fcntl.F_SETFL, fl | os.O_NONBLOCK)

def handle_stdout(proc_stream, my_buffer, echo_streams=True, log_file=None):
    """A little inline function to handle the stdout business."""
    # fcntl makes readline non-blocking so it raises an IOError when empty
    try:
        for s in iter(proc_stream.readline, ''):  # replace '' with b'' for Python 3
            my_buffer.append(s)

            if echo_streams:
                sys.stdout.write(s)

            if log_file:
                log_file.write(s)
    except IOError:
        pass

# The main loop while subprocess is running
stdout_parts = []
while proc.poll() is None:
    handle_stdout(proc_stdout, stdout_parts)

    # ...Check for other things here...
    # For example, check a multiprocessing.Value('b') to proc.kill()

    time.sleep(0.01)

# Not sure if this is needed, but run it again just to be sure we got it all?
handle_stdout(proc_stdout, stdout_parts)

stdout_str = "".join(stdout_parts)  # Just to demo

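I am sure there is extra overhead here, but it is not a concern in my case. Functionally it does what I need. The only thing I have not solved is why this works perfectly for log messages, while some print messages show up later and all at once.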
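
A related way to meet the same three requirements, without switching the pipe to non-blocking mode, is to wait on it with the standard selectors module using a short timeout and to read raw bytes with os.read only when data is ready. This is a sketch of that alternative and not the author's production code. Like the fcntl version it assumes a POSIX system, since selecting on pipes is not supported on Windows. The poll_output helper, its tick parameter and the child command are made up.

```python
import os
import selectors
import subprocess
import sys

def poll_output(argv, tick=0.05):
    """Stream a child's merged output while staying free to do other work."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    sel = selectors.DefaultSelector()
    sel.register(proc.stdout, selectors.EVENT_READ)

    chunks = []
    idle_ticks = 0
    while True:
        if sel.select(timeout=tick):
            # Data (or EOF) is ready, so this read returns without blocking.
            data = os.read(proc.stdout.fileno(), 4096)
            if not data:               # empty read means EOF: the child closed the pipe
                break
            chunks.append(data)
            sys.stdout.write(data.decode(errors="replace"))
            sys.stdout.flush()
        else:
            # Nothing to read within this tick: a place to check other things.
            idle_ticks += 1

    sel.close()
    proc.stdout.close()
    return proc.wait(), b"".join(chunks), idle_ticks

CHILD = [sys.executable, "-c",
         "import time; print('a', flush=True); time.sleep(0.2); print('b', flush=True)"]
return_code, output, idle_ticks = poll_output(CHILD)
```

The late, all-at-once print messages described above are consistent with the child buffering its own stdout when writing to a pipe, which is the behaviour discussed earlier in this thread; the reader side cannot see data the child has not flushed.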

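To answer the original question, the best way IMO is simply to redirect the subprocess stdout directly to your program's stdout (optionally, the same can be done for stderr, as in the example below):

import sys
from subprocess import Popen

p = Popen(cmd, stdout=sys.stdout, stderr=sys.stderr)
p.communicate()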


This works at least in Python 3.4:

import subprocess

process = subprocess.Popen(cmd_list, stdout=subprocess.PIPE)
for line in process.stdout:
    print(line.decode().strip())


In Python 3.6 I used this:

import subprocess

cmd = "command"
# call() lets the child write directly to the terminal and returns its exit code.
output = subprocess.call(cmd, shell=True)
print(output)
