live output from subprocess command
I'm using a Python script as a driver for a fluid dynamics code. When it comes time to run the simulation, I use `subprocess.Popen` to run the code and collect the output from stdout and stderr into a `subprocess.PIPE`, so that I can print (and save to a log file) the output and check for any errors.

Is there a way to both store the output (for logging and error checking) and also produce a live, streaming output?

The relevant section of my code:
```
ret_val = subprocess.Popen(run_command,
                           stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE,
                           shell=True)
output, errors = ret_val.communicate()
log_file.write(output)
print output
if ret_val.returncode:
    print "RUN failed\n%s\n" % (errors)
    success = False

if errors:
    log_file.write("\n%s\n" % errors)
```
Originally I was piping the `run_command` through `tee` so that a copy went directly to the log file while the stream still printed directly to the terminal -- but that way I can't store any errors (to my knowledge).

Edit:

My temporary solution:
```
ret_val = subprocess.Popen(run_command, stdout=log_file,
                           stderr=subprocess.PIPE, shell=True)
while not ret_val.poll():
    log_file.flush()
```
then, in another terminal, run `tail -f log.txt` (where `log_file = 'log.txt'`).
You have two ways of doing this: either by creating an iterator from the `read` or `readline` functions and doing:
```
import subprocess
import sys

with open('test.log', 'w') as f:  # replace 'w' with 'wb' for Python 3
    process = subprocess.Popen(your_command, stdout=subprocess.PIPE)
    for c in iter(lambda: process.stdout.read(1), ''):  # replace '' with b'' for Python 3
        sys.stdout.write(c)
        f.write(c)
```
or
```
import subprocess
import sys

with open('test.log', 'w') as f:  # replace 'w' with 'wb' for Python 3
    process = subprocess.Popen(your_command, stdout=subprocess.PIPE)
    for line in iter(process.stdout.readline, ''):  # replace '' with b'' for Python 3
        sys.stdout.write(line)
        f.write(line)
```
Or you can create a reader and a writer file, pass the writer to the `Popen`, and read from the reader:
```
import io
import time
import subprocess
import sys

filename = 'test.log'
with io.open(filename, 'wb') as writer, io.open(filename, 'rb', 1) as reader:
    process = subprocess.Popen(command, stdout=writer)
    while process.poll() is None:
        sys.stdout.write(reader.read())
        time.sleep(0.5)
    # Read the remaining
    sys.stdout.write(reader.read())
```
This way you will have the data written in `test.log` as well as streamed to standard output.

The only advantage of the file approach is that your code doesn't block, so you can do whatever you want in the meantime and read from the reader whenever you want, in a non-blocking way. When you use `PIPE`, the `read` and `readline` functions will block until either one character or one line, respectively, has been written to the pipe.
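To make the non-blocking point concrete, here is a small sketch of my own (not from the answer above; POSIX-only, Python 3.5+). It flips the pipe's file descriptor into non-blocking mode with `os.set_blocking`, so the parent can poll for output while remaining free to do other work. The inline child script is just a stand-in for a real command:

```python
import os
import subprocess
import sys
import time

# Stand-in child process that produces output after a short delay.
proc = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(0.2); print('done')"],
    stdout=subprocess.PIPE,
)
fd = proc.stdout.fileno()
os.set_blocking(fd, False)  # reads now return immediately instead of waiting

collected = b""
while proc.poll() is None:
    try:
        collected += os.read(fd, 1024)   # raises BlockingIOError if no data yet
    except BlockingIOError:
        pass                             # nothing to read; free to do other work
    time.sleep(0.05)

# The child has exited; drain whatever is still sitting in the pipe.
while True:
    chunk = os.read(fd, 1024)
    if not chunk:
        break
    collected += chunk

print(collected.decode().strip())
```

The same loop could write each chunk to a log file as it arrives, which gives the tee-like behavior the question asks for without ever blocking the driver script.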
Executive summary (or "tl;dr" version): it's easy when there's at most one `subprocess.PIPE`, otherwise it's hard.

It may be time to explain a bit about how `subprocess.Popen` does its thing.

(Caveat: this is for Python 2.x, although 3.x is similar; and I'm quite fuzzy on the Windows variant. I understand the POSIX stuff much better.)

For each of `stdin`, `stdout`, and `stderr`, you can supply: `None` (no redirection), `subprocess.PIPE`, an existing file descriptor or file object, or (for `stderr` only) `subprocess.STDOUT`.

The easiest case (no pipes)

If you redirect nothing (leave all three as the default `None` values), `Popen` has it quite easy: it just needs to let the subprocess run.

The still-easy case: one pipe

If you redirect only one stream, `Popen` still has things pretty easy.

Suppose you want to supply some `stdin`, but let `stdout` and `stderr` go un-redirected. Then you just need to deliver the data down the pipe:
```
proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
proc.stdin.write('here, have some data\n')  # etc
```
Or you can pass the stdin data to `proc.communicate()`, which then handles the writing for you.
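As a minimal illustration of the `communicate()` route (my sketch; the inline child command is a stand-in that simply echoes its stdin back):

```python
import subprocess
import sys

# Child that copies its stdin to its stdout, standing in for a real command.
proc = subprocess.Popen(
    [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE,
)

# communicate() writes the data, closes stdin, and reads until EOF.
out, _ = proc.communicate(input=b"here, have some data\n")
print(out.decode(), end="")  # prints: here, have some data
```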
Suppose instead that you want to capture `stdout` but leave `stdin` and `stderr` alone. Again, it's easy: since `proc.stdout` is a normal Python I/O stream, you can use all the normal constructs on it, such as:
```
for line in proc.stdout:
```
Or, once again, you can use `proc.communicate()`, which simply does the `read()` for you.

If you want to capture only `stderr`, it works the same as with `stdout`.

There's one more trick before things get hard. Suppose you want to capture `stdout`, and also capture `stderr`, but on the same pipe as stdout:
```
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
```
In this case, `subprocess` "cheats": it starts the child with both its stdout and its stderr directed into the single pipe descriptor that feeds back to the parent, so the parent once again has just one stream to read.
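A quick sketch (my addition) of what that merge looks like from the parent's side: `communicate()` returns both streams' bytes on the stdout side, and `None` for stderr, since no separate stderr pipe exists. The inline child script is just for illustration:

```python
import subprocess
import sys

# Child that writes one line to each stream.
cmd = [sys.executable, "-c",
       "import sys; print('out'); print('err', file=sys.stderr)"]
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

out, err = proc.communicate()
print(err)                           # None -- there is no separate stderr pipe
print(sorted(out.decode().split()))  # both lines arrive on the stdout pipe
```

Note that the relative order of the two lines in `out` depends on the child's buffering, which is why the example sorts them before printing.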
The hard cases: two or more pipes

The problems all occur when you want to use at least two pipes. In fact, the `subprocess` code itself has this bit:
```
def communicate(self, input=None):
    ...
    # Optimization: If we are only using one pipe, or no pipe at
    # all, using select() or threads is unnecessary.
    if [self.stdin, self.stdout, self.stderr].count(None) >= 2:
```
But, alas, here we've made at least two, and maybe three, different pipes, so the `count(None)` returns either 1 or 0. We must do things the hard way.

On Windows, this uses `threading.Thread` to accumulate results for `self.stdout` and `self.stderr`, and has the parent thread deliver `self.stdin` input data (and then close the pipe).

On POSIX, this uses `poll` if available, otherwise `select`, to accumulate output and deliver stdin input. All this runs in the (single) parent process/thread.

Threads or poll/select are needed here to avoid deadlock. Suppose, for instance, that we've redirected all three streams to three separate pipes. Suppose further that there's a small limit on how much data can be stuffed into a pipe before the writing process is suspended, waiting for the reading process to "clean out" the pipe from the other end. Let's set that small limit to a single byte, just for illustration. (This is in fact how things work, except that the limit is much bigger than one byte.)

If the parent (Python) process tries to write several bytes -- say, 'go\n' -- to `proc.stdin`, the first byte goes in, and then the second causes the Python process to suspend, waiting for the subprocess to read the first byte, emptying the pipe.

Meanwhile, suppose the subprocess decides to print a friendly "Hello! Don't Panic!" greeting. The H goes into its stdout pipe, but the e causes it to suspend, waiting for its parent to read that H, emptying the stdout pipe.

Now we're stuck: the Python process is asleep, waiting to finish saying "go", and the subprocess is also asleep, waiting to finish saying "Hello! Don't Panic!".

The `subprocess.Popen` code avoids this problem with its threading-or-select/poll machinery. If you want to read from `stdout` and `stderr` on two different pipes (regardless of any `stdin` redirection), you must avoid deadlock the same way.
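On Python 3, the select-style approach can be sketched with the standard `selectors` module (POSIX; this is my example, not from the original answer, and the inline child script is a stand-in). Both pipes are watched together, so neither can fill up unnoticed while we wait on the other:

```python
import selectors
import subprocess
import sys

# Stand-in child that writes one line to each stream.
cmd = [sys.executable, "-c",
       "import sys; sys.stdout.write('out\\n'); sys.stderr.write('err\\n')"]
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

sel = selectors.DefaultSelector()
sel.register(proc.stdout, selectors.EVENT_READ, 'stdout')
sel.register(proc.stderr, selectors.EVENT_READ, 'stderr')

captured = {'stdout': b'', 'stderr': b''}
while sel.get_map():                     # loop until both pipes hit EOF
    for key, _ in sel.select():
        data = key.fileobj.read1(1024)   # read only what is available now
        if not data:                     # EOF on this pipe
            sel.unregister(key.fileobj)
            continue
        captured[key.data] += data

proc.wait()
print(captured['stdout'].decode().strip())
print(captured['stderr'].decode().strip())
```

Each chunk could be echoed to the terminal and written to a log file as it arrives, which is exactly the live-plus-stored behavior the question asks about.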
Demo

I promised to demonstrate that, un-redirected, Python subprocesses write to the underlying stdout, not `sys.stdout`. So, here is some code:
```
from cStringIO import StringIO
import os
import subprocess
import sys

def show1():
    print 'start show1'
    save = sys.stdout
    sys.stdout = StringIO()
    print 'sys.stdout being buffered'
    proc = subprocess.Popen(['echo', 'hello'])
    proc.wait()
    in_stdout = sys.stdout.getvalue()
    sys.stdout = save
    print 'in buffer:', in_stdout

def show2():
    print 'start show2'
    save = sys.stdout
    sys.stdout = open(os.devnull, 'w')
    print 'after redirect sys.stdout'
    proc = subprocess.Popen(['echo', 'hello'])
    proc.wait()
    sys.stdout = save

show1()
show2()
```
When run:
```
$ python out.py
start show1
hello
in buffer: sys.stdout being buffered

start show2
hello
```
Note that the first routine will fail if you add `stdout=sys.stdout` to the `Popen` call, since a `StringIO` object has no `fileno`. The second will omit the hello if you add `stdout=sys.stdout`, since `sys.stdout` has been redirected to `os.devnull`.

(If you redirect Python's file-descriptor-1, the subprocess will follow that redirection. The `open(os.devnull, 'w')` call produces a stream whose `fileno()` is greater than 2.)
We can also use the default file iterator for reading stdout, instead of using the `iter` construct with `readline()`:
```
import subprocess
import sys

process = subprocess.Popen(your_command, stdout=subprocess.PIPE)
for line in process.stdout:
    sys.stdout.write(line)
```
If you're able to use third-party libraries, you might be able to use something like `sarge` (disclosure: I'm its maintainer). This library allows non-blocking access to subprocess output streams.

A good but "heavyweight" solution is to use Twisted - see the bottom.

If you're willing to live with only stdout, something like this should work:
```
import subprocess
import sys

popenobj = subprocess.Popen(["ls", "-Rl"], stdout=subprocess.PIPE)
while not popenobj.poll():
    stdoutdata = popenobj.stdout.readline()
    if stdoutdata:
        sys.stdout.write(stdoutdata)
    else:
        break
print "Return code", popenobj.returncode
```
(If you use `read()`, it tries to read the entire "file", which isn't useful here; what we really could use is something that reads all the data that's in the pipe right now.)

One might also try to approach this with threading, e.g.:
```
import subprocess
import sys
import threading

popenobj = subprocess.Popen("ls", stdout=subprocess.PIPE, shell=True)

def stdoutprocess(o):
    while True:
        stdoutdata = o.stdout.readline()
        if stdoutdata:
            sys.stdout.write(stdoutdata)
        else:
            break

t = threading.Thread(target=stdoutprocess, args=(popenobj,))
t.start()
popenobj.wait()
t.join()
print "Return code", popenobj.returncode
```
Now we could potentially add stderr as well by using two threads.
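Sketching that two-thread variant (my addition: one drain thread per pipe, with the inline child command standing in for a real one):

```python
import subprocess
import sys
import threading

# Stand-in child that writes one line to each stream.
cmd = [sys.executable, "-c",
       "import sys; sys.stdout.write('to stdout\\n'); sys.stderr.write('to stderr\\n')"]
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

lines = []
lock = threading.Lock()

def drain(pipe, label):
    # Each thread blocks on its own pipe, so neither can deadlock the other.
    for line in iter(pipe.readline, b''):
        with lock:
            lines.append((label, line))

threads = [threading.Thread(target=drain, args=(proc.stdout, 'stdout')),
           threading.Thread(target=drain, args=(proc.stderr, 'stderr'))]
for t in threads:
    t.start()
proc.wait()
for t in threads:
    t.join()

for label, line in lines:
    print(label, line.decode().strip())
```

Each `drain` call could also write to a shared log file (under the lock) to get logging and live printing at once.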
Note, however, that the subprocess documentation discourages using these files directly and recommends `communicate()` (mostly because of deadlocks, which I think is the issue described above), so the solutions here are a little clunky. It really seems like the `subprocess` module isn't quite up to the job, and we need to look at something else.
A more involved solution is to use Twisted, as shown here: https://twistedmatrix.com/documents/11.1.0/core/howto/process.html

The way you do it with Twisted is to create your process using `reactor.spawnProcess()`, providing a `ProcessProtocol` that then processes the output asynchronously.
All of the above solutions I tried failed either to separate stderr and stdout output (multiple pipes), or blocked forever when the OS pipe buffer was full, which happens when the command you run produces output too fast (there is a warning about this in the Python subprocess manual under `poll()`). The only reliable way I found was through `select`, but this is a POSIX-only solution:
```
import subprocess
import sys
import os
import select
from errno import EINTR

# returns command exit status, stdout text, stderr text
# rtoutput: show realtime output while running
def run_script(cmd, rtoutput=0):
    p = subprocess.Popen(cmd, shell=True,
                         stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    poller = select.poll()
    poller.register(p.stdout, select.POLLIN)
    poller.register(p.stderr, select.POLLIN)

    coutput = ''
    cerror = ''
    fdhup = {}
    fdhup[p.stdout.fileno()] = 0
    fdhup[p.stderr.fileno()] = 0
    while sum(fdhup.values()) < len(fdhup):
        try:
            r = poller.poll(1)
        except select.error, err:
            if err.args[0] != EINTR:
                raise
            r = []
        for fd, flags in r:
            if flags & (select.POLLIN | select.POLLPRI):
                c = os.read(fd, 1024)
                if rtoutput:
                    sys.stdout.write(c)
                    sys.stdout.flush()
                if fd == p.stderr.fileno():
                    cerror += c
                else:
                    coutput += c
            else:
                fdhup[fd] = 1
    return p.poll(), coutput.strip(), cerror.strip()
```
Why not set `stdout` directly to `sys.stdout`? And if you need to write to a log as well, then you can simply override the write method of f:
```
import sys
import subprocess

class SuperFile(open.__class__):

    def write(self, data):
        sys.stdout.write(data)
        super(SuperFile, self).write(data)

f = SuperFile("log.txt", "w+")
process = subprocess.Popen(command, stdout=f, stderr=f)
```
It looks like line-buffered output will work for you, in which case something like the following might suit. (Caveat: it's untested.) This will only give the subprocess's stdout in real time. If you want both stderr and stdout in real time, you'll have to do something more complex with `select`.
```
proc = subprocess.Popen(run_command, stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE, shell=True)

while proc.poll() is None:
    line = proc.stdout.readline()
    print line
    log_file.write(line + '\n')

# Might still be data on stdout at this point.  Grab any
# remainder.
for line in proc.stdout.read().split('\n'):
    print line
    log_file.write(line + '\n')

# Do whatever you want with proc.stderr here...
```
Solution 1: Log stdout AND stderr concurrently, in real time

A simple solution which logs both stdout and stderr concurrently, line by line, in real time into a log file.
```
import subprocess as sp
from concurrent.futures import ThreadPoolExecutor


def log_popen_pipe(p, pipe_name):
    while p.poll() is None:
        line = getattr(p, pipe_name).readline()
        log_file.write(line)


with sp.Popen(my_cmd, stdout=sp.PIPE, stderr=sp.PIPE, text=True) as p:
    with ThreadPoolExecutor(2) as pool:
        r1 = pool.submit(log_popen_pipe, p, 'stdout')
        r2 = pool.submit(log_popen_pipe, p, 'stderr')
        r1.result()
        r2.result()
```
Solution 2: A function read_popen_pipes() that lets you iterate over both pipes (stdout/stderr) concurrently, in real time

Here we create the function `read_popen_pipes()` for your use:
```
import subprocess as sp
from queue import Queue, Empty
from concurrent.futures import ThreadPoolExecutor


def enqueue_output(file, queue):
    for line in iter(file.readline, ''):
        queue.put(line)
    file.close()


def read_popen_pipes(p):

    with ThreadPoolExecutor(2) as pool:
        q_stdout, q_stderr = Queue(), Queue()

        pool.submit(enqueue_output, p.stdout, q_stdout)
        pool.submit(enqueue_output, p.stderr, q_stderr)

        while True:
            out_line = err_line = ''

            try:
                out_line = q_stdout.get_nowait()
            except Empty:
                pass
            try:
                err_line = q_stderr.get_nowait()
            except Empty:
                pass

            yield (out_line, err_line)

            if p.poll() is not None:
                break


with sp.Popen(my_cmd, stdout=sp.PIPE, stderr=sp.PIPE, text=True) as p:
    for out_line, err_line in read_popen_pipes(p):
        print(out_line, end='')
        print(err_line, end='')
    return_code = p.poll()
```
None of the Pythonic solutions worked for me.

It turned out that `proc.stdout.read()` and similar approaches may block forever.

Therefore, I use `tee` like this:
```
subprocess.run('./my_long_running_binary 2>&1 | tee -a my_log_file.txt && exit ${PIPESTATUS}',
               shell=True, check=True, executable='/bin/bash')
```
This solution is convenient if you are already using `subprocess.run` anyway.

If I omit the `&& exit ${PIPESTATUS}` part, the command always reports success, because the pipeline's exit status is that of `tee` (always 0), even when the binary itself fails.

`unbuffer` can be necessary to flush everything into the log file immediately; however, `unbuffer` swallows the exit status of assertions (SIGABRT)...
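The `${PIPESTATUS}` detail can be verified with a tiny sketch of my own (bash-only; `false` stands in for a failing binary and `/dev/null` for the log file):

```python
import subprocess

# Without propagating PIPESTATUS, the failure of `false` is masked:
# the pipeline's status is tee's exit code, which is 0.
naive = subprocess.run('false | tee /dev/null',
                       shell=True, executable='/bin/bash')
print(naive.returncode)   # 0

# ${PIPESTATUS} (first element) holds the exit status of the first
# command in the pipeline, so it can be re-raised explicitly.
fixed = subprocess.run('false | tee /dev/null && exit ${PIPESTATUS}',
                       shell=True, executable='/bin/bash')
print(fixed.returncode)   # 1
```

`check=True` is left out here on purpose, since the second call is expected to fail; with `check=True` it would raise `CalledProcessError` instead.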
I think that the `subprocess.communicate` method is somewhat misleading: it actually fills the stdout and stderr that you specify in `subprocess.Popen`.

Yet, reading from the `subprocess.PIPE` that you can supply to `subprocess.Popen`'s stdout and stderr parameters will eventually fill up OS pipe buffers and deadlock your app (especially if you have multiple processes/threads that must use `subprocess`).

My proposed solution is to provide the stdout and stderr with files - and to read the files' content instead of reading from the deadlocking `PIPE`. These files can be `tempfile.NamedTemporaryFile()`, which can also be read from while they are being written into by `subprocess.communicate`.

Below is a sample usage:
```
try:
    with ProcessRunner(('python', 'task.py'),
                       env=os.environ.copy(),
                       seconds_to_wait=0.01) as process_runner:
        for out in process_runner:
            print(out)

except ProcessError as e:
    print(e.error_message)
    raise
```
And this is the source code, with as many comments as I could provide to explain what it does:

If you're using Python 2, please make sure to first install the latest version of the subprocess32 package from PyPI.
```
import os
import sys
import threading
import time
import tempfile
import logging

if os.name == 'posix' and sys.version_info[0] < 3:
    # Support python 2
    import subprocess32 as subprocess
else:
    # Get latest and greatest from python 3
    import subprocess

logger = logging.getLogger(__name__)


class ProcessError(Exception):
    """Base exception for errors related to running the process"""


class ProcessTimeout(ProcessError):
    """Error that will be raised when the process execution will exceed a timeout"""


class ProcessRunner(object):
    def __init__(self, args, env=None, timeout=None, bufsize=-1,
                 seconds_to_wait=0.25, **kwargs):
        """
        Constructor facade to subprocess.Popen that receives parameters which
        are more specifically required for the Process Runner. This is a class
        that should be used as a context manager - and that provides an
        iterator for reading captured output from subprocess.communicate in
        near realtime.

        Example usage:

        try:
            with ProcessRunner(('python', task_file_path), env=os.environ.copy(),
                               seconds_to_wait=0.01) as process_runner:
                for out in process_runner:
                    print(out)
        except ProcessError as e:
            print(e.error_message)
            raise

        :param args: same as subprocess.Popen
        :param env: same as subprocess.Popen
        :param timeout: same as subprocess.communicate
        :param bufsize: same as subprocess.Popen
        :param seconds_to_wait: time to wait between each readline from the temporary file
        :param kwargs: same as subprocess.Popen
        """
        self._seconds_to_wait = seconds_to_wait
        self._process_has_timed_out = False
        self._timeout = timeout
        self._process_done = False
        self._std_file_handle = tempfile.NamedTemporaryFile()
        self._process = subprocess.Popen(args, env=env, bufsize=bufsize,
                                         stdout=self._std_file_handle,
                                         stderr=self._std_file_handle, **kwargs)
        self._thread = threading.Thread(target=self._run_process)
        self._thread.daemon = True

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self._thread.join()
        self._std_file_handle.close()

    def __iter__(self):
        # read all output from stdout file that subprocess.communicate fills
        with open(self._std_file_handle.name, 'r') as stdout:
            # while process is alive, keep reading data
            while not self._process_done:
                out = stdout.readline()
                out_without_trailing_whitespaces = out.rstrip()
                if out_without_trailing_whitespaces:
                    # yield stdout data without trailing newline
                    yield out_without_trailing_whitespaces
                else:
                    # if there is nothing to read, then please wait a tiny little bit
                    time.sleep(self._seconds_to_wait)

            # this is a hack: terraform seems to write to buffer after process has finished
            out = stdout.read()
            if out:
                yield out

        if self._process_has_timed_out:
            raise ProcessTimeout('Process has timed out')

        if self._process.returncode != 0:
            raise ProcessError('Process has failed')

    def _run_process(self):
        try:
            # Start gathering information (stdout and stderr) from the opened process
            self._process.communicate(timeout=self._timeout)
            # Graceful termination of the opened process
            self._process.terminate()
        except subprocess.TimeoutExpired:
            self._process_has_timed_out = True
            # Force termination of the opened process
            self._process.kill()

        self._process_done = True

    @property
    def return_code(self):
        return self._process.returncode
```
Based on all the above, I suggest a slightly modified version (python3):

- while loop calling readline (the suggested iter solution seemed to block forever for me - Python 3, Windows 7)
- structured so that handling of the read data doesn't need to be duplicated after poll returns not-None
- stderr piped into stdout, so both output streams are read
- added code to get the exit value of cmd

Code:
```
import subprocess
import time

proc = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE,
                        stderr=subprocess.STDOUT, universal_newlines=True)
while True:
    rd = proc.stdout.readline()
    print(rd, end='')  # and whatever you want to do...
    if not rd:  # EOF
        returncode = proc.poll()
        if returncode is not None:
            break
        time.sleep(0.1)  # cmd closed stdout, but not exited yet

# You may want to check on returncode here
```
Similar to previous answers, but the following solution worked for me on Windows, using Python 3, as a common method to print and log in real time (getting-realtime-output-using-python):
```
import subprocess


def print_and_log(command, logFile):
    with open(logFile, 'wb') as f:
        command = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)

        while True:
            output = command.stdout.readline()
            if not output and command.poll() is not None:
                f.close()
                break
            if output:
                f.write(output)
                print(str(output.strip(), 'utf-8'), flush=True)

    return command.poll()
```
In addition to all these answers, one simple approach could also be as follows:
```
import subprocess

process = subprocess.Popen(your_command, stdout=subprocess.PIPE)

while process.stdout.readable():
    line = process.stdout.readline()
    if not line:
        break
    print(line.strip())
```
Loop through the stream as long as it's readable, and stop when it gets an empty result.

The key here is that `readline()` returns a line (with `\n` at the end) as long as there is output, and an empty string when the stream is really at its end.

Hope this helps someone.
Here's a class that I'm using in one of my projects. It redirects the output of a subprocess to a log. At first I tried simply overwriting the write method, but that doesn't work, as the subprocess never calls it (redirection happens at the file-descriptor level). So I'm using my own pipe, similar to how it's done in the subprocess module. This has the advantage of encapsulating all logging/printing logic in the adapter, and you can simply pass an instance of the adapter to `Popen`:
```
import logging
import os
import threading


class LogAdapter(threading.Thread):

    def __init__(self, logname, level=logging.INFO):
        super().__init__()
        self.log = logging.getLogger(logname)
        self.readpipe, self.writepipe = os.pipe()

        logFunctions = {
            logging.DEBUG: self.log.debug,
            logging.INFO: self.log.info,
            logging.WARN: self.log.warning,
            logging.ERROR: self.log.error,
        }

        try:
            self.logFunction = logFunctions[level]
        except KeyError:
            self.logFunction = self.log.info

    def fileno(self):
        # when fileno is called this indicates the subprocess is about to fork => start the reader thread
        self.start()
        return self.writepipe

    def finished(self):
        """If the write file descriptor is not closed, this thread will
        prevent the whole program from exiting. You can use this method
        to clean up after the subprocess has terminated."""
        os.close(self.writepipe)

    def run(self):
        inputFile = os.fdopen(self.readpipe)

        while True:
            line = inputFile.readline()

            if len(line) == 0:
                # no new data was added
                break

            self.logFunction(line.strip())
```
If you don't need logging but simply want to use `print()`, you can obviously strip out large parts of the code and keep the class shorter.
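For instance, a stripped-down, print-only variant of that idea (my own sketch, using the same `fileno()` trick so `Popen` starts the reader thread automatically) could look like this:

```python
import os
import subprocess
import sys
import threading


class PrintAdapter(threading.Thread):
    """Minimal pipe adapter: forwards child output to print()."""

    def __init__(self):
        super().__init__()
        self.readpipe, self.writepipe = os.pipe()
        self.lines = []

    def fileno(self):
        # Popen asks for the fd right before forking => start the reader thread
        self.start()
        return self.writepipe

    def finished(self):
        # close the parent's write end so the reader thread sees EOF
        os.close(self.writepipe)

    def run(self):
        with os.fdopen(self.readpipe) as f:
            for line in f:
                self.lines.append(line.strip())
                print(line.strip())


# Stand-in child command for illustration.
adapter = PrintAdapter()
proc = subprocess.Popen([sys.executable, '-c', "print('hello from child')"],
                        stdout=adapter)
proc.wait()
adapter.finished()
adapter.join()
```

The call order matters: `finished()` must come after the child exits, otherwise the reader thread never reaches EOF and `join()` hangs.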