Non-blocking read on a subprocess.PIPE in Python

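I'm using the subprocess module to start a subprocess and connect to its output stream (stdout). I want to be able to execute non-blocking reads on its stdout. Is there a way to make .readline non-blocking, or to check whether there is data on the stream before I invoke .readline? I'd like this to be portable, or at least to work under Windows and Linux.

Here is how I do it for now (it blocks on .readline if no data is available):

p = subprocess.Popen('myprogram.exe', stdout = subprocess.PIPE)
output_str = p.stdout.readline()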

Related discussion

fcntl, select and asyncproc won't help in this case.

A reliable way to read a stream without blocking, regardless of operating system, is to use Queue.get_nowait():

import sys
from subprocess import PIPE, Popen
from threading import Thread

try:
    from queue import Queue, Empty
except ImportError:
    from Queue import Queue, Empty  # python 2.x

ON_POSIX = 'posix' in sys.builtin_module_names

def enqueue_output(out, queue):
    # runs in the helper thread: readline() blocks here, not in the main thread
    for line in iter(out.readline, b''):
        queue.put(line)
    out.close()

p = Popen(['myprogram.exe'], stdout=PIPE, bufsize=1, close_fds=ON_POSIX)
q = Queue()
t = Thread(target=enqueue_output, args=(p.stdout, q))
t.daemon = True  # thread dies with the program
t.start()

# ... do other things here

# read line without blocking
try:
    line = q.get_nowait()  # or q.get(timeout=.1)
except Empty:
    print('no output yet')
else:  # got line
    pass  # ... do something with line
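
For anyone who wants to try the thread-and-queue idea without a real myprogram.exe, here is a minimal Python 3 sketch. The child command is a stand-in (the current interpreter printing one line), and the helper names start_reader and poll_line are made up for this example; they are not part of the answer above.

```python
import sys
from queue import Empty, Queue
from subprocess import PIPE, Popen
from threading import Thread


def start_reader(cmd):
    """Start cmd; return (process, queue). A daemon thread feeds the queue."""
    proc = Popen(cmd, stdout=PIPE)
    lines = Queue()

    def pump():
        # readline() blocks, but only inside this helper thread.
        with proc.stdout:
            for line in iter(proc.stdout.readline, b""):
                lines.put(line)

    Thread(target=pump, daemon=True).start()
    return proc, lines


def poll_line(lines, timeout=None):
    """Return the next line, or None if nothing has arrived (yet)."""
    try:
        if timeout is None:
            return lines.get_nowait()
        return lines.get(timeout=timeout)
    except Empty:
        return None


# Hypothetical stand-in child: prints one line, then exits.
CHILD = [sys.executable, "-c", "print('ready', flush=True)"]

proc, lines = start_reader(CHILD)
first = poll_line(lines, timeout=5.0)  # waits at most 5 seconds
proc.wait()
leftover = poll_line(lines)            # returns immediately
```

The main thread only ever touches the queue, so it is never stuck inside readline(); the timeout form is handy when you are willing to wait a little but not forever.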

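Related discussion

Yes, this works for me, though I removed a lot. It includes good practices, but they are not always necessary. The Python 3.x/2.x compat and close_fds may be omitted and it will still work. But be aware of what everything does and don't copy it blindly, even if it just works! (Actually the simplest solution is to use a thread and do a readline as Seb did. Queues are just an easy way to get the data; there are others. Threads are the answer!)
Inside the thread, the call to out.readline blocks the thread and the main thread, and I have to wait until readline returns before everything else continues. Is there an easy way around that? (I'm reading multiple lines from my process, which is also another .py file that's doing DB and things.)
@Justin: 'out.readline' doesn't block the main thread; it is executed in another thread.
close_fds is definitely not something you'd want to copy blindly into your application...
What if I fail to shut down the subprocess, e.g. due to exceptions? The stdout-reader thread won't die and Python will hang, even if the main thread exited, won't it? How could one work around this? Python 2.x doesn't support killing threads and, what's worse, doesn't support interrupting them. :( (Obviously one should handle the exceptions to ensure the subprocess is shut down, but in case it isn't, what can you do?)
@naxa: notice daemon=True: the Python process won't hang if the main thread exits.
I've created some friendly wrappers of this in the package shelljob pypi.python.org/pypi/shelljob
I launched the Popen in a separate thread, so I can "busy wait" within this thread using: .... t.start() while p.poll() == None: time.sleep(0.1). In this case my GUI won't block (I'm using Tkinter). So I can also use parent.after(100, self.consume) to simulate event-style polling. In the consume method I finally use the q.get() method to retrieve the data from the queue. Works like a charm! Although some people say that you can get a deadlock using wait() or poll() in combination with a stdout PIPE?
@danger89: yes. You can deadlock if your parent is blocked on .wait() but your child waits for you to read its output (not your case, but q.get() in a GUI callback is also incorrect). The simplest option is to use .communicate() in a separate thread. Here are a couple of code examples that show how to read output from a subprocess without "locking" the GUI: 1. with threads 2. no threads (POSIX). If something is unclear, ask.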
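What is the purpose of close_fds here?
@dashesy: to avoid leaking the parent's file descriptors. The demonstrated behavior is the default on Python 3.
@J.F.Sebastian: makes sense. This is the only solution that didn't block; there seems to be no other way in 2.7 (tried select, fcntl)! Any idea why I can't get the output from an sgid process? It works fine in a terminal, but here I get "no output yet", which should not be possible.
@dashesy: select and fcntl should work on POSIX systems. If you don't understand why "no output yet" is always possible with the given solution, ask a new question. Be sure to read How to Ask and stackoverflow.com/help/mcve
@J.F.Sebastian sorry for my desperate attempt, I thought you might already know the answer, you looked like you do :) I'll ask a new question next time. BTW, after using setbuf(stdout, NULL) in the executable, most things (even select and fcntl) work like a charm. I don't know why stdout was not flushed, but I wouldn't be surprised if it had something to do with suid and selinux.
@dashesy: if a subprocess' stdout is redirected to a pipe then it is block-buffered (it is line-buffered if stdout is a terminal (tty)) for stdio-based programs written in C. See Python C program subprocess hangs at "for it in iter"
@J.F.Sebastian I didn't know that; naively I always assumed the behavior seen in a tty, but now it makes sense. So it means smart applications should look at the type of stdout if what they produce is line-based and may be used in pipes / grep.
What is the reason for doing out.close() in enqueue_output? Shouldn't that be the job of the Popen object?
@zaphod: it is done for the same reason the with-statement is used for ordinary files: to avoid relying on hard-to-reason-about garbage collection to free the resources, even if p.stdout is closed on garbage collection. I prefer the explicit, simple and deterministic out.close() (though I should have used with out: in the thread; it seems to work on Python 2.7 and Python 3).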
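This solution doesn't work on many-core systems (16-48 core systems). The GIL kicks in on context switches. Better to use non-blocking IO, as described below.
@AntonMedvedev: 1. It doesn't matter how many CPUs there are. The question is about I/O (Python releases the GIL during blocking I/O operations). 2. At the time of the answer, there was no portable alternative to threads in the stdlib. Whether non-blocking IO is better than threads (unconditionally) is debatable. A sensible approach is to study the trade-offs in your specific case.
You write "Windows and Linux"; does that exclude OS X? I've just tried executing ffmpeg on OS X and it seems to have problems. I'll test this in more detail unless you tell me this shouldn't work on OS X.
@P.R.: the OP asks about these OSes, that is why they are mentioned explicitly. It works on OS X too.
@nights: "doesn't work for me" is not very informative. Create a minimal code example, describe in words what you expect to get and what you get instead, step by step, say what your OS and Python version are, and post it as a separate question.
Don't swap threading for multiprocessing here. I had a problem with this answer: terminating the child process caused the stdout cursor to move to the start of my screen.
@jfs Does non-blocking I/O have to be asynchronous?
@uzay95 I'm not sure what you are asking. The concepts are closely related. The answer shows how to implement a non-blocking read on top of a synchronous, blocking readline() call. The interface is also asynchronous (other things may happen while IO is performed).
I feel like I stumble across this post once a month or so... I still don't know why it claims that "fcntl won't help in this case". I must have implemented this 10-20 times by now using fcntl to set os.O_NONBLOCK (with os.read(), not readline()). It generally seems to work as expected.
@cheshirekow Does Windows support fcntl? Does readline() on Python 2 support non-blocking mode?
@jfs Not sure about Windows... do you know? Is that why you say it won't help? Also note that I said with os.read(), not readline().
os.read() is available on Windows, but os.O_NONBLOCK is documented as being Unix-specific, so I wouldn't expect it to work there.
When I try to get stderr data using the logic provided in the answer, I can't get it.
It looks like the stderr data also goes to stdout.
@Karthi1234 No, in the answer only stdout goes to the pipe; stderr does not go to a pipe here. Your code is something else.
Is there a reason the docs.python.org/3/library/asyncio-api-index.html library is not mentioned or used? Is it bad?
@CharlieParker scroll down to my other answer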

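I have often had a similar problem: Python programs I write frequently need to be able to execute some primary functionality while simultaneously accepting user input from the command line (stdin). Simply putting the user input handling in another thread doesn't solve the problem, because readline() blocks and has no timeout. If the primary functionality is complete and there is no longer any need to wait for further user input, I typically want my program to exit, but it can't, because readline() is still blocking in the other thread, waiting for a line. A solution I have found to this problem is to make stdin a non-blocking file using the fcntl module: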

import fcntl
import os
import sys

# make stdin a non-blocking file
fd = sys.stdin.fileno()
fl = fcntl.fcntl(fd, fcntl.F_GETFL)
fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)

# user input handling thread
while mainThreadIsRunning:
    try:
        input = sys.stdin.readline()
    except:
        continue
    handleInput(input)

In my opinion this is a bit cleaner than using the select or signal modules to solve this problem, but then again it only works on UNIX...
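
To see what the non-blocking flag actually changes, here is a small POSIX-only sketch on a plain os.pipe() instead of stdin. It assumes Python 3.5+, where os.set_blocking(fd, False) sets the same O_NONBLOCK flag as the fcntl calls; the helper name try_read is invented for this example.

```python
import os


def try_read(fd, max_bytes=4096):
    """Return the bytes readable right now, b'' at EOF, or None if no data yet."""
    try:
        return os.read(fd, max_bytes)
    except BlockingIOError:
        # O_NONBLOCK is set and the pipe is currently empty.
        return None


read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)   # same effect as F_SETFL with O_NONBLOCK

nothing_yet = try_read(read_fd)   # pipe is empty, writer still open
os.write(write_fd, b"hello\n")
data = try_read(read_fd)          # data is available now
os.close(write_fd)
at_eof = try_read(read_fd)        # writer closed and pipe drained
os.close(read_fd)
```

Note the three distinct outcomes: "nothing yet" is an exception, while end of stream is an empty read. Using os.read() rather than readline() keeps those two cases apart.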

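Related discussion

Python 3.4 introduces a new provisional API for asynchronous IO: the asyncio module.

The approach is similar to the twisted-based answer by @Bryan Ward: define a protocol, and its methods are called as soon as data is ready: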

#!/usr/bin/env python3
import asyncio
import os

class SubprocessProtocol(asyncio.SubprocessProtocol):
    def pipe_data_received(self, fd, data):
        if fd == 1:  # got stdout data (bytes)
            print(data)

    def connection_lost(self, exc):
        loop.stop()  # end loop.run_forever()

if os.name == 'nt':
    loop = asyncio.ProactorEventLoop()  # for subprocess' pipes on Windows
    asyncio.set_event_loop(loop)
else:
    loop = asyncio.get_event_loop()
try:
    loop.run_until_complete(loop.subprocess_exec(SubprocessProtocol,
        "myprogram.exe", "arg1", "arg2"))
    loop.run_forever()
finally:
    loop.close()

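See "Subprocess" in the docs.

There is a high-level interface, asyncio.create_subprocess_exec(), that returns Process objects and allows reading a line asynchronously using the StreamReader.readline() coroutine (with async/await Python 3.5+ syntax):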

#!/usr/bin/env python3.5
import asyncio
import locale
import sys
from asyncio.subprocess import PIPE
from contextlib import closing

async def readline_and_kill(*args):
    # start child process
    process = await asyncio.create_subprocess_exec(*args, stdout=PIPE)

    # read line (sequence of bytes ending with b'\n') asynchronously
    async for line in process.stdout:
        print("got line:", line.decode(locale.getpreferredencoding(False)))
        break
    process.kill()
    return await process.wait()  # wait for the child process to exit

if sys.platform == "win32":
    loop = asyncio.ProactorEventLoop()
    asyncio.set_event_loop(loop)
else:
    loop = asyncio.get_event_loop()

with closing(loop):
    sys.exit(loop.run_until_complete(readline_and_kill(
        "myprogram.exe", "arg1", "arg2")))

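readline_and_kill() performs the following tasks:

start the subprocess, redirecting its stdout to a pipe
read a line from the subprocess' stdout asynchronously
kill the subprocess
wait for it to exit

Each step could be limited by a timeout in seconds if necessary.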
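
One way the timeout could look, as a sketch only: wrap the readline() step in asyncio.wait_for(). This assumes Python 3.7+ (for asyncio.run); the function name first_line and the two stand-in child commands are invented for the example.

```python
import asyncio
import sys
from asyncio.subprocess import PIPE


async def first_line(*args, timeout=5.0):
    """Return the child's first stdout line, or None if it does not arrive in time."""
    process = await asyncio.create_subprocess_exec(*args, stdout=PIPE)
    try:
        return await asyncio.wait_for(process.stdout.readline(), timeout)
    except asyncio.TimeoutError:
        return None
    finally:
        try:
            process.kill()          # the child may already be gone
        except ProcessLookupError:
            pass
        await process.wait()        # always reap the child


# Hypothetical stand-in children: one answers at once, one stays silent.
FAST = [sys.executable, "-c", "print('ready', flush=True)"]
SLOW = [sys.executable, "-c", "import time; time.sleep(30)"]

fast_result = asyncio.run(first_line(*FAST))
slow_result = asyncio.run(first_line(*SLOW, timeout=0.5))
```

The same wait_for() wrapper can be put around process.wait() or any other step that might hang.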

Related discussion

Try the asyncproc module. For example:

import os
from asyncproc import Process
myProc = Process("myprogram.app")

while True:
    # check to see if process has ended
    poll = myProc.wait(os.WNOHANG)
    if poll is not None:
        break
    # print any new output
    out = myProc.read()
    if out != "":
        print(out)

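This module takes care of all the threading, as suggested by S.Lott.

Related discussion

You can do this really easily in Twisted. Depending on your existing code base, this might not be that easy to use, but if you are building a Twisted application then things like this become almost trivial. You create a ProcessProtocol class and override the outReceived() method. Twisted (depending on the reactor used) is usually just a big select() loop with callbacks installed to handle data from different file descriptors (often network sockets). So the outReceived() method simply installs a callback for handling data coming from STDOUT. A simple example demonstrating this behavior is as follows: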

from twisted.internet import protocol, reactor

class MyProcessProtocol(protocol.ProcessProtocol):

    def outReceived(self, data):
        print(data)

proc = MyProcessProtocol()
reactor.spawnProcess(proc, './myprogram', ['./myprogram', 'arg1', 'arg2', 'arg3'])
reactor.run()

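The Twisted documentation has some good information on this.

If you build your entire application around Twisted, asynchronous communication with other processes, local or remote, becomes really elegant like this. On the other hand, if your program isn't built on top of Twisted, this isn't really going to be that helpful. Hopefully this is helpful to other readers, even if it isn't applicable to your particular application.

Related discussion

Use select & read(1).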

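import select
import subprocess  # no new requirements

def readAllSoFar(proc, retVal=b''):
    while select.select([proc.stdout], [], [], 0)[0] != []:
        ch = proc.stdout.read(1)
        if not ch:  # EOF: select() keeps reporting "readable", so stop here
            break
        retVal += ch
    return retVal

# bufsize=0 keeps Python from buffering ahead of what select() can see
p = subprocess.Popen(['/bin/ls'], stdout=subprocess.PIPE, bufsize=0)
while p.poll() is None:  # "not p.poll()" is also true for exit status 0 and would never stop
    print(readAllSoFar(p))

For readline()-like behavior:

lines = [b'']
while p.poll() is None:
    lines = readAllSoFar(p, lines[-1]).split(b'\n')
    for a in range(len(lines) - 1):
        print(lines[a])
# pick up whatever was written just before the process exited
lines = readAllSoFar(p, lines[-1]).split(b'\n')
for a in range(len(lines) - 1):
    print(lines[a])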

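Related discussion

One solution is to make another process perform your read of the process, or to make a thread of the process with a timeout.

Here's the threaded version of a timeout function:

http://code.activestate.com/recipes/473878/

However, do you need to read the stdout as it's coming in?
Another solution may be to dump the output to a file and wait for the process to finish using p.wait().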

f = open('myprogram_output.txt','w')
p = subprocess.Popen('myprogram.exe', stdout=f)
p.wait()
f.close()

str = open('myprogram_output.txt','r').read()

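Related discussion

Disclaimer: this works only for Tornado.

You can do this by setting the fd to be non-blocking and then using ioloop to register callbacks. I have packaged this in an egg called tornado_subprocess, and you can install it via PyPI:

easy_install tornado_subprocess

Now you can do something like this: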

import tornado_subprocess
import tornado.ioloop

def print_res(status, stdout, stderr):
    print status, stdout, stderr
    if status == 0:
        print "OK:"
        print stdout
    else:
        print "ERROR:"
        print stderr

t = tornado_subprocess.Subprocess(print_res, timeout=30, args=["cat", "/etc/passwd"])
t.start()
tornado.ioloop.IOLoop.instance().start()

You can also use it with a RequestHandler:

class MyHandler(tornado.web.RequestHandler):
    def on_done(self, status, stdout, stderr):
        self.write(stdout)
        self.finish()

    @tornado.web.asynchronous
    def get(self):
        t = tornado_subprocess.Subprocess(self.on_done, timeout=30, args=["cat", "/etc/passwd"])
        t.start()

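Related discussion

Existing solutions did not work for me (details below). What finally worked was to implement readline using read(1) (based on this answer). The latter does not block: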

from subprocess import Popen, PIPE
from threading import Thread

def process_output(myprocess):  # output-consuming thread
    nextline = None
    buf = ''
    while True:
        #--- extract line using read(1)
        out = myprocess.stdout.read(1)
        if out == '' and myprocess.poll() is not None: break
        if out != '':
            buf += out
            if out == '\n':
                nextline = buf
                buf = ''
        if not nextline: continue
        line = nextline
        nextline = None

        #--- do whatever you want with line here
        print('Line is: ' + line)
    myprocess.stdout.close()

# universal_newlines=True: text mode, so read(1) returns str on Python 3 as well
myprocess = Popen('myprogram.exe', stdout=PIPE, universal_newlines=True)  # output-producing process
p1 = Thread(target=process_output, args=(myprocess,))  # output-consuming thread
p1.daemon = True
p1.start()

#--- do whatever here and then kill process and thread if needed
if myprocess.poll() is None:  # kill process; will automatically stop thread
    myprocess.kill()
    myprocess.wait()
if p1 and p1.is_alive():  # wait for thread to finish
    p1.join()

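Why existing solutions did not work:

Solutions that require readline (including the Queue-based ones) always block. It is difficult (impossible?) to kill the thread that executes readline. It only gets killed when the process that created it finishes, but not when the output-producing process is killed.

Mixing low-level fcntl with high-level readline calls may not work properly, as aonnn has pointed out.

Using select.poll() is neat, but doesn't work on Windows according to the Python docs.

Using third-party libraries seems overkill for this task and adds additional dependencies.

Related discussion

This version of non-blocking read doesn't require special modules and will work out of the box on the majority of Linux distros.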

import os
import sys
import time
import fcntl
import subprocess

def async_read(fd):
    # set non-blocking flag while preserving old flags
    fl = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)
    # read char until EOF hit
    while True:
        try:
            ch = os.read(fd.fileno(), 1)
            # EOF
            if not ch: break
            # write the raw byte through; sys.stdout.write() rejects bytes on Python 3
            os.write(sys.stdout.fileno(), ch)
        except OSError:
            # waiting for data be available on fd
            pass

def shell(args, use_async=True):  # "async" is a reserved word since Python 3.7
    # merge stderr and stdout
    proc = subprocess.Popen(args, shell=False, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    if use_async: async_read(proc.stdout)
    sout, serr = proc.communicate()
    return (sout, serr)

if __name__ == '__main__':
    cmd = 'ping 8.8.8.8'
    sout, serr = shell(cmd.split())

I add this answer for reading some subprocess.Popen stdout.
Here is my non-blocking read solution:

import fcntl
import os

def non_block_read(output):
    fd = output.fileno()
    fl = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)
    try:
        return output.read()
    except:
        return ""

# Use example
from subprocess import *
sb = Popen("echo test && sleep 1000", shell=True, stdout=PIPE)
sb.kill()

# sb.stdout.read() # <-- This will block
non_block_read(sb.stdout)
# 'test\n'

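Related discussion

Here is my code, used to catch every output from the subprocess ASAP, including partial lines. It pumps stdout and stderr at the same time, in the same order.

Tested and correctly worked on Python 2.7, Linux & Windows.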

#!/usr/bin/python
#
# Runner with stdout/stderr catcher
#
from sys import argv
from subprocess import Popen, PIPE
import os, io
from threading import Thread
import Queue

def __main__():
    if (len(argv) > 1) and (argv[-1] == "-sub-"):
        import time, sys
        print "Application runned!"
        time.sleep(2)
        print "Slept 2 second"
        time.sleep(1)
        print "Slept 1 additional second",
        time.sleep(2)
        sys.stderr.write("Stderr output after 5 seconds")
        print "Eol on stdin"
        sys.stderr.write("Eol on stderr\n")
        time.sleep(1)
        print "Wow, we have end of work!",
    else:
        os.environ["PYTHONUNBUFFERED"] = "1"
        try:
            p = Popen(argv + ["-sub-"],
                      bufsize=0,  # unbuffered (1 would be line-buffered)
                      stdin=PIPE, stdout=PIPE, stderr=PIPE)
        except WindowsError, W:
            if W.winerror == 193:
                p = Popen(argv + ["-sub-"],
                          shell=True,  # Try to run via shell
                          bufsize=0,  # unbuffered (1 would be line-buffered)
                          stdin=PIPE, stdout=PIPE, stderr=PIPE)
            else:
                raise
        inp = Queue.Queue()
        sout = io.open(p.stdout.fileno(), 'rb', closefd=False)
        serr = io.open(p.stderr.fileno(), 'rb', closefd=False)

        def Pump(stream, category):
            queue = Queue.Queue()

            def rdr():
                # reader thread: forward raw chunks, None marks EOF
                while True:
                    buf = stream.read1(8192)
                    if len(buf) > 0:
                        queue.put(buf)
                    else:
                        queue.put(None)
                        return

            def clct():
                # collector thread: merge chunks that arrive within 5 ms of each other
                active = True
                while active:
                    r = queue.get()
                    try:
                        while True:
                            r1 = queue.get(timeout=0.005)
                            if r1 is None:
                                active = False
                                break
                            else:
                                r += r1
                    except Queue.Empty:
                        pass
                    inp.put((category, r))

            for tgt in [rdr, clct]:
                th = Thread(target=tgt)
                th.setDaemon(True)
                th.start()

        Pump(sout, 'stdout')
        Pump(serr, 'stderr')

        while p.poll() is None:
            # App still working
            try:
                chan, line = inp.get(timeout=1.0)
                if chan == 'stdout':
                    print "STDOUT>>", line, "<?<"
                elif chan == 'stderr':
                    print " ERROR==", line, "=?="
            except Queue.Empty:
                pass
        print "Finish"

if __name__ == '__main__':
    __main__()

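Related discussion

Adding this answer here since it provides the ability to set non-blocking pipes on Windows and Unix.

All the ctypes details are thanks to @techtonik's answer.

There is a slightly modified version to be used both on Unix and Windows systems.

Python 3 compatible (only minor change needed).
Includes a POSIX version, and defines an exception to use for either.

This way you can use the same function and exception for Unix and Windows code.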

# pipe_non_blocking.py (module)
"""
Example use:

    p = subprocess.Popen(
            command,
            stdout=subprocess.PIPE,
            )

    pipe_non_blocking_set(p.stdout.fileno())

    try:
        data = os.read(p.stdout.fileno(), 1)
    except PortableBlockingIOError as ex:
        if not pipe_non_blocking_is_error_blocking(ex):
            raise ex
"""

__all__ = (
    "pipe_non_blocking_set",
    "pipe_non_blocking_is_error_blocking",
    "PortableBlockingIOError",
    )

import os

if os.name == "nt":
    def pipe_non_blocking_set(fd):
        # Constant could define globally but avoid polluting the name-space
        # thanks to: https://stackoverflow.com/questions/34504970
        import msvcrt

        from ctypes import windll, byref, wintypes, WinError, POINTER
        from ctypes.wintypes import HANDLE, DWORD, BOOL

        LPDWORD = POINTER(DWORD)

        PIPE_NOWAIT = wintypes.DWORD(0x00000001)

        def pipe_no_wait(pipefd):
            SetNamedPipeHandleState = windll.kernel32.SetNamedPipeHandleState
            SetNamedPipeHandleState.argtypes = [HANDLE, LPDWORD, LPDWORD, LPDWORD]
            SetNamedPipeHandleState.restype = BOOL

            h = msvcrt.get_osfhandle(pipefd)

            res = windll.kernel32.SetNamedPipeHandleState(h, byref(PIPE_NOWAIT), None, None)
            if res == 0:
                print(WinError())
                return False
            return True

        return pipe_no_wait(fd)

    def pipe_non_blocking_is_error_blocking(ex):
        if not isinstance(ex, PortableBlockingIOError):
            return False
        from ctypes import GetLastError
        ERROR_NO_DATA = 232

        return (GetLastError() == ERROR_NO_DATA)

    PortableBlockingIOError = OSError
else:
    def pipe_non_blocking_set(fd):
        import fcntl
        fl = fcntl.fcntl(fd, fcntl.F_GETFL)
        fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)
        return True

    def pipe_non_blocking_is_error_blocking(ex):
        if not isinstance(ex, PortableBlockingIOError):
            return False
        return True

    PortableBlockingIOError = BlockingIOError

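To avoid reading incomplete data, I ended up writing my own readline generator (which returns the byte string for each line).

It's a generator, so you can for example...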

def non_blocking_readlines(f, chunk=1024):
    """
    Iterate over lines, yielding b'' when nothings left
    or when new data is not yet available.

        stdout_iter = iter(non_blocking_readlines(process.stdout))

        line = next(stdout_iter)  # will be a line or b''.
    """
    import os

    from .pipe_non_blocking import (
        pipe_non_blocking_set,
        pipe_non_blocking_is_error_blocking,
        PortableBlockingIOError,
    )

    fd = f.fileno()
    pipe_non_blocking_set(fd)

    blocks = []

    while True:
        try:
            data = os.read(fd, chunk)
            if not data:
                # case were reading finishes with no trailing newline
                yield b''.join(blocks)
                blocks.clear()
        except PortableBlockingIOError as ex:
            if not pipe_non_blocking_is_error_blocking(ex):
                raise ex

            yield b''
            continue

        while True:
            n = data.find(b'\n')
            if n == -1:
                break

            yield b''.join(blocks) + data[:n + 1]
            data = data[n + 1:]
            blocks.clear()
        blocks.append(data)

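Related discussion

I have the original questioner's problem, but did not wish to invoke threads. I mixed Jesse's solution with a direct read() from the pipe and my own buffer handler for line reads (however, my subprocess, ping, always wrote full lines smaller than a system page size). I avoid busy-waiting by only reading in a gobject-registered io watch. These days I usually run code within a gobject MainLoop to avoid threads.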

import fcntl
import os
import subprocess

import gobject

def set_up_ping(ip, w):
    # run the sub-process
    # watch the resultant pipe
    p = subprocess.Popen(['/bin/ping', ip], stdout=subprocess.PIPE)
    # make stdout a non-blocking file
    fl = fcntl.fcntl(p.stdout, fcntl.F_GETFL)
    fcntl.fcntl(p.stdout, fcntl.F_SETFL, fl | os.O_NONBLOCK)
    stdout_gid = gobject.io_add_watch(p.stdout, gobject.IO_IN, w)
    return stdout_gid  # for shutting down

The watcher is

def watch(f, *other):
    print 'reading', f.read()
    return True

The main program sets up a ping and then calls the gobject main loop.

def main():
    set_up_ping('192.168.1.8', watch)
    # discard gid as unused here
    gobject.MainLoop().run()

Any other work is attached to callbacks in gobject.

Why bother with threads and queues?
Unlike readline(), BufferedReader.read1() won't block waiting for \r\n; it returns ASAP if there is any output coming in.

#!/usr/bin/python
from subprocess import Popen, PIPE, STDOUT
import io

def __main__():
    try:
        p = Popen(["ping", "-n", "3", "127.0.0.1"], stdin=PIPE, stdout=PIPE, stderr=STDOUT)
    except: print("Popen failed"); quit()
    sout = io.open(p.stdout.fileno(), 'rb', closefd=False)
    while True:
        buf = sout.read1(1024)
        if len(buf) == 0: break
        print buf,

if __name__ == '__main__':
    __main__()
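
On Python 3 the extra io.open() wrapper should not be needed, because p.stdout is already an io.BufferedReader when the default buffering is used. A minimal sketch of the same read1() loop follows; the child command is a stand-in that writes a partial line with no newline. Keep in mind that read1() still blocks while there is no output at all; it only avoids waiting for a full line or a full buffer.

```python
import sys
from subprocess import PIPE, Popen

# Hypothetical stand-in child: writes text without a trailing newline.
CHILD = [sys.executable, "-c",
         "import sys; sys.stdout.write('partial'); sys.stdout.flush()"]

chunks = []
with Popen(CHILD, stdout=PIPE) as proc:
    while True:
        chunk = proc.stdout.read1(1024)  # returns as soon as some bytes arrive
        if not chunk:                    # b'' means end of stream
            break
        chunks.append(chunk)
output = b"".join(chunks)
```

A readline() loop would hold on to 'partial' until the pipe is closed; read1() hands it over as soon as it is written.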

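Related discussion

In my case I needed a logging module that catches the output from background applications and augments it (adding time-stamps, colors, etc.).

I ended up with a background thread that does the actual I/O. The following code is only for POSIX platforms. I stripped non-essential parts.

If someone is going to use this beast for long runs, consider managing open descriptors. In my view, this is not a big problem.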

# -*- python -*-
import fcntl
import threading
import sys, os, errno
import subprocess

class Logger(threading.Thread):
    def __init__(self, *modules):
        threading.Thread.__init__(self)
        try:
            from select import epoll, EPOLLIN
            self.__poll = epoll()
            self.__evt = EPOLLIN
            self.__to = -1
        except:
            from select import poll, POLLIN
            print 'epoll is not available'
            self.__poll = poll()
            self.__evt = POLLIN
            self.__to = 100
        self.__fds = {}
        self.daemon = True
        self.start()

    def run(self):
        while True:
            events = self.__poll.poll(self.__to)
            for fd, ev in events:
                if (ev&self.__evt) != self.__evt:
                    continue
                try:
                    self.__fds[fd].run()
                except Exception, e:
                    print e

    def add(self, fd, log):
        assert not self.__fds.has_key(fd)
        self.__fds[fd] = log
        self.__poll.register(fd, self.__evt)

class log:
    logger = Logger()

    def __init__(self, name):
        self.__name = name
        self.__piped = False

    def fileno(self):
        if self.__piped:
            return self.write
        self.read, self.write = os.pipe()
        fl = fcntl.fcntl(self.read, fcntl.F_GETFL)
        fcntl.fcntl(self.read, fcntl.F_SETFL, fl | os.O_NONBLOCK)
        self.fdRead = os.fdopen(self.read)
        self.logger.add(self.read, self)
        self.__piped = True
        return self.write

    def __run(self, line):
        self.chat(line, nl=False)

    def run(self):
        while True:
            try: line = self.fdRead.readline()
            except IOError, exc:
                if exc.errno == errno.EAGAIN:
                    return
                raise
            self.__run(line)

    def chat(self, line, nl=True):
        if nl: nl = '\n'
        else: nl = ''
        sys.stdout.write('[%s] %s%s' % (self.__name, line, nl))

def system(command, param=[], cwd=None, env=None, input=None, output=None):
    args = [command] + param
    p = subprocess.Popen(args, cwd=cwd, stdout=output, stderr=output, stdin=input, env=env, bufsize=0)
    p.wait()

ls = log('ls')
ls.chat('go')
system("ls", ['-l', '/'], output=ls)

date = log('date')
date.chat('go')
system("date", output=date)

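The select module helps you determine where the next useful input is.

However, you're almost always happier with separate threads. One does a blocking read of stdin, another does a blocking read wherever it is you don't want to be blocked.

Related discussion

Things are a lot better in modern Python.

Here's a simple child program, "hello.py":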

#!/usr/bin/env python3

while True:
    i = input()
    if i == "quit":
        break
    print(f"hello {i}")

And a program to interact with it:

import asyncio

async def main():
    proc = await asyncio.subprocess.create_subprocess_exec(
        "./hello.py", stdin=asyncio.subprocess.PIPE, stdout=asyncio.subprocess.PIPE
    )
    proc.stdin.write(b"bob\n")
    print(await proc.stdout.read(1024))
    proc.stdin.write(b"alice\n")
    print(await proc.stdout.read(1024))
    proc.stdin.write(b"quit\n")
    await proc.wait()

asyncio.run(main())

That prints out:

b'hello bob\n'
b'hello alice\n'

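Note that the actual pattern, which is also the pattern of almost all of the previous answers, both here and in related questions, is to set the child's stdout file descriptor to non-blocking and then poll it in some sort of select loop. These days, of course, that loop is provided by asyncio.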
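
For the curious, that underlying pattern can be written out by hand with the stdlib selectors module. This is a POSIX-only sketch (on Windows, selectors work with sockets only), it assumes Python 3.5+ for os.set_blocking, and the child command is a stand-in rather than hello.py.

```python
import os
import selectors
import sys
from subprocess import PIPE, Popen

# Hypothetical stand-in child: prints one line and exits.
CHILD = [sys.executable, "-c", "print('hello bob', flush=True)"]

proc = Popen(CHILD, stdout=PIPE)
fd = proc.stdout.fileno()
os.set_blocking(fd, False)            # step 1: make the pipe non-blocking

received = bytearray()
with selectors.DefaultSelector() as sel:
    sel.register(fd, selectors.EVENT_READ)
    eof = False
    while not eof:
        # step 2: wait (here at most 1 s per round) until the fd is readable
        for key, _events in sel.select(timeout=1.0):
            try:
                chunk = os.read(key.fd, 4096)
            except BlockingIOError:   # spurious wake-up, nothing to read
                continue
            if chunk:
                received += chunk
            else:                     # empty read: the child closed its stdout
                eof = True
proc.stdout.close()
proc.wait()
```

When select() times out, the loop body is the place where other work could be done; an event loop such as asyncio's does that bookkeeping for you.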

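I recently stumbled upon the same problem:
I need to read one line at a time from a stream (a tail run in a subprocess)
in non-blocking mode.
I wanted to avoid the following problems: burning CPU, reading the stream one byte at a time (as readline does), etc.

Here is my implementation:
https://gist.github.com/grubberr/5501e1a9760c3eab5e0a
It doesn't support Windows (poll) and doesn't handle EOF,
but it works well for me.

Related discussion

EDIT: This implementation still blocks. Use J.F.Sebastian's answer instead.

~~I tried the top answer, but the additional risk and maintenance of thread code was worrisome.~~

Looking through the io module (and being limited to 2.6), I found BufferedReader. This is my threadless, non-blocking solution.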

import io
import time
from subprocess import PIPE, Popen

p = Popen(['myprogram.exe'], stdout=PIPE)

SLEEP_DELAY = 0.001

# Create an io.BufferedReader on the file descriptor for stdout
with io.open(p.stdout.fileno(), 'rb', closefd=False) as buffer:
    while p.poll() == None:
        time.sleep(SLEEP_DELAY)
        # peek() can itself block while the pipe is empty (see the EDIT note above)
        while b'\n' in buffer.peek(io.DEFAULT_BUFFER_SIZE):
            line = buffer.readline()
            # do stuff with the line

    # Handle any remaining output after the process has ended
    while buffer.peek():
        line = buffer.readline()
        # do stuff with the line

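Related discussion

Have you tried for line in iter(p.stdout.readline, ""): # do stuff with the line? It is threadless (single thread) and blocks when your code blocks.

@j-f-sebastian Yeah, I eventually reverted to your answer. My implementation still occasionally blocked. I'll edit my answer to warn others not to go down this route.

I have created a library based on J. F. Sebastian's solution. You can use it.

https://github.com/cenkalti/what

Related discussion

Wow, this is a testing module.

Here is an example of running an interactive command in a subprocess, where stdout is made interactive by using a pseudo-terminal. You can refer to: https://stackoverflow.com/a/43012138/3555925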

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import os
import sys
import select
import termios
import tty
import pty
from subprocess import Popen

command = 'bash'
# command = 'docker run -it --rm centos /bin/bash'.split()

# save original tty setting then set it to raw mode
old_tty = termios.tcgetattr(sys.stdin)
tty.setraw(sys.stdin.fileno())

# open pseudo-terminal to interact with subprocess
master_fd, slave_fd = pty.openpty()

# use os.setsid() make it run in a new process group, or bash job control will not be enabled
p = Popen(command,
          preexec_fn=os.setsid,
          stdin=slave_fd,
          stdout=slave_fd,
          stderr=slave_fd,
          universal_newlines=True)

while p.poll() is None:
    r, w, e = select.select([sys.stdin, master_fd], [], [])
    if sys.stdin in r:
        d = os.read(sys.stdin.fileno(), 10240)
        os.write(master_fd, d)
    elif master_fd in r:
        o = os.read(master_fd, 10240)
        if o:
            os.write(sys.stdout.fileno(), o)

# restore tty settings back
termios.tcsetattr(sys.stdin, termios.TCSADRAIN, old_tty)

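My problem is a bit different, as I wanted to collect both stdout and stderr from a running process, but ultimately the same, since I wanted to render the output in a widget as it is generated.

I did not want to resort to many of the proposed workarounds using Queues or additional threads, as they should not be necessary to perform such a common task as running another script and collecting its output.

After reading the proposed solutions and the Python docs, I resolved my issue with the implementation below. Yes, it only works for POSIX, as I'm using the select function call.

I agree that the docs are confusing and the implementation is awkward for such a common scripting task. I believe that older versions of Python have different defaults for Popen and different explanations, so that created a lot of confusion. This seems to work well for both Python 2.7.12 and 3.5.2.

The key was to set bufsize=1 for line buffering and then universal_newlines=True to process the stream as a text file instead of a binary one, which seems to become the default when setting bufsize=1.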

class workerThread(QThread):
    def __init__(self, cmd):
        QThread.__init__(self)
        self.cmd = cmd
        self.result = None  ## return code
        self.error = None  ## flag indicates an error
        self.errorstr = ""  ## info message about the error

    def __del__(self):
        self.wait()
        DEBUG("Thread removed")

    def run(self):
        cmd_list = self.cmd.split(" ")
        try:
            cmd = subprocess.Popen(cmd_list, bufsize=1, stdin=None
                                   , universal_newlines=True
                                   , stderr=subprocess.PIPE
                                   , stdout=subprocess.PIPE)
        except OSError:
            self.error = 1
            self.errorstr = "Failed to execute " + self.cmd
            ERROR(self.errorstr)
            return  # nothing to read from if the process failed to start
        finally:
            VERBOSE("task started...")
        import select
        while True:
            try:
                # block until stdout or stderr has something to read
                r, w, x = select.select([cmd.stdout, cmd.stderr], [], [])
                if cmd.stderr in r:
                    line = cmd.stderr.readline()
                    if line != "":
                        line = line.strip()
                        self.emit(SIGNAL("update_error(QString)"), line)
                if cmd.stdout in r:
                    line = cmd.stdout.readline()
                    if line == "":
                        break
                    line = line.strip()
                    self.emit(SIGNAL("update_output(QString)"), line)
            except IOError:
                pass
        cmd.wait()
        self.result = cmd.returncode
        if self.result < 0:
            self.error = 1
            self.errorstr = "Task terminated by signal " + str(self.result)
            ERROR(self.errorstr)
            return
        if self.result:
            self.error = 1
            self.errorstr = "exit code " + str(self.result)
            ERROR(self.errorstr)
            return
        return

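ERROR, DEBUG and VERBOSE are simply macros that print output to the terminal.

This solution is IMHO 99.99% effective, as it still uses the blocking readline function, so we assume the subprocess is nice and outputs complete lines.

I welcome feedback to improve the solution, as I am still new to Python.

Related discussion

In this particular case, you can set stderr=subprocess.STDOUT in the Popen constructor and get all output from cmd.stdout.readline().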
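
A minimal sketch of that suggestion, outside of Qt: with stderr=subprocess.STDOUT there is only one pipe to read, so no select() over two streams is needed. The child command is a stand-in that writes one line to each stream and flushes after each write.

```python
import subprocess
import sys

# Hypothetical stand-in child: one line on stdout, then one on stderr.
CHILD = [sys.executable, "-c",
         "import sys; print('out', flush=True); "
         "print('err', file=sys.stderr, flush=True)"]

proc = subprocess.Popen(CHILD,
                        stdout=subprocess.PIPE,
                        stderr=subprocess.STDOUT,  # stderr goes into the stdout pipe
                        universal_newlines=True)   # text mode: lines are str
with proc.stdout:
    merged = [line.rstrip("\n") for line in proc.stdout]
returncode = proc.wait()
```

The price is that you can no longer tell which stream a line came from, so this does not fit the case where errors are routed to a separate signal.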

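This solution uses the select module to "read any available data" from an IO stream. This function blocks initially until data is available, but then reads only the data that is available and doesn't block further.

Given that it uses the select module, this only works on Unix.

The code is fully PEP8-compliant.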

import select


def read_available(input_stream, max_bytes=None):
    """
    Blocks until any data is available, then all available data is then read and returned.
    This function returns an empty string when end of stream is reached.

    Args:
        input_stream: The stream to read from.
        max_bytes (int|None): The maximum number of bytes to read. This function may return fewer bytes than this.

    Returns:
        str
    """
    # Prepare local variables
    input_streams = [input_stream]
    empty_list = []
    read_buffer = ""

    # Initially block for input using 'select'
    if len(select.select(input_streams, empty_list, empty_list)[0]) > 0:

        # Poll read-readiness using 'select'
        def select_func():
            return len(select.select(input_streams, empty_list, empty_list, 0)[0]) > 0

        # Create while function based on parameters
        if max_bytes is not None:
            def while_func():
                return (len(read_buffer) < max_bytes) and select_func()
        else:
            while_func = select_func

        while True:
            # Read single byte at a time
            read_data = input_stream.read(1)
            if len(read_data) == 0:
                # End of stream
                break
            # Append byte to string buffer
            read_buffer += read_data
            # Check if more data is available
            if not while_func():
                break

    # Return read buffer
    return read_buffer

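I also faced the problem described by Jesse and solved it by using "select" as Bradley, Andy and others did, but in a blocking mode to avoid a busy loop. It uses a dummy pipe as a fake stdin. The select blocks and waits for either stdin or the pipe to be ready. When a key is pressed, stdin unblocks the select and the key value can be retrieved with read(1). When a different thread writes to the pipe, the pipe unblocks the select, and that can be taken as an indication that the need for stdin is over. Here is some reference code: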

import sys
import os
from select import select

# -------------------------------------------------------------------------
# Set the pipe (fake stdin) to simulate a final key stroke
# which will unblock the select statement
readEnd, writeEnd = os.pipe()
readFile = os.fdopen(readEnd)
writeFile = os.fdopen(writeEnd, "w")

# -------------------------------------------------------------------------
def getKey():

    # Wait for stdin or pipe (fake stdin) to be ready
    dr, dw, de = select([sys.__stdin__, readFile], [], [])

    # If stdin is the one ready then read it and return value
    if sys.__stdin__ in dr:
        return sys.__stdin__.read(1)  # For Windows use ----> getch() from module msvcrt

    # Must finish
    else:
        return None

# -------------------------------------------------------------------------
def breakStdinRead():
    writeFile.write(' ')
    writeFile.flush()

# -------------------------------------------------------------------------
# MAIN CODE

# Get key stroke
key = getKey()

# Keyboard input
if key:
    pass  # ... do your stuff with the key value

# Faked keystroke
else:
    pass  # ... use of stdin finished

# -------------------------------------------------------------------------
# OTHER THREAD CODE

breakStdinRead()

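Related discussion

Note: in order to make this work on Windows, the pipe should be replaced by a socket. I haven't tried it yet, but it should work according to the documentation.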
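
A sketch of what the socket variant of the wake-up channel could look like, using socket.socketpair() (available on Windows since Python 3.5). Only the wake-up side is shown; the names wake_recv, wake_send and wait_for_wakeup are made up for this example. On Windows, select() accepts sockets only, so stdin itself would still need something like msvcrt there.

```python
import select
import socket

# Connected socket pair standing in for the os.pipe() "fake stdin".
wake_recv, wake_send = socket.socketpair()


def wait_for_wakeup(timeout):
    """Return True if another thread signalled via wake_send within timeout."""
    readable, _, _ = select.select([wake_recv], [], [], timeout)
    if wake_recv in readable:
        wake_recv.recv(1)   # consume the signal byte
        return True
    return False


before = wait_for_wakeup(0)     # nothing sent yet, returns at once
wake_send.send(b" ")            # what breakStdinRead() would do
after = wait_for_wakeup(1.0)

wake_recv.close()
wake_send.close()
```

In the reference code above, wake_recv would take the place of readFile in the select() call and wake_send.send() would replace the write and flush on writeFile.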

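Starting from J.F. Sebastian's answer and several other sources, I've put together a simple subprocess manager. It provides non-blocking reading on request, as well as running several processes in parallel. It doesn't use any OS-specific calls (that I'm aware of) and thus should work anywhere.

It's available from PyPI, so just pip install shelljob. Refer to the project page for examples and full docs.

Here is a module that supports non-blocking reads and background writes in Python:

https://pypi.python.org/pypi/python-nonblock

It provides a function,

nonblock_read, which will read data from the stream if available, otherwise return an empty string (or None if the stream is closed on the other side and all possible data has been read).

You may also consider the python-subprocess2 module,

https://pypi.python.org/pypi/python-subprocess2

which adds to the subprocess module. So on the object returned from "subprocess.Popen" an additional method, runInBackground, is added. This starts a thread and returns an object that is automatically populated as stuff is written to stdout/stderr, without blocking your main thread.

Enjoy!

Related discussion

I'd like to try this nonblock module, but I'm relatively new to some Linux procedures. Exactly how do I install these routines? I'm running Raspbian Jessie, a flavor of Debian Linux for the Raspberry Pi. I tried 'sudo apt-get install nonblock' and python-nonblock and both threw an error: not found. I downloaded the zip file from this site, pypi.python.org/pypi/python-nonblock, but don't know what to do with it. Thanks.... RDK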
