
Asynchronous Requests with Python requests

I tried the sample provided within the documentation of the requests library for Python.

With async.map(rs) I get the response codes, but I want to get the content of each page requested. This, for example, does not work:

out = async.map(rs)
print out[0].content


Note

The below answer is not applicable to requests v0.13.0+. The asynchronous functionality was moved to grequests after this question was written. However, you could just replace requests with grequests below and it should work.

I've left this answer as is to reflect the original question, which was about using requests < v0.13.0.

To do multiple tasks with async.map asynchronously you have to:

  • Define a function for what you want to do with each object (your task)
  • Add that function as an event hook in your request
  • Call async.map on a list of all the requests/actions

Example:

    from requests import async
    # If using requests > v0.13.0, use
    # from grequests import async

    urls = [
        'http://python-requests.org',
        'http://httpbin.org',
        'http://python-guide.org',
        'http://kennethreitz.com'
    ]

    # A simple task to do to each response object
    def do_something(response):
        print(response.url)

    # A list to hold our things to do via async
    async_list = []

    for u in urls:
        # The "hooks = {..." part is where you define what you want to do
        #
        # Note the lack of parentheses following do_something; this is
        # because the response will be used as the first argument automatically
        action_item = async.get(u, hooks={'response': do_something})

        # Add the task to our list of things to do via async
        async_list.append(action_item)

    # Do our list of things to do via async
    async.map(async_list)


    async is now an independent module: grequests.

    See here: https://github.com/kennethreitz/grequests

    And there: Ideal method for sending multiple HTTP requests over Python?

    Installation:

    $ pip install grequests

    Usage:

    Build a stack:

    import grequests

    urls = [
        'http://www.heroku.com',
        'http://tablib.org',
        'http://httpbin.org',
        'http://python-requests.org',
        'http://kennethreitz.com'
    ]

    rs = (grequests.get(u) for u in urls)

    Send the stack:

    grequests.map(rs)

    The result looks like:

    [<Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>]

    grequests doesn't seem to set a limit for concurrent requests, i.e. when multiple requests are sent to the same server.


    I tested both requests-futures and grequests. grequests is faster, but brings monkey patching and additional problems with dependencies. requests-futures is several times slower than grequests. I decided to write my own and simply wrap requests into a ThreadPoolExecutor; it was almost as fast as grequests, but without external dependencies.

    import requests
    import concurrent.futures

    def get_urls():
        return ["url1", "url2"]

    def load_url(url, timeout):
        return requests.get(url, timeout=timeout)

    # Counters must be initialized before the loop increments them
    resp_err = 0
    resp_ok = 0

    with concurrent.futures.ThreadPoolExecutor(max_workers=20) as executor:
        future_to_url = {executor.submit(load_url, url, 10): url for url in get_urls()}
        for future in concurrent.futures.as_completed(future_to_url):
            url = future_to_url[future]
            try:
                data = future.result()
            except Exception as exc:
                resp_err = resp_err + 1
            else:
                resp_ok = resp_ok + 1
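    Since the snippet above elides the network part, here is a self-contained sketch of the same ThreadPoolExecutor pattern with a sleep-based stand-in for requests.get (the load function, tags, and delays are made up for illustration). It shows that as_completed yields futures in completion order, not submission order:

    ```python
    import concurrent.futures
    import time

    def load(tag, delay):
        # stand-in for requests.get(url, timeout=...): sleeps instead of doing I/O
        time.sleep(delay)
        return tag

    # With two workers both tasks run concurrently; as_completed
    # yields the faster one first even though it was submitted second.
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
        futures = [executor.submit(load, "slow", 0.2),
                   executor.submit(load, "fast", 0.05)]
        done_order = [f.result() for f in concurrent.futures.as_completed(futures)]

    print(done_order)  # -> ['fast', 'slow']
    ```

    The same completion-order iteration is what lets the answer above count successes and errors as each response arrives.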


    Maybe requests-futures is another choice.

    from requests_futures.sessions import FuturesSession

    session = FuturesSession()
    # first request is started in background
    future_one = session.get('http://httpbin.org/get')
    # second request is started immediately
    future_two = session.get('http://httpbin.org/get?foo=bar')
    # wait for the first request to complete, if it hasn't already
    response_one = future_one.result()
    print('response one status: {0}'.format(response_one.status_code))
    print(response_one.content)
    # wait for the second request to complete, if it hasn't already
    response_two = future_two.result()
    print('response two status: {0}'.format(response_two.status_code))
    print(response_two.content)

    It is also recommended in the official documentation. If you don't want to involve gevent, it's a good option.
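    The futures returned by FuturesSession are standard concurrent.futures.Future objects, so the same submit/result pattern can be sketched with a plain ThreadPoolExecutor and a stand-in for the HTTP call (fake_get and the dict it returns are invented for illustration; a real session would return Response objects):

    ```python
    from concurrent.futures import ThreadPoolExecutor

    def fake_get(url):
        # made-up stand-in for requests.get(url); returns a plain dict
        return {"url": url, "status_code": 200}

    executor = ThreadPoolExecutor(max_workers=2)
    # both calls are started in the background immediately
    future_one = executor.submit(fake_get, "http://httpbin.org/get")
    future_two = executor.submit(fake_get, "http://httpbin.org/get?foo=bar")

    # .result() blocks until the corresponding call has completed
    response_one = future_one.result()
    response_two = future_two.result()
    print(response_one["status_code"], response_two["url"])
    executor.shutdown()
    ```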


    I know this has been closed for a while, but I thought it might be useful to promote another async solution built on the requests library.

    list_of_requests = ['http://moop.com', 'http://doop.com', ...]

    from simple_requests import Requests
    for response in Requests().swarm(list_of_requests):
        print(response.content)

    The docs are here: http://pythonhosted.org/simple-requests/


    from threading import Thread

    threads = list()

    for requestURI in requests:
        t = Thread(target=self.openURL, args=(requestURI,))
        t.start()
        threads.append(t)

    for thread in threads:
        thread.join()

    ...

    def openURL(self, requestURI):
        o = urllib2.urlopen(requestURI, timeout=600)
        o...
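    The fragment above depends on surrounding class context (self, the requests list). A self-contained sketch of the same fan-out/join pattern, with a made-up stand-in for urllib2.urlopen so it runs without a network (the "fetched:" prefix and URIs are invented):

    ```python
    import threading

    results = []
    results_lock = threading.Lock()

    def open_url(uri):
        # stand-in for urllib2.urlopen(uri, timeout=600); just echoes the URI
        data = "fetched:" + uri
        with results_lock:  # guard the shared list while threads append
            results.append(data)

    threads = []
    for uri in ["http://a.example", "http://b.example"]:
        t = threading.Thread(target=open_url, args=(uri,))
        t.start()
        threads.append(t)

    # join() blocks until every worker thread has finished
    for t in threads:
        t.join()

    print(sorted(results))
    ```

    Starting all threads first and joining afterwards is what makes the requests overlap; joining inside the first loop would serialize them.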


    I have been using Python requests for async calls against GitHub's gist API for some time.

    For an example, see the code here:

    https://github.com/davidthewatson/flasgist/blob/master/views.py_l60-72

    This style of Python may not be the clearest example, but I can assure you that the code works. Let me know if this is confusing to you and I will document it.


    If you want to use asyncio, then requests-async provides async/await functionality for requests - https://github.com/encode/requests-async
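    With asyncio, the general fan-out pattern looks like the sketch below. fake_get is a made-up stand-in for an awaitable HTTP call (such as the one requests-async exposes), so no network access is assumed:

    ```python
    import asyncio

    async def fake_get(url):
        # stand-in for an awaitable HTTP call, e.g. await session.get(url)
        await asyncio.sleep(0.01)
        return "content of " + url

    async def main():
        urls = ["http://httpbin.org", "http://python-requests.org"]
        # gather runs the coroutines concurrently and preserves input order
        return await asyncio.gather(*(fake_get(u) for u in urls))

    results = asyncio.run(main())
    print(results)
    ```

    Unlike as_completed-style iteration, asyncio.gather returns the results in the same order as the inputs regardless of which request finishes first.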


    I have also tried some things using the asynchronous methods in Python; however, I have had much better luck using Twisted for asynchronous programming. It has fewer problems and is well documented. Here is a link to something similar to what you are trying, in Twisted:

    http://pythonquirks.blogspot.com/2011/04/twisted-asynchronous-http-request.html