Python: How could I use requests in asyncio?


I want to run parallel HTTP request tasks in asyncio, but I found that python-requests blocks asyncio's event loop. I found aiohttp, but it couldn't make HTTP requests through an HTTP proxy.

So I want to know if there is a way to do asynchronous HTTP requests with the help of asyncio.


To use requests (or any other blocking library) with asyncio, you can use BaseEventLoop.run_in_executor to run the function in another thread and yield from it to get the result. For example:

import asyncio
import requests

@asyncio.coroutine
def main():
    loop = asyncio.get_event_loop()
    # passing None runs the call in the loop's default ThreadPoolExecutor
    future1 = loop.run_in_executor(None, requests.get, 'http://www.google.com')
    future2 = loop.run_in_executor(None, requests.get, 'http://www.google.co.uk')
    response1 = yield from future1
    response2 = yield from future2
    print(response1.text)
    print(response2.text)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())

This will get both responses in parallel.

With Python 3.5 you can use the new async/await syntax:

import asyncio
import requests

async def main():
    loop = asyncio.get_event_loop()
    future1 = loop.run_in_executor(None, requests.get, 'http://www.google.com')
    future2 = loop.run_in_executor(None, requests.get, 'http://www.google.co.uk')
    response1 = await future1
    response2 = await future2
    print(response1.text)
    print(response2.text)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())

See PEP 492 for more information.


aiohttp can already be used with an HTTP proxy:

import asyncio
import aiohttp


@asyncio.coroutine
def do_request():
    proxy_url = 'http://localhost:8118'  # your proxy address
    response = yield from aiohttp.request(
        'GET', 'http://google.com',
        proxy=proxy_url,
    )
    return response

loop = asyncio.get_event_loop()
loop.run_until_complete(do_request())


The answers above still use the old Python 3.4 style coroutines. Here is what you would write if you have Python 3.5+.

aiohttp now supports HTTP proxies as well; a proxy-enabled variant is sketched after the example below.

import aiohttp
import asyncio

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()

async def main():
    urls = [
        'http://python.org',
        'https://google.com',
        'http://yifei.me',
    ]
    tasks = []
    async with aiohttp.ClientSession() as session:
        for url in urls:
            tasks.append(fetch(session, url))
        htmls = await asyncio.gather(*tasks)
        for html in htmls:
            print(html[:100])

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
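
The example above does not show the proxy part; with a modern aiohttp ClientSession you pass the proxy per request. A minimal sketch, reusing the hypothetical local proxy address from the older answer:

import asyncio
import aiohttp

async def fetch_via_proxy(url, proxy_url):
    async with aiohttp.ClientSession() as session:
        # the proxy argument routes this single request through the proxy
        async with session.get(url, proxy=proxy_url) as response:
            return await response.text()

loop = asyncio.get_event_loop()
html = loop.run_until_complete(
    fetch_via_proxy('http://python.org', 'http://localhost:8118'))
print(html[:100])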


requests does not currently support asyncio, and there are no plans to provide such support. You could probably implement a custom "Transport Adapter" (as described here) that knows how to use asyncio.
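
For reference, a transport adapter is something you subclass from requests.adapters.HTTPAdapter and mount on a Session. A minimal sketch of the mounting mechanism (the LoggingAdapter here is hypothetical; an asyncio-aware adapter would additionally have to reimplement send() on top of an asynchronous transport, which is the hard part):

import requests
from requests.adapters import HTTPAdapter

class LoggingAdapter(HTTPAdapter):
    # hypothetical adapter: logs each request, then defers to the
    # normal blocking implementation
    def send(self, request, **kwargs):
        print('sending', request.url)
        return super().send(request, **kwargs)

session = requests.Session()
session.mount('http://', LoggingAdapter())   # handles all http:// URLs
session.mount('https://', LoggingAdapter())
response = session.get('http://example.org/')
print(response.status_code)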

If I find myself with some time, I might actually look into it, but I can't promise anything.


There is a good example of async/await loops and threads in Pimin Konstantin Kefaloukos's article
Easy parallel HTTP requests with Python and asyncio:

To minimize the total completion time, we could increase the size of the thread pool to match the number of requests we have to make. Luckily, this is easy to do as we will see next. The code listing below is an example of how to make twenty asynchronous HTTP requests with a thread pool of twenty worker threads:

# Example 3: asynchronous requests with larger thread pool
import asyncio
import concurrent.futures
import requests

async def main():

    with concurrent.futures.ThreadPoolExecutor(max_workers=20) as executor:

        loop = asyncio.get_event_loop()
        futures = [
            loop.run_in_executor(
                executor,
                requests.get,
                'http://example.org/'
            )
            for i in range(20)
        ]
        for response in await asyncio.gather(*futures):
            pass  # responses are discarded in this example


loop = asyncio.get_event_loop()
loop.run_until_complete(main())