Python: Read Http Stream

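I am trying to read from a streaming API that sends its data using Chunked Transfer Encoding. Each chunk can contain more than one record, and records are separated by CRLF. The data is always sent gzip-compressed. I want to consume the feed and process it one piece at a time. I have gone through a bunch of Stack Overflow resources but could not find a way to do this in Python. In my case, iter_content with a chunk size throws an exception on this line: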

for chunk in api_response.iter_content(chunk_size=1024):

In Fiddler (which I am using as a proxy) I can see that data is being downloaded continuously, and when I do a "COMETPeek" in Fiddler I can actually see some sample JSON.
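For reference, here is a minimal sketch of the decoding that the format described above implies: gzip-compressed bytes arrive in arbitrary pieces, and a record (or even its CRLF terminator) may be split across two pieces. It assumes the bytes are still compressed when they reach the code; if the HTTP client already decompresses the body (as requests' iter_content does for gzip by default), the zlib step would have to be dropped. The name `iter_records` and the sample payload are made up for illustration, and the records are assumed to be JSON.

```python
import gzip
import json
import zlib


def iter_records(chunks):
    """Yield one parsed JSON record per CRLF-terminated line.

    `chunks` is any iterable of raw, still-compressed byte strings.
    """
    # 32 + MAX_WBITS lets zlib auto-detect a gzip (or zlib) header.
    decompressor = zlib.decompressobj(32 + zlib.MAX_WBITS)
    buffer = b""
    for chunk in chunks:
        buffer += decompressor.decompress(chunk)
        # Everything before the last CRLF is complete; the tail may be partial.
        *complete, buffer = buffer.split(b"\r\n")
        for line in complete:
            if line:  # skip blank keep-alive lines
                yield json.loads(line)
    # Flush the decompressor and emit a final record if the stream
    # ended without a trailing CRLF.
    buffer += decompressor.flush()
    for line in buffer.split(b"\r\n"):
        if line:
            yield json.loads(line)


if __name__ == "__main__":
    payload = b'{"id": 1}\r\n{"id": 2}\r\n{"id": 3}\r\n'
    compressed = gzip.compress(payload)
    # Cut the compressed bytes into 7-byte pieces to mimic network chunks.
    pieces = [compressed[i:i + 7] for i in range(0, len(compressed), 7)]
    for record in iter_records(pieces):
        print(record)
```

Keeping the incomplete tail in `buffer` is what makes the chunk size irrelevant: records are only parsed once their terminating CRLF has been seen.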

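Even iter_lines does not work. I have looked at the asyncio and aiohttp approach mentioned here: Why doesn't requests.get() return? What is the default timeout that requests.get() uses?

But I am not sure how to do the processing. As you can see, I have tried a bunch of Python libraries. Apologies if the code still imports some libraries that I later stopped using because they did not work out.

I have also looked at the documentation for the requests library but could not find anything substantial.

As mentioned above, below is sample code for what I am trying to do. Any pointers on how I should proceed would be highly appreciated.

This is the first time I am trying to read a stream.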

from oauthlib.oauth2 import BackendApplicationClient
from requests_oauthlib import OAuth2Session
import requests
import zlib
import json

READ_BLOCK_SIZE = 1024*8

clientID="ClientID"
clientSecret="ClientSecret"

# Route traffic through Fiddler running locally
proxies = {
    "https":"http://127.0.0.1:8888",
}

client = BackendApplicationClient(client_id=clientID)
oauth = OAuth2Session(client=client)

token = oauth.fetch_token(token_url='https://baseTokenURL/token', client_id=clientID,client_secret=clientSecret,proxies=proxies,verify=False)

auth_t=token['access_token']
#auth_t = accesstoken.encode("ascii","ignore")

headers = {
    # The scheme and the token must be separated by a space
    'authorization':"Bearer " + auth_t,
    'content-type':"application/json",
    'Accept-Encoding':"gzip",
}
dec=zlib.decompressobj(32 + zlib.MAX_WBITS)

try:
    # The first request is expected to answer with a redirect to the actual stream
    init_res = requests.get('https://BaseStreamURL/api/1/stream/specificStream', headers=headers, allow_redirects=False,proxies=proxies,verify=False)
    if init_res.status_code == 302:
        print(init_res.headers['Location'])
        api_response = requests.get(init_res.headers['Location'], headers=headers, allow_redirects=False,proxies=proxies,verify=False, timeout=20, stream=True,params={"smoothing":"1","smoothingBucketSize" :"180"})
        if api_response.status_code == 200:
            #api_response.raw.decode_content = True

            #print(api_response.raw.read(20))
            # A requests Response has no chunk_size attribute, so use the constant defined above
            for chunk in api_response.iter_content(chunk_size=READ_BLOCK_SIZE):
                #Parse the response
                pass
    elif init_res.status_code == 200:
        print(init_res.content)
except Exception as ce:
    print(ce)

UPDATE
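I am now looking at this: https://aiohttp.readthedocs.io/en/v0.20.0/client.html

Would this be the way to go?

In case someone finds this useful: I have found a way to stream from the API in Python using aiohttp. Below is the skeleton. Keep in mind that it is only a skeleton; it works in the sense that it keeps printing results as they arrive. If anyone has a better way of doing this, I am all ears and eyes, as this is the first time I am trying to capture a stream.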

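import aiohttp

async def fetch(session, url, headers):
    # Disable the total timeout: the stream is expected to stay open indefinitely
    # (this replaces the original no-op async_timeout.timeout(None) wrapper)
    no_timeout = aiohttp.ClientTimeout(total=None)
    # Use the url argument rather than the global init_res
    async with session.get(url, headers=headers, proxy="http://127.0.0.1:8888", allow_redirects=False, timeout=no_timeout) as r:
        while True:
            chunk = await r.content.read(1024*3)
            if not chunk:
                break
            print(chunk)

async def main(url, headers):
    async with aiohttp.ClientSession() as session:
        await fetch(session, url, headers)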

In the caller:

import asyncio

try:
    init_res = requests.get('https://BaseStreamURL/api/1/stream/specificStream', headers=headers, allow_redirects=False,proxies=proxies,verify=False)
    if init_res.status_code == 302:
        # Follow the redirect with the streaming aiohttp client
        loc = init_res.headers['Location']
        asyncio.run(main(loc, headers=headers))
    elif init_res.status_code == 200:
        print(init_res.content)
except Exception as ce:
    print(ce)
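The skeleton prints raw chunks, and a fixed-size read can cut a record in half. A possible next step, sketched here under assumptions, is an async generator that buffers the chunks and yields whole CRLF-separated records. The names `iter_json_records`, `fake_stream` and `collect` are hypothetical, `fake_stream` only stands in for the HTTP body so the example runs without a network, and the records are assumed to be JSON that has already been decompressed.

```python
import asyncio
import json


async def iter_json_records(chunk_source):
    """Reassemble CRLF-delimited JSON records from an async stream of bytes."""
    buffer = b""
    async for chunk in chunk_source:
        buffer += chunk
        # Keep the unterminated tail in the buffer for the next chunk.
        *complete, buffer = buffer.split(b"\r\n")
        for line in complete:
            if line:
                yield json.loads(line)
    # Emit a last record if the stream ended without a trailing CRLF.
    if buffer.strip():
        yield json.loads(buffer)


async def fake_stream():
    """Hypothetical stand-in for the response body, cut mid-record."""
    for piece in (b'{"a": 1}\r\n{"b"', b': 2}\r', b'\n{"c": 3}\r\n'):
        await asyncio.sleep(0)  # give control back to the event loop
        yield piece


async def collect():
    return [record async for record in iter_json_records(fake_stream())]


if __name__ == "__main__":
    print(asyncio.run(collect()))
```

With aiohttp, something like `r.content.iter_chunked(1024 * 3)` could probably be passed as `chunk_source` in place of `fake_stream()`.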