Python: Get Request Headers for urllib2.Request?

Is there a way to get the headers from a request created with urllib2, or to confirm the HTTP headers that urllib2.urlopen actually sent?


A simple way to see the request headers (and the response headers) is to enable debug output:

import urllib2
opener = urllib2.build_opener(urllib2.HTTPHandler(debuglevel=1))

You can then see the exact headers that are sent and received:

>>> opener.open('http://python.org')
send: 'GET / HTTP/1.1\r\nAccept-Encoding: identity\r\nHost: python.org\r\nConnection: close\r\nUser-Agent: Python-urllib/2.7\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Date: Tue, 14 Jun 2011 08:23:35 GMT
header: Server: Apache/2.2.16 (Debian)
header: Last-Modified: Mon, 13 Jun 2011 19:41:35 GMT
header: ETag: "105800d-486d-4a59d1b6699c0"
header: Accept-Ranges: bytes
header: Content-Length: 18541
header: Connection: close
header: Content-Type: text/html
header: X-Pad: avoid browser bug
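
The opener above only traces plain HTTP, and only for calls made through opener.open(). A minimal sketch of extending it, assuming you also want HTTPS traffic traced and want plain urlopen() calls to use the same opener. The helper name build_debug_opener and the urlreq alias are made up for illustration, and the Python 3 import fallback assumes urllib.request exposes the same handler classes:

```python
try:
    import urllib2 as urlreq          # Python 2, as used in the answer
except ImportError:
    import urllib.request as urlreq   # Python 3 (assumption: same handler API)


def build_debug_opener(level=1):
    """Return an opener whose HTTP and HTTPS handlers print the headers.

    build_debug_opener is a hypothetical helper, not part of urllib2.
    """
    handlers = [urlreq.HTTPHandler(debuglevel=level)]
    # HTTPSHandler is only defined when Python was built with SSL support.
    if hasattr(urlreq, 'HTTPSHandler'):
        handlers.append(urlreq.HTTPSHandler(debuglevel=level))
    return urlreq.build_opener(*handlers)


opener = build_debug_opener()
# Optional: route module-level urlopen() calls through this opener as well.
urlreq.install_opener(opener)
```

After install_opener(), a later urlopen('http://python.org') should print the same send/reply/header lines shown above. The debug output goes to stdout, so it is meant for interactive inspection rather than for programmatic checks.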

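Alternatively, you can set request headers on the urllib2.Request object before making the request. This overrides the default headers, although the defaults themselves do not appear in the headers dict beforehand: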

>>> req = urllib2.Request(url='http://python.org')
>>> req.add_header('User-Agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:5.0)')
>>> req.headers
{'User-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:5.0)'}
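
Note that the key comes back as 'User-agent', not 'User-Agent': add_header() stores the name capitalized. A small sketch of reading headers back from a Request without sending anything, using the same request as above. The urlreq alias and the variable names are illustrative, and the Python 3 fallback assumes urllib.request.Request behaves the same way:

```python
try:
    import urllib2 as urlreq          # Python 2, as used in the answer
except ImportError:
    import urllib.request as urlreq   # Python 3 (assumption: same Request API)

req = urlreq.Request(url='http://python.org')
req.add_header('User-Agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:5.0)')

# The name was stored via str.capitalize(), so look it up as 'User-agent'.
ua = req.get_header('User-agent')
has_ua = req.has_header('User-agent')

# Lookups are case-sensitive: the original spelling is not found.
missing = req.get_header('User-Agent')

# header_items() returns (name, value) pairs for every header on the request.
items = dict(req.header_items())
print(items)
```

This only reports headers attached to the Request object itself. Headers that the HTTP layer adds while sending, such as Connection: close in the trace above, are only visible through the debug output.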