URL encoding in Python

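Is there a simple way to do this in urllib or another library that I'm missing? URL encoding replaces unsafe ASCII characters with a "%" followed by two hexadecimal digits.

Here is an example of the input and the expected output: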

Mozilla/5.0 (Linux; U; Android 4.0; xx-xx; Galaxy Nexus Build/IFL10C) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30

Mozilla%2F5.0+%28Linux%3B+U%3B+Android+4.0%3B+xx-xx%3B+Galaxy+Nexus+Build%2FIFL10C%29+AppleWebKit%2F534.30+%28KHTML%2C+like+Gecko%29+Version%2F4.0+Mobile+Safari%2F534.30

For Python 2.x, use urllib.quote:

Replace special characters in string using the %xx escape. Letters, digits, and the characters '_.-' are never quoted. By default, this function is intended for quoting the path section of the URL. The optional safe parameter specifies additional characters that should not be quoted — its default value is '/'.

Example:

In [1]: import urllib

In [2]: urllib.quote('%')
Out[2]: '%25'
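The effect of the safe parameter described above can be sketched as follows. This is a minimal illustration using Python 3's urllib.parse.quote (the Python 2 urllib.quote is assumed to behave the same way for these ASCII inputs); the sample path is made up for the example.

```python
from urllib.parse import quote

path = "/docs/my file.txt"  # hypothetical sample path

# With the default safe='/', slashes are left alone so path separators survive.
default_quoted = quote(path)

# Passing safe='' makes quote() escape slashes as well.
fully_quoted = quote(path, safe="")

print(default_quoted)  # /docs/my%20file.txt
print(fully_quoted)    # %2Fdocs%2Fmy%20file.txt
```

Note that quote turns a space into %20, not into a plus sign.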

Edit:

In your case, to replace spaces with plus signs, you can use urllib.quote_plus.

Example:

In [4]: urllib.quote_plus('a b')
Out[4]: 'a+b'

For Python 3.x, use urllib.parse.quote:

>>> import urllib.parse
>>> a = "asdas#@das"
>>> urllib.parse.quote(a)
'asdas%23%40das'

For strings that contain spaces, use quote_plus:

>>> import urllib.parse
>>> a = "as da& s#@das"
>>> urllib.parse.quote_plus(a)
'as+da%26+s%23%40das'
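Applied to the user-agent string from the question, quote_plus should reproduce the expected output exactly. Here is a short check, assuming Python 3; the variable names are only illustrative.

```python
from urllib.parse import quote_plus

# Input and expected output copied from the question.
user_agent = (
    "Mozilla/5.0 (Linux; U; Android 4.0; xx-xx; Galaxy Nexus Build/IFL10C) "
    "AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30"
)
expected = (
    "Mozilla%2F5.0+%28Linux%3B+U%3B+Android+4.0%3B+xx-xx%3B+Galaxy+Nexus+Build%2FIFL10C%29+"
    "AppleWebKit%2F534.30+%28KHTML%2C+like+Gecko%29+Version%2F4.0+Mobile+Safari%2F534.30"
)

# quote_plus escapes '/' too (its default safe set is empty) and maps spaces to '+'.
encoded = quote_plus(user_agent)
print(encoded == expected)  # True
```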

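Related discussion

Keep in mind that in Python 2, urllib.quote and urllib.quote_plus both raise an error if the input is a unicode string containing non-ASCII characters: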

s = u'\u2013'
urllib.quote(s)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\urllib.py", line 1303, in quote
    return ''.join(map(quoter, s))
KeyError: u'\u2013'

As answered elsewhere on SO, you have to encode the string as UTF-8 explicitly:

urllib.quote(s.encode('utf-8'))
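For comparison, here is a small sketch of the same character in Python 3, where urllib.parse.quote accepts str directly and encodes it as UTF-8 by default, so the explicit encode step is optional. The variable names are illustrative.

```python
from urllib.parse import quote, unquote

s = "\u2013"  # EN DASH, the same character as in the Python 2 example

# str input is encoded as UTF-8 before escaping (encoding defaults to 'utf-8').
encoded = quote(s)

# Passing bytes that are already UTF-8 encoded gives the same result.
encoded_from_bytes = quote(s.encode("utf-8"))

print(encoded)                  # %E2%80%93
print(unquote(encoded) == s)    # True
```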

Also, if you have multiple values to encode, the best approach is urllib.urlencode.
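As a rough illustration of encoding several values at once, the sketch below uses urlencode from Python 3's urllib.parse (the Python 2 equivalent is urllib.urlencode). The parameter names and values are hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical query parameters; a list of pairs keeps the order explicit.
params = [("q", "a b"), ("tags", "x&y")]

# Each key and value is escaped with quote_plus, then joined with '=' and '&'.
query = urlencode(params)

print(query)  # q=a+b&tags=x%26y
```

Because urlencode escapes each key and value separately, an "&" inside a value cannot be mistaken for a parameter separator.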