Setting up 'encoding' in Python's gzip.open() doesn't seem to work
Even when I specify an encoding in Python's gzip.open(), it always seems to use cp1252.py to decode the file's contents. My code:
    import gzip

    with gzip.open('file.gz', 'rt', 'cp1250') as f:
        content = f.read()
The result:
      File "C:\Python34\lib\encodings\cp1252.py", line 23, in decode
        return codecs.charmap_decode(input,self.errors,decoding_table)[0]
    UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 52893: character maps to <undefined>
Python 3.x
gzip.open(filename, mode='rb', compresslevel=9, encoding=None, errors=None, newline=None)
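Hence, in gzip.open('file.gz', 'rt', 'cp1250') the third positional argument, 'cp1250', is bound to compresslevel, and encoding keeps its default of None, so text mode falls back to the locale's default encoding (cp1252 here).
This is clearly wrong, because the intent was to use the 'cp1250' encoding. Pass the encoding either as the fourth positional argument or as a keyword argument: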
    gzip.open('file.gz', 'rt', 5, 'cp1250')          # 4th positional argument
    gzip.open('file.gz', 'rt', encoding='cp1250')    # keyword argument
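To check this on your own machine, here is a minimal sketch for Python 3.5 or later. It assumes a throwaway file named sample.gz in a temporary directory and a one-character sample string; neither comes from the question. It first uses inspect.signature to show where the question's three positional arguments end up, then writes cp1250-encoded text and reads it back with the keyword form.

```python
import gzip
import inspect
import os
import tempfile

# How the question's positional arguments line up with the signature.
bound = inspect.signature(gzip.open).bind('file.gz', 'rt', 'cp1250')
bound.apply_defaults()
print(bound.arguments['compresslevel'])  # prints cp1250
print(bound.arguments['encoding'])       # prints None

# Hypothetical sample data: U+0179 (LATIN CAPITAL LETTER Z WITH ACUTE)
# is byte 0x8f in cp1250, a byte that cp1252 leaves undefined.
sample_text = '\u0179'
path = os.path.join(tempfile.mkdtemp(), 'sample.gz')

# Write the text compressed and encoded as cp1250.
with gzip.open(path, 'wt', encoding='cp1250') as f:
    f.write(sample_text)

# Reading with the keyword argument decodes it correctly.
with gzip.open(path, 'rt', encoding='cp1250') as f:
    content = f.read()

# Reading the same bytes as cp1252 fails like the traceback in the question.
try:
    with gzip.open(path, 'rt', encoding='cp1252') as f:
        f.read()
    cp1252_failed = False
except UnicodeDecodeError:
    cp1252_failed = True

print(content == sample_text, cp1252_failed)
```

Passing encoding by keyword is probably the safer of the two forms, since it does not depend on remembering that compresslevel comes third.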
Python 2.x
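The Python 2 version of gzip.open() takes no encoding argument and does not accept text modes, so the data must be decoded explicitly after reading: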
    import gzip

    with gzip.open('file.gz', 'rb') as f:
        data = f.read()

    decoded_data = data.decode('cp1250')
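If the same script has to run under both Python 2.7 and Python 3, the read-then-decode pattern can be wrapped in a small helper. This is only a sketch: read_gzip_text is a hypothetical name, not part of the gzip module, and the demo file and its two bytes are made up for illustration.

```python
import gzip
import os
import tempfile


def read_gzip_text(path, encoding):
    """Hypothetical helper: read a gzip file as bytes, then decode explicitly."""
    with gzip.open(path, 'rb') as f:
        data = f.read()
    return data.decode(encoding)


# Demo with made-up data: two raw cp1250 bytes written in binary mode.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.gz')
with gzip.open(demo_path, 'wb') as f:
    f.write(b'\x8f\xf3')

text = read_gzip_text(demo_path, 'cp1250')
print(text == u'\u0179\u00f3')  # prints True
```

Unlike 'rt' mode, binary mode returns the bytes as stored, so line endings are not translated before decoding.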