can't upload > ~2GB to Google Cloud Storage
The traceback is below.
The relevant Python snippet:
    bucket = _get_bucket(location['bucket'])
    blob = bucket.blob(location['path'])
    blob.upload_from_filename(source_path)
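This eventually raises (from the SSL library):
OverflowError: string longer than 2147483647 bytes
I assume I'm missing some special configuration option?
This may be related to https://github.com/googledatalab/datalab/issues/784, an issue from ~1.5 years ago.
Thanks for your help!
Full traceback: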
  File "/usr/src/app/gcloud/download_data.py", line 109, in *******
    blob.upload_from_filename(source_path)
  File "/usr/local/lib/python3.5/dist-packages/google/cloud/storage/blob.py", line 992, in upload_from_filename
    size=total_bytes)
  File "/usr/local/lib/python3.5/dist-packages/google/cloud/storage/blob.py", line 946, in upload_from_file
    client, file_obj, content_type, size, num_retries)
  File "/usr/local/lib/python3.5/dist-packages/google/cloud/storage/blob.py", line 867, in _do_upload
    client, stream, content_type, size, num_retries)
  File "/usr/local/lib/python3.5/dist-packages/google/cloud/storage/blob.py", line 700, in _do_multipart_upload
    transport, data, object_metadata, content_type)
  File "/usr/local/lib/python3.5/dist-packages/google/resumable_media/requests/upload.py", line 97, in transmit
    retry_strategy=self._retry_strategy)
  File "/usr/local/lib/python3.5/dist-packages/google/resumable_media/requests/_helpers.py", line 101, in http_request
    func, RequestsMixin._get_status_code, retry_strategy)
  File "/usr/local/lib/python3.5/dist-packages/google/resumable_media/_helpers.py", line 146, in wait_and_retry
    response = func()
  File "/usr/local/lib/python3.5/dist-packages/google/auth/transport/requests.py", line 186, in request
    method, url, data=data, headers=request_headers, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/requests/sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.5/dist-packages/requests/sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/requests/adapters.py", line 440, in send
    timeout=timeout
  File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 357, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.5/http/client.py", line 1106, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python3.5/http/client.py", line 1151, in _send_request
    self.endheaders(body)
  File "/usr/lib/python3.5/http/client.py", line 1102, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python3.5/http/client.py", line 936, in _send_output
    self.send(message_body)
  File "/usr/lib/python3.5/http/client.py", line 908, in send
    self.sock.sendall(data)
  File "/usr/lib/python3.5/ssl.py", line 891, in sendall
    v = self.send(data[count:])
  File "/usr/lib/python3.5/ssl.py", line 861, in send
    return self._sslobj.write(data)
  File "/usr/lib/python3.5/ssl.py", line 586, in write
    return self._sslobj.write(data)
OverflowError: string longer than 2147483647 bytes
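The problem is that it tries to read the entire file into memory. The traceback shows upload_from_filename passing the file's full size (size=total_bytes) down to _do_multipart_upload, so the whole body is handed to the SSL socket in a single write.
Instead, specify a chunk_size when creating the blob object: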
    # Must be a multiple of 256KB per docstring
    CHUNK_SIZE = 10485760  # 10MB
    blob = bucket.blob(location['path'], chunk_size=CHUNK_SIZE)
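If you would rather not hard-code the number, here is a minimal sketch that rounds any desired size up to a multiple of 256 KB before passing it as chunk_size. The names aligned_chunk_size and upload_large_file are hypothetical helpers I made up for illustration, not part of google-cloud-storage, and the wrapper assumes bucket is a google.cloud.storage Bucket like the one returned by your _get_bucket. The constant 2147483647 from the error is 2**31 - 1, the largest signed 32-bit integer.

```python
KIB_256 = 256 * 1024          # chunk_size must be a multiple of this
SSL_WRITE_LIMIT = 2**31 - 1   # 2147483647, the limit quoted in the OverflowError


def aligned_chunk_size(desired_bytes):
    """Round desired_bytes up to the nearest multiple of 256 KB (hypothetical helper)."""
    if desired_bytes <= 0:
        raise ValueError("desired_bytes must be positive")
    units = -(-desired_bytes // KIB_256)  # ceiling division
    return units * KIB_256


def upload_large_file(bucket, dest_path, source_path,
                      desired_chunk=10 * 1024 * 1024):
    """Upload source_path with a chunk_size set on the blob (hypothetical wrapper).

    Assumes `bucket` is a google.cloud.storage Bucket.
    """
    blob = bucket.blob(dest_path, chunk_size=aligned_chunk_size(desired_chunk))
    blob.upload_from_filename(source_path)
    return blob
```

With the names from the question, this would be called as upload_large_file(bucket, location['path'], source_path). The default of 10 MB is already a multiple of 256 KB, so it passes through unchanged.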
Happy hacking!