
How can I stream big data to Google Cloud Storage?

I am working on a system that analyzes data of any size and format, which users stream to my private cloud built on Google Cloud Storage. Do you have any ideas on how I can let them stream big data? At the moment I use a Django API, and I upload like this:

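# `bucket` is assumed to be a google.cloud.storage Bucket created elsewhere in the module.
def upload_blob(source_file_name, destination_blob_name):
    # Upload a local file to the bucket under the given object name.
    blob = bucket.blob(destination_blob_name)
    blob.upload_from_filename(source_file_name)
    print('File {} uploaded to {}.'.format(
        source_file_name,
        destination_blob_name))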

It works correctly with small files; however, when I send, for example, a large movie, I get the error shown below. I am aware that this is not the optimal solution, but I have no idea how to solve this. As you can see, at the moment users send me requests in blob format, but this does not work with very large files. Do you have any ideas on how I can solve my problem and send user data of any size to Google Cloud Storage?

Internal Server Error: /cloud/
Traceback (most recent call last):
 File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 672, in urlopen
  chunked=chunked,
 File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 387, in _make_request
  conn.request(method, url, **httplib_request_kw)
 File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1252, in request
  self._send_request(method, url, body, headers, encode_chunked)
 File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1298, in _send_request
  self.endheaders(body, encode_chunked=encode_chunked)
 File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1247, in endheaders
  self._send_output(message_body, encode_chunked=encode_chunked)
 File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1065, in _send_output
  self.send(chunk)
 File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 987, in send
  self.sock.sendall(data)
 File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 1034, in sendall
  v = self.send(byte_view[count:])
 File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 1003, in send
  return self._sslobj.write(data)
socket.timeout: The write operation timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
 File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
  timeout=timeout
 File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 720, in urlopen
  method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
 File "/usr/local/lib/python3.7/site-packages/urllib3/util/retry.py", line 400, in increment
  raise six.reraise(type(error), error, _stacktrace)
 File "/usr/local/lib/python3.7/site-packages/urllib3/packages/six.py", line 734, in reraise
  raise value.with_traceback(tb)
 File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 672, in urlopen
  chunked=chunked,
 File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 387, in _make_request
  conn.request(method, url, **httplib_request_kw)
 File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1252, in request
  self._send_request(method, url, body, headers, encode_chunked)
 File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1298, in _send_request
  self.endheaders(body, encode_chunked=encode_chunked)
 File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1247, in endheaders
  self._send_output(message_body, encode_chunked=encode_chunked)
 File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1065, in _send_output
  self.send(chunk)
 File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 987, in send
  self.sock.sendall(data)
 File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 1034, in sendall
  v = self.send(byte_view[count:])
 File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 1003, in send
  return self._sslobj.write(data)
urllib3.exceptions.ProtocolError: ('Connection aborted.', timeout('The write operation timed out'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
 File "/usr/local/lib/python3.7/site-packages/django/core/handlers/exception.py", line 34, in inner
  response = get_response(request)
 File "/usr/local/lib/python3.7/site-packages/django/core/handlers/base.py", line 115, in _get_response
  response = self.process_exception_by_middleware(e, request)
 File "/usr/local/lib/python3.7/site-packages/django/core/handlers/base.py", line 113, in _get_response
  response = wrapped_callback(request, *callback_args, **callback_kwargs)
 File "/usr/local/lib/python3.7/site-packages/django/views/decorators/csrf.py", line 54, in wrapped_view
  return view_func(*args, **kwargs)
 File "/usr/local/lib/python3.7/site-packages/django/views/generic/base.py", line 71, in view
  return self.dispatch(request, *args, **kwargs)
 File "/usr/local/lib/python3.7/site-packages/rest_framework/views.py", line 505, in dispatch
  response = self.handle_exception(exc)
 File "/usr/local/lib/python3.7/site-packages/rest_framework/views.py", line 465, in handle_exception
  self.raise_uncaught_exception(exc)
 File "/usr/local/lib/python3.7/site-packages/rest_framework/views.py", line 476, in raise_uncaught_exception
  raise exc
 File "/usr/local/lib/python3.7/site-packages/rest_framework/views.py", line 502, in dispatch
  response = handler(request, *args, **kwargs)
 File "/mypath/backend/views.py", line 635, in post
  'user/' + str(user_name) + '/' + str(file))
 File "/mypath/backend/views.py", line 214, in upload_blob
  blob.upload_from_filename(source_file_name)
 File "/usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 1318, in upload_from_filename
  predefined_acl=predefined_acl,
 File "/usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 1263, in upload_from_file
  client, file_obj, content_type, size, num_retries, predefined_acl
 File "/usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 1173, in _do_upload
  client, stream, content_type, size, num_retries, predefined_acl
 File "/usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py", line 1120, in _do_resumable_upload
  response = upload.transmit_next_chunk(transport)
 File "/usr/local/lib/python3.7/site-packages/google/resumable_media/requests/upload.py", line 425, in transmit_next_chunk
  retry_strategy=self._retry_strategy,
 File "/usr/local/lib/python3.7/site-packages/google/resumable_media/requests/_helpers.py", line 136, in http_request
  return _helpers.wait_and_retry(func, RequestsMixin._get_status_code, retry_strategy)
 File "/usr/local/lib/python3.7/site-packages/google/resumable_media/_helpers.py", line 150, in wait_and_retry
  response = func()
 File "/usr/local/lib/python3.7/site-packages/google/auth/transport/requests.py", line 216, in request
  method, url, data=data, headers=request_headers, **kwargs
 File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 533, in request
  resp = self.send(prep, **send_kwargs)
 File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 646, in send
  r = adapter.send(request, **kwargs)
 File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 498, in send
  raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', timeout('The write operation timed out'))
[24/Mar/2020 19:17:26] "POST /cloud/ HTTP/1.1" 500 20879

Have a look at resumable uploads.

This option provides a resumable data transfer feature that lets you resume upload operations after a communication failure has interrupted the flow of data.

This is especially useful if you are transferring large files, because the likelihood of a network interruption or some other transmission failure is high. In case of a failure, you do not have to restart large file uploads from the beginning when using this option.
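
As a rough illustration of the idea, here is a minimal sketch of a chunked upload with the Python client. The traceback in the question shows the client already goes through `_do_resumable_upload`, so one plausible adjustment is to send smaller chunks and allow more time per request. The helper names `aligned_chunk_size` and `upload_large_file`, the 5 MiB chunk size, and the 300-second timeout are assumptions made for this example, not values from the question. It also assumes a release of google-cloud-storage in which `upload_from_filename` accepts a `timeout` argument.

```python
# Chunk sizes for the storage client must be multiples of 256 KiB.
CHUNK_UNIT = 256 * 1024


def aligned_chunk_size(requested_bytes):
    """Round a requested size down to a multiple of 256 KiB (at least one unit)."""
    units = max(1, requested_bytes // CHUNK_UNIT)
    return units * CHUNK_UNIT


def upload_large_file(bucket, source_file_name, destination_blob_name,
                      chunk_bytes=5 * 1024 * 1024, timeout=300):
    """Upload a local file using an explicit chunk size (hypothetical helper).

    `bucket` is expected to be a google.cloud.storage Bucket object.
    """
    # Setting chunk_size caps how many bytes each resumable-upload request
    # carries, so a slow connection has less data to write per request.
    blob = bucket.blob(destination_blob_name,
                       chunk_size=aligned_chunk_size(chunk_bytes))
    # timeout is the per-request limit in seconds (assumed to be supported
    # by the installed client version).
    blob.upload_from_filename(source_file_name, timeout=timeout)
    return blob
```

Here `bucket` would be the same object the question's `upload_blob` uses, for example one obtained from `storage.Client().bucket(...)`. Whether a smaller chunk or a longer timeout resolves the write timeout depends on the connection, so both values are worth tuning.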

