
Upload large csv file to cloud storage using Python

Hi, I am trying to upload a large csv file, but I am getting the error below:

HTTPSConnectionPool(host='storage.googleapis.com', port=443): Max retries exceeded with url: /upload/storage/v1/b/de-bucket-my-stg/o?uploadType=resumable&upload_id=ADPycdsyu6gSlyfklixvDgL7RLpAQAg6REm9j1ICarKvmdif3tASOl9MaqjQIZ5dHWpTeWqs2HCsL4hoqfrtVQAH1WpfYrp4sFRn (Caused by SSLError(SSLWantWriteError(3, 'The operation did not complete (write) (_ssl.c:2396)')))

Can someone help me with this?

Below is my code for it:

    import os
    import io
    import requests
    import pandas as pd
    from google.cloud import storage

    try:
        # Download the source CSV and parse it into a DataFrame
        url = "https://cb-test-dataset.s3.ap-south-1.amazonaws.com/analytics/analytics.csv"
        cont = requests.get(url).content
        file_to_upload = pd.read_csv(io.StringIO(cont.decode('utf-8')))
    except Exception as e:
        print('Error getting file: ' + str(e))

    try:
        # xxx is replaced here.
        os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'C:/Users/haris/Desktop/de-project/xxx.json'
        storage_client = storage.Client()
        bucket = storage_client.get_bucket('de-bucket-my-stg')
        blob = bucket.blob('analytics.csv')
        blob.upload_from_string(file_to_upload.to_csv(), 'text/csv')
    except Exception as e:
        print('Error uploading file: ' + str(e))

As mentioned in the documentation,

My recommendation is to gzip your file before sending it. Text files have a high compression rate (up to 100 times), and you can ingest gzip files directly into BigQuery without unzipping them.
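In that spirit, a minimal sketch of gzipping the CSV in memory before upload. This is not the OP's exact code; the function names are illustrative, and the bucket/object names passed in would be the caller's own:

```python
import gzip

def gzip_csv(data: bytes) -> bytes:
    """Compress CSV bytes; repetitive text typically shrinks dramatically."""
    return gzip.compress(data)

def upload_gzipped_csv(csv_bytes: bytes, bucket_name: str, blob_name: str) -> None:
    # Imported lazily so the compression helper works without the GCS SDK installed.
    from google.cloud import storage
    client = storage.Client()
    blob = client.bucket(bucket_name).blob(blob_name)
    # BigQuery can load gzip-compressed CSVs directly, so no server-side unzip is needed.
    blob.upload_from_string(gzip_csv(csv_bytes), content_type="application/gzip")
```

Uploading the compressed bytes also shortens the time each chunk of the resumable upload spends on the wire, which reduces the window for connection errors like the one above.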

The fastest method of uploading to Cloud Storage is to use the compose API and composite objects.
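As a sketch of that approach (the helper names here are illustrative, and it assumes the chunks were already uploaded as separate objects, e.g. in parallel): split the file into byte ranges, upload each range as its own object, then merge them server-side.

```python
from typing import List, Tuple

def chunk_ranges(total_size: int, n_parts: int) -> List[Tuple[int, int]]:
    # Split [0, total_size) into contiguous byte ranges to upload in parallel.
    step = -(-total_size // n_parts)  # ceiling division
    return [(start, min(start + step, total_size))
            for start in range(0, total_size, step)]

def compose_parts(bucket_name: str, part_names: List[str], destination_name: str) -> None:
    # Server-side concatenation: each compose call merges up to 32 source objects.
    from google.cloud import storage  # imported lazily; requires google-cloud-storage
    bucket = storage.Client().bucket(bucket_name)
    destination = bucket.blob(destination_name)
    destination.compose([bucket.blob(name) for name in part_names])
```

Because `compose` concatenates objects on the server, no data is re-downloaded or re-uploaded during the merge; only the small per-part uploads cross the network.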

For more information, you can refer to the stackoverflow thread where the OP is facing a similar error.
