How to upload folder from local to GCP bucket using python

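I am following this link and getting an error:

How to upload a folder on Google Cloud Storage using the Python API

I have saved my model in a container environment, and I want to copy it from there to a GCP bucket.

Here is my code: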

import glob
import os

from google.cloud import storage

storage_client = storage.Client(project='*****')


def upload_local_directory_to_gcs(local_path, bucket, gcs_path):
    bucket = storage_client.bucket(bucket)

    assert os.path.isdir(local_path)
    for local_file in glob.glob(local_path + '/**'):
        print(local_file)
        print("this is bucket", bucket)
        blob = bucket.blob(gcs_path)
        print("here")
        blob.upload_from_filename(local_file)
        print("done")


# this is local absolute path where my folder is. Folder name is **model_mlm_demo**
path = "/pythonPackage/trainer/model_mlm_demo"
buc = "py*****"  # this is my GCP bucket address
gcs = "model_mlm_demo2/"  # this is the new folder that I want to store files in GCP

upload_local_directory_to_gcs(local_path=path, bucket=buc, gcs_path=gcs)

/pythonPackage/trainer/model_mlm_demo has 3 files in it: config, model.bin and arguments.bin

Error

The code does not raise any error, but no files are uploaded to the GCP bucket. It only creates an empty folder.


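One error I can see is that you do not need to pass gs:// as part of the bucket parameter. Here is the example from the documentation that you may want to look at: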

https://cloud.google.com/storage/docs/uploading-objects#storage-upload-object-python

from google.cloud import storage


def upload_blob(bucket_name, source_file_name, destination_blob_name):
    """Uploads a file to the bucket."""
    # The ID of your GCS bucket
    # bucket_name = "your-bucket-name"
    # The path to your file to upload
    # source_file_name = "local/path/to/file"
    # The ID of your GCS object
    # destination_blob_name = "storage-object-name"

    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)

    blob.upload_from_filename(source_file_name)

    print(
        "File {} uploaded to {}.".format(
            source_file_name, destination_blob_name
        )
    )
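
On the gs:// point above: storage_client.bucket() expects the bare bucket name, not a URL. If the bucket value may arrive in either form, a small helper along these lines could normalize it first. This is only a sketch; normalize_bucket_name is a hypothetical name, not part of google-cloud-storage, and "my-bucket" is a placeholder.

```python
def normalize_bucket_name(bucket: str) -> str:
    """Return a bare bucket name from 'my-bucket' or 'gs://my-bucket/...'."""
    if bucket.startswith("gs://"):
        bucket = bucket[len("gs://"):]
    # Drop anything after the bucket name (a trailing slash or an object path).
    return bucket.split("/", 1)[0]
```

The result can then be passed as bucket_name to upload_blob.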

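I have reproduced your issue, and the code snippet below works. I have updated the code to use the folders and names you mentioned in the question. Let me know if you have any problems.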

import glob
import os

from google.cloud import storage

storage_client = storage.Client(project='')


def upload_local_directory_to_gcs(local_path, bucket, gcs_path):
    bucket = storage_client.bucket(bucket)

    assert os.path.isdir(local_path)
    # Without recursive=True, '/**' matches only the direct children of local_path.
    for local_file in glob.glob(local_path + '/**'):
        if not os.path.isfile(local_file):
            continue  # skip subdirectories; only files can be uploaded

        print(local_file)
        print("this is bucket", bucket)
        # Give each file its own object name under the target prefix.
        filename = os.path.basename(local_file)
        blob = bucket.blob(gcs_path + filename)
        print("here")
        blob.upload_from_filename(local_file)
        print("done")


# this is local absolute path where my folder is. Folder name is **model_mlm_demo**
path = "/pythonPackage/trainer/model_mlm_demo"
buc = "py*****"  # this is my GCP bucket address
gcs = "model_mlm_demo2/"  # this is the new folder that I want to store files in GCP

upload_local_directory_to_gcs(local_path=path, bucket=buc, gcs_path=gcs)
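
The difference from the code in the question is the blob name. The question passes the same gcs_path ("model_mlm_demo2/") for every file, so each upload goes to one object whose name ends in a slash, which is probably what shows up as an empty folder. If the local folder ever contains subdirectories, a variant that keeps the relative paths may be useful. The following is a minimal sketch under that assumption; iter_upload_pairs and upload_directory are hypothetical names, and the blob-name logic is kept separate so it can be checked without touching a bucket.

```python
from pathlib import Path


def iter_upload_pairs(local_path, gcs_prefix):
    """Yield (local_file, blob_name) for every regular file under local_path."""
    root = Path(local_path)
    prefix = gcs_prefix.strip("/")
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue  # directories are not uploaded as objects
        # Object names always use forward slashes, whatever the local OS uses.
        relative = path.relative_to(root).as_posix()
        blob_name = f"{prefix}/{relative}" if prefix else relative
        yield str(path), blob_name


def upload_directory(local_path, bucket_name, gcs_prefix):
    """Upload every file under local_path, preserving the relative layout."""
    # Imported here so the helper above works without the library installed.
    from google.cloud import storage

    bucket = storage.Client().bucket(bucket_name)
    for local_file, blob_name in iter_upload_pairs(local_path, gcs_prefix):
        bucket.blob(blob_name).upload_from_filename(local_file)
```

With the values from the question, the call would be upload_directory(path, buc, gcs).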

I figured out a way to upload the model artifacts to the GCP bucket using subprocess.

import subprocess

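# stdout=subprocess.PIPE is left out: subprocess.call never reads the pipe,
# so the child process can block once the pipe buffer fills up.
subprocess.call('gsutil cp -r source_folder_in_local gs://*****/folder_name', shell=True)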

If gsutil is not installed, you can install it using this link:

https://cloud.google.com/storage/docs/gsutil_install
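
As a variation on the subprocess approach, the command can be passed as an argument list instead of a shell string, which avoids quoting problems when a path contains spaces, and check=True makes a failed copy raise instead of passing silently. This is a sketch only; build_gsutil_cp_command and upload_with_gsutil are hypothetical helper names, and it assumes gsutil is on the PATH.

```python
import subprocess


def build_gsutil_cp_command(source_dir, destination_url):
    """Return the argument list for a recursive gsutil copy."""
    return ["gsutil", "cp", "-r", source_dir, destination_url]


def upload_with_gsutil(source_dir, destination_url):
    # check=True raises CalledProcessError if gsutil exits with a non-zero status.
    subprocess.run(build_gsutil_cp_command(source_dir, destination_url), check=True)
```

Here destination_url is a full gs:// URL, as in the command above.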

I just came across the gcsfs library, which also seems to offer a nicer interface.

You can copy an entire directory to a GCS location like this:


import gcsfs


def upload_to_gcs(src_dir: str, gcs_dst: str):
    fs = gcsfs.GCSFileSystem()
    # Copy src_dir and everything under it to the gcs_dst location.
    fs.put(src_dir, gcs_dst, recursive=True)
