How to move files in Google Cloud Storage from one bucket to another bucket with Python
Is there an API function that lets us move files in Google Cloud Storage from one bucket to another bucket?
The scenario is that we want Python to move files that have been read from bucket A to bucket B. I know that gsutil can do this, but I'm not sure whether Python supports it.
Thanks.
Here's a function I use when moving blobs between directories within the same bucket or to a different bucket.
from google.cloud import storage
import os

# Point the client library at a service-account key file.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path_to_your_creds.json"


def mv_blob(bucket_name, blob_name, new_bucket_name, new_blob_name):
    """
    Function for moving files between directories or buckets. It uses GCP's copy
    function, then deletes the blob from the old location.

    inputs
    -----
    bucket_name: name of bucket
    blob_name: str, name of file
        ex. 'data/some_location/file_name'
    new_bucket_name: name of bucket (can be same as original if we're just moving around directories)
    new_blob_name: str, name of file in new directory in target bucket
        ex. 'data/destination/file_name'
    """
    storage_client = storage.Client()
    source_bucket = storage_client.get_bucket(bucket_name)
    source_blob = source_bucket.blob(blob_name)
    destination_bucket = storage_client.get_bucket(new_bucket_name)

    # copy to new destination
    new_blob = source_bucket.copy_blob(
        source_blob, destination_bucket, new_blob_name)
    # delete in old destination
    source_blob.delete()

    print(f'File moved from {source_blob} to {new_blob_name}')
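The same copy-then-delete pattern can be applied to every blob under a "directory" prefix. The following is a minimal sketch under stated assumptions: `mv_prefix` is a hypothetical helper name, not part of the library, and it takes the two bucket objects as arguments (for example from `storage.Client().get_bucket(...)`) instead of creating a client itself.

```python
def mv_prefix(source_bucket, destination_bucket, prefix, new_prefix):
    """Move every blob whose name starts with `prefix`.

    Hypothetical helper: both arguments are assumed to be
    google.cloud.storage Bucket objects. Each blob is renamed so that
    `prefix` at the start of its name is replaced by `new_prefix`.
    Returns the list of new blob names.
    """
    moved = []
    # Materialize the listing first so deletions do not affect iteration.
    for source_blob in list(source_bucket.list_blobs(prefix=prefix)):
        new_name = new_prefix + source_blob.name[len(prefix):]
        source_bucket.copy_blob(source_blob, destination_bucket, new_name)
        # Delete only after the copy call has returned without raising.
        source_blob.delete()
        moved.append(new_name)
    return moved
```

Something like `mv_prefix(bucket_a, bucket_b, "data/some_location/", "data/destination/")` would then move the whole directory. Passing the same bucket twice moves blobs around within one bucket.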
Using the google-api-python-client, there is an example on the storage.objects.copy page. After you copy, you can delete the source with storage.objects.delete.
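import json

# `client` is an authorized Cloud Storage service object; bucket1, old_object,
# bucket2 and new_object are the source and destination names.
destination_object_resource = {}
req = client.objects().copy(
    sourceBucket=bucket1,
    sourceObject=old_object,
    destinationBucket=bucket2,
    destinationObject=new_object,
    body=destination_object_resource)
resp = req.execute()
print(json.dumps(resp, indent=2))

# Delete the source only after the copy has succeeded.
client.objects().delete(
    bucket=bucket1,
    object=old_object).execute()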
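The two calls above can be wrapped into a single move function. This is only a sketch: `move_object` is a hypothetical name, and `client` is assumed to be a service object built with google-api-python-client (for instance via `discovery.build("storage", "v1")`, shown in the docstring), with credentials already configured.

```python
def move_object(client, bucket1, old_object, bucket2, new_object):
    """Copy old_object from bucket1 to bucket2 as new_object, then delete the source.

    `client` is assumed to be a Cloud Storage service object from
    google-api-python-client, for example:

        from googleapiclient import discovery
        client = discovery.build("storage", "v1")
    """
    resp = client.objects().copy(
        sourceBucket=bucket1,
        sourceObject=old_object,
        destinationBucket=bucket2,
        destinationObject=new_object,
        body={},
    ).execute()
    # execute() raises on an HTTP error, so the delete below only runs
    # when the copy request succeeded.
    client.objects().delete(bucket=bucket1, object=old_object).execute()
    return resp
```

Because the delete runs only after `execute()` returns for the copy, a failed copy leaves the source object in place.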
You can use the GCS client library functions documented at [1] to read from one bucket and write to the other, and then delete the source file.
You can even use the GCS REST API documented at [2].
Links:
[1] - https://developers.google.com/appengine/docs/python/googlecloudstorageclient/functions
[2] - https://developers.google.com/storage/docs/concepts-techniques#overview
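The read, write, then delete sequence could look roughly like the sketch below. Note the assumptions: it uses the google-cloud-storage package rather than the App Engine client library linked at [1], `move_by_read_write` is a hypothetical name, and both arguments are assumed to be Bucket objects from that package. Unlike a server-side copy, this pulls the whole object through the client, so it only suits files that fit in memory.

```python
def move_by_read_write(source_bucket, destination_bucket, blob_name):
    """Read a blob, write its bytes to another bucket, then delete the source.

    Hypothetical helper, assuming google-cloud-storage Bucket objects.
    The whole object is held in memory while it is transferred.
    """
    # get_blob() returns None when the object does not exist.
    source_blob = source_bucket.get_blob(blob_name)
    if source_blob is None:
        raise FileNotFoundError(blob_name)

    data = source_blob.download_as_bytes()

    destination_blob = destination_bucket.blob(blob_name)
    destination_blob.upload_from_string(
        data, content_type=source_blob.content_type)

    # Remove the source only after the upload has returned without raising.
    source_blob.delete()
    return destination_blob
```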
from google.cloud import storage

storage_client = storage.Client()


def GCP_BUCKET_A_TO_B():
    source_bucket = storage_client.get_bucket("Bucket_A_Name")
    destination_bucket = storage_client.get_bucket("Bucket_B_Name")
    # An empty prefix lists every blob in the source bucket.
    for source_blob in list(source_bucket.list_blobs(prefix="")):
        # Copy under the same name, then delete the source so that this
        # is a move and not just a copy.
        source_bucket.copy_blob(
            source_blob, destination_bucket, source_blob.name)
        source_blob.delete()
I just wanted to point out that there's another possible approach, and that is using gsutil through the subprocess module.
The advantages of using gsutil like that:
The disadvantages:
Example:
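import subprocess


def move(source_uri: str,
         destination_uri: str) -> None:
    """
    Move file from source_uri to destination_uri.

    :param source_uri: gs://-like URI of the source file/directory
    :param destination_uri: gs://-like URI of the destination file/directory
    :return: None
    """
    # Pass the command as an argument list: a single string is only split
    # into arguments when shell=True, and a list avoids quoting problems.
    cmd = ["gsutil", "-m", "mv", source_uri, destination_uri]
    # check=True raises CalledProcessError if gsutil exits with an error.
    subprocess.run(cmd, check=True)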
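Building the command separately from running it makes the approach easier to check without touching any bucket. The sketch below is an assumption-laden variant of the function above: `build_gsutil_mv` and `move_checked` are hypothetical names, and rejecting anything that does not start with `gs://` is a restriction added here for illustration (gsutil itself also accepts local paths).

```python
import subprocess


def build_gsutil_mv(source_uri, destination_uri):
    """Return the argument list for `gsutil -m mv`, without running it.

    Hypothetical helper: the gs:// check is an added restriction for this
    example, not something gsutil itself requires.
    """
    for uri in (source_uri, destination_uri):
        if not uri.startswith("gs://"):
            raise ValueError(f"expected a gs:// URI, got {uri!r}")
    return ["gsutil", "-m", "mv", source_uri, destination_uri]


def move_checked(source_uri, destination_uri):
    """Run the move and return gsutil's standard output as text."""
    completed = subprocess.run(
        build_gsutil_mv(source_uri, destination_uri),
        check=True,           # raise CalledProcessError on a non-zero exit
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,
    )
    return completed.stdout
```

`move_checked` requires gsutil to be installed and on the PATH; `build_gsutil_mv` can be called on its own to inspect the command first.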