I would like to read and write files in a Google Cloud Storage bucket with Python.
Suppose I have a folder at gs://my_project/data.
How do I list the folders and files in that folder?
How do I read and write files?
There are several ways to perform these operations. The most common one is to use the official Google Cloud Storage client library for Python (the google-cloud-storage package).
Step 0 for using this library is to set up authentication to GCP: create a service account, download its JSON credentials, and set an environment variable pointing to that file:
export GOOGLE_APPLICATION_CREDENTIALS="[PATH-TO-JSON-CREDS]"
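Before creating a client, it can help to confirm that the variable is actually visible to the Python process. The following is a minimal sketch and not part of the client library: `check_credentials_env` is a hypothetical helper name, and it only checks that the variable is set and points to a parseable JSON file, not that the credentials are valid.

```python
import json
import os


def check_credentials_env(var_name="GOOGLE_APPLICATION_CREDENTIALS"):
    """Return the credentials path if the variable points to a readable JSON file.

    Hypothetical helper for illustration; raises RuntimeError otherwise.
    """
    path = os.environ.get(var_name)
    if not path:
        raise RuntimeError(f"{var_name} is not set")
    if not os.path.isfile(path):
        raise RuntimeError(f"{var_name} points to a missing file: {path}")
    with open(path, "r", encoding="utf-8") as f:
        json.load(f)  # raises ValueError if the file is not valid JSON
    return path
```

Calling `check_credentials_env()` at the start of a script gives a clearer message than the permission error that may otherwise surface later.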
In GCS, there is no notion of a "directory" or "folder". There are only buckets and blobs (objects). Nevertheless, the / characters in blob names can be used to emulate a folder-like hierarchy.
To list blobs from gs://my_project/data:
from google.cloud import storage
client = storage.Client()
bucket = client.bucket('my_project')
blobs = list(bucket.list_blobs(prefix='data/'))
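Because folders are only a naming convention, the "subfolders" and "files" directly under data/ can be derived from the blob names themselves. The sketch below is plain Python and makes no API calls. `split_listing` is a hypothetical helper name, and it assumes you pass it the names from the listing above, for example `[b.name for b in blobs]`. The sample names are made up for illustration. The library's `list_blobs` also accepts a `delimiter='/'` argument that performs a similar grouping on the server side.

```python
def split_listing(names, prefix="data/", delimiter="/"):
    """Split blob names under `prefix` into immediate subfolders and files."""
    folders, files = set(), []
    for name in names:
        if not name.startswith(prefix) or name == prefix:
            # Skip other prefixes and the placeholder object for the folder itself.
            continue
        rest = name[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # A delimiter follows `head`, so `head` acts as a subfolder.
            folders.add(prefix + head + delimiter)
        else:
            files.append(name)
    return sorted(folders), sorted(files)


# Made-up blob names for illustration.
names = ["data/", "data/a.csv", "data/raw/b.csv", "data/raw/c.csv", "other/d.csv"]
folders, files = split_listing(names)
print(folders)  # ['data/raw/']
print(files)    # ['data/a.csv']
```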
To read from the first blob listed in gs://my_project/data:
target_blob = blobs[0]
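# Returns bytes, not str. Newer versions of the library deprecate this
# method in favor of download_as_bytes() and download_as_text().
read_output = target_blob.download_as_string()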
To write to a new blob, one option is to write to a local file and then upload that file:
target_blob = bucket.blob('new_blob.txt')
local_tmp_path = 'tmp.txt'
# write the string to a local temporary file
with open(local_tmp_path, 'w') as f:
    f.write('Hello World')
# upload_from_file expects a file handle opened in binary mode
with open(local_tmp_path, 'rb') as f:
    target_blob.upload_from_file(f)
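The client library's blob objects also expose `upload_from_string`, and recent versions expose `download_as_text`, so a temporary file is not strictly required. A minimal sketch, assuming `bucket` is the object created earlier with `client.bucket('my_project')`; the `write_text` and `read_text` wrapper names are hypothetical:

```python
def write_text(bucket, blob_name, text):
    """Upload `text` as the contents of `blob_name` without a local file."""
    blob = bucket.blob(blob_name)
    blob.upload_from_string(text, content_type="text/plain")
    return blob


def read_text(bucket, blob_name):
    """Download the contents of `blob_name` and decode them to str."""
    return bucket.blob(blob_name).download_as_text()


# Usage, assuming `bucket` was created as shown earlier:
# write_text(bucket, "data/new_blob.txt", "Hello World")
# print(read_text(bucket, "data/new_blob.txt"))
```

The wrappers only rely on `bucket.blob(...)`, so they can be exercised in unit tests with a stand-in bucket object instead of a live GCS connection.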
To list and read files, the code that @syltruong suggested did not work for me because of a permission error. I had to change it to load the service account credentials explicitly:
storage_client = storage.Client.from_service_account_json('./path_to_json')
bucket = storage_client.bucket(bucketname)
blobs = list(bucket.list_blobs(prefix='data/'))
which worked fine.