
How to consume blob file?

I am uploading xlsx files to Google Cloud Storage. When I want to reuse them later, what I get back is a Blob object.

After that I am at a loss as to how to get at the actual xlsx file.

from google.cloud import storage

from openpyxl import load_workbook

client = storage.Client()
new_bucket = client.get_bucket('bucket.appspot.com')

#get blob object:
o = new_bucket.get_blob('old_version.xlsx')

# <Blob: blobstorage.appspot.com, old_version.xlsx, 16372393787851916>

#download the object

bytes_version = o.download_as_bytes()

#load it with the openpyxl library (this call raises the exception below)
wb = load_workbook(filename=bytes_version, data_only=True)

InvalidFileException: openpyxl does not support b'.xmlpk\x05\x06\x00\x00\x00\x00:\x00:\x00n\x10\x00\x00\xa6\x06\x01\x00\x00\x00' file format, please check you can open it with Excel first. Supported formats are: .xlsx,.xlsm,.xltx,.xltm

The end goal is to download the file as an object and read it with the openpyxl library. This works with the original file, but after storing it in the cloud I have not found a way to get my xlsx file back.

Thanks for the help!

edit: adding current code

It should be as simple as (assuming Python3):

import io  # Python3
wb = load_workbook(io.BytesIO(bytes_version))
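
For a fuller picture, here is a minimal, self-contained sketch of the same idea. The helper names workbook_from_bytes and blob_to_workbook are made up for illustration. blob is assumed to be a google.cloud.storage Blob such as the o object from the question, and the sample workbook is built in memory only so that the snippet runs without a bucket.

```python
import io

from openpyxl import Workbook, load_workbook


def workbook_from_bytes(data, data_only=True):
    """Open .xlsx content held in memory (hypothetical helper)."""
    # io.BytesIO gives the raw bytes a read()/seek() interface, which is
    # the kind of file-like object that load_workbook() accepts.
    return load_workbook(io.BytesIO(data), data_only=data_only)


def blob_to_workbook(blob, data_only=True):
    """Download a Cloud Storage blob and open it as a workbook (hypothetical helper)."""
    # Assumes blob exposes download_as_bytes(), as in the question's code.
    return workbook_from_bytes(blob.download_as_bytes(), data_only=data_only)


if __name__ == "__main__":
    # Stand-in for o.download_as_bytes(): build a small workbook in memory.
    sample = Workbook()
    sample.active["A1"] = "hello"
    buffer = io.BytesIO()
    sample.save(buffer)

    wb = workbook_from_bytes(buffer.getvalue())
    print(wb.active["A1"].value)
```

With the question's code this would be called as wb = blob_to_workbook(o). Nothing is written to disk, which may be convenient in environments with a read-only or small file system.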

Your code is reading the Cloud Storage blob into memory:

bytes_version = o.download_as_bytes()

And then trying to load the workbook from memory:

wb = load_workbook(filename = bytes_version ,data_only=True)

However, load_workbook() expects a filename or a file-like object. Passing a byte string that holds the file contents is not supported.

openpyxl.reader.excel.load_workbook(filename, read_only=False, keep_vba=False, data_only=False, keep_links=True)

Parameters:

 filename (string or a file-like object open in binary mode c.f., zipfile.ZipFile) – the path to open or a file-like object

Documentation
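
The reference to zipfile.ZipFile in the parameter description can be illustrated with the standard library alone. This is only a sketch: the function name looks_like_ooxml_package is hypothetical, and the check is a rough sanity test (a ZIP archive containing a [Content_Types].xml entry), not a full validation of an xlsx file.

```python
import io
import zipfile


def looks_like_ooxml_package(data):
    """Rough check that bytes hold a ZIP-based Office file (hypothetical helper)."""
    # Raw bytes are neither a path nor a file-like object; wrapping them
    # in io.BytesIO provides the binary file interface ZipFile expects.
    stream = io.BytesIO(data)
    if not zipfile.is_zipfile(stream):
        return False
    with zipfile.ZipFile(stream) as archive:
        # Office Open XML packages carry a [Content_Types].xml entry.
        return "[Content_Types].xml" in archive.namelist()
```

Running this on the downloaded bytes_version before calling load_workbook() is one way to tell a corrupted upload apart from the "bytes instead of file-like object" mistake.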

Solution:

Save the Cloud Storage blob to a local file first, then pass that file name to load_workbook():

blob.download_to_filename('/path/to/file')
wb = load_workbook(filename = '/path/to/file' ,data_only=True)

Note: Replace /path/to/file with a real path on your system, and give it the .xlsx file extension, because openpyxl checks the extension when it is given a path.
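
If a permanent local copy is not needed, the download can go to a temporary directory instead. The sketch below assumes blob is a google.cloud.storage Blob; the helper name blob_to_workbook_via_file and the file name downloaded.xlsx are made up for illustration.

```python
import os
import tempfile

from openpyxl import load_workbook


def blob_to_workbook_via_file(blob, data_only=True):
    """Download a blob to a temporary .xlsx file and open it (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Give the path a supported extension such as .xlsx.
        path = os.path.join(tmp_dir, "downloaded.xlsx")
        blob.download_to_filename(path)
        # With the default read_only=False the workbook is read in full
        # here, so the temporary file can be deleted on exit.
        return load_workbook(filename=path, data_only=data_only)
```

This keeps the file-based approach while cleaning up after itself. It should not be combined with read_only=True, since a read-only workbook may still need the file after the temporary directory is gone.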
