Saving Pillow Images from PDF to Google Cloud Server

I am working on a Django web app that takes in PDF files and performs some image processing on each page. Given a PDF, I need to save each page to my Google Cloud Storage bucket. I am using pdf2image's convert_from_path() to generate a list of Pillow images, one per page of the PDF. Now I want to save these images to Google Cloud Storage, but I can't figure out how.

I have successfully saved these Pillow images locally, but I do not know how to do this in the cloud.

fullURL = file.pdf.url
client = storage.Client()
bucket = client.get_bucket('name-of-my-bucket')
blob = bucket.blob(file.pdf.name[:-4] + '/')
blob.upload_from_string('', content_type='application/x-www-form-urlencoded;charset=UTF-8')
pages = convert_from_path(fullURL, 400)
for i,page in enumerate(pages):
    blob = bucket.blob(file.pdf.name[:-4] + '/' + str(i) + '.jpg')
    blob.upload_from_string('', content_type='image/jpeg')
    outfile = file.pdf.name[:-4] + '/' + str(i) + '.jpg'
    page.save(outfile)
    of = open(outfile, 'rb')
    blob.upload_from_file(of)

Since you have already saved the files locally, they are available in the local directory where the web app is running.

You can simply iterate through the files in that directory and upload them to Google Cloud Storage one by one.

Here is some sample code.

You will need this library:

google-cloud-storage

Python code:

# Libraries
import os
from google.cloud import storage

# Module-level settings
bucket_name = "[BUCKET_NAME]"
local_directory = "local/directory/of/the/files/for/uploading/"
bucket_directory = "uploaded/files/"  # Where the files will be uploaded in the bucket

# Upload one file from a local path to a destination object name
def upload_blob(source_file_name, destination_blob_name):
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)

    blob.upload_from_filename(source_file_name)

# Iterate through all files in that directory and upload them one by one, keeping the same filename
def upload_files():
    for filename in os.listdir(local_directory):
        source_path = os.path.join(local_directory, filename)
        # os.listdir also returns subdirectories, which cannot be uploaded as files
        if os.path.isfile(source_path):
            upload_blob(source_path, bucket_directory + filename)
    return "Files uploaded!"

# Call this function in your code:
upload_files()

NOTE: I tested this code in a Google App Engine web app and it worked for me. Take the idea of how it works and adapt it to your needs. I hope that helps.
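
If writing each page to local disk first is not desirable, a Pillow image can also be encoded into an in-memory buffer and uploaded from there. The following is a minimal sketch under stated assumptions, not a tested drop-in: the helper names page_to_jpeg_bytes and upload_pages are hypothetical, bucket is assumed to be a google-cloud-storage Bucket, and the pages are assumed to be Pillow images in a mode that JPEG supports (such as RGB).

```python
import io


def page_to_jpeg_bytes(page):
    """Encode a Pillow image as JPEG bytes without touching the disk."""
    buffer = io.BytesIO()
    # Pillow cannot infer the format from a buffer, so it is named explicitly.
    page.save(buffer, format="JPEG")
    return buffer.getvalue()


def upload_pages(bucket, pages, prefix):
    """Upload each page as <prefix>/<index>.jpg and return the object names."""
    names = []
    for index, page in enumerate(pages):
        name = f"{prefix}/{index}.jpg"
        blob = bucket.blob(name)
        blob.upload_from_string(page_to_jpeg_bytes(page), content_type="image/jpeg")
        names.append(name)
    return names


# Example wiring (assumes google-cloud-storage and pdf2image are installed
# and that "local.pdf" is a path on disk, not a URL):
# from google.cloud import storage
# from pdf2image import convert_from_path
# bucket = storage.Client().bucket("name-of-my-bucket")
# upload_pages(bucket, convert_from_path("local.pdf", 400), "my-pdf")
```

Passing the bucket in as an argument keeps the helpers independent of how the client is created, so the same client can be reused for every page.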

Start off by not using Blobstore. It is being phased out, and people are being pushed toward Cloud Storage instead. First, set up Cloud Storage:

https://cloud.google.com/appengine/docs/standard/python/googlecloudstorageclient/setting-up-cloud-storage

I use webapp2 rather than Django, but I'm sure you can adapt this. I also don't use Pillow images, so you'll have to open the image that you're going to upload yourself. Then do something like this (this assumes you're trying to POST the data):

    import os  # needed for os.environ in the post handler
    import cloudstorage as gcs
    import io
    import StringIO
    from google.appengine.api import app_identity

Define this as its own method in the handler class, before get and post:

    def create_file(self, filename, Dacontents):
        write_retry_params = gcs.RetryParams(backoff_factor=1.1)
        gcs_file = gcs.open(filename,
                            'w',
                            content_type='image/jpeg',
                            options={'x-goog-meta-foo': 'foo',
                                     'x-goog-meta-bar': 'bar'},
                            retry_params=write_retry_params)
        gcs_file.write(Dacontents)
        gcs_file.close()

In get, in your HTML:

    <form action="/(whatever your URL is)" method="post" enctype="multipart/form-data">
        <input type="file" name="orders"/>
        <input type="submit"/>
    </form>

In post:

    orders = self.request.POST.get('orders')  # this is for webapp2

    bucket_name = os.environ.get('BUCKET_NAME', app_identity.get_default_gcs_bucket_name())
    bucket = '/' + bucket_name
    OpenOrders = orders.file.read()
    if OpenOrders:
        filename = bucket + '/whateverYouWantToCallIt'
        self.create_file(filename, OpenOrders)
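
For readers on Python 3 with the google-cloud-storage client rather than the App Engine cloudstorage library used above, the same "read the posted file, then write it to the bucket" step might look roughly like this. It is only a sketch: save_uploaded_file is a hypothetical helper name, bucket is assumed to be a google-cloud-storage Bucket, and file_obj is assumed to be any object with a read() method that returns bytes (for example, an uploaded file in a Django request).

```python
def save_uploaded_file(bucket, file_obj, object_name, content_type="image/jpeg"):
    """Write the contents of an uploaded file object to the bucket.

    Returns True if data was uploaded, False if the file was empty.
    """
    data = file_obj.read()
    if not data:
        # Mirror the `if OpenOrders:` check above: skip empty uploads.
        return False
    blob = bucket.blob(object_name)
    blob.upload_from_string(data, content_type=content_type)
    return True


# Example wiring in a Django view (hypothetical; assumes the form field is
# named "orders" as in the HTML above):
# from google.cloud import storage
# bucket = storage.Client().bucket("name-of-my-bucket")
# save_uploaded_file(bucket, request.FILES["orders"], "whateverYouWantToCallIt")
```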

