Upload images and metadata to public Amazon S3 bucket with Python requests

I know there's the boto library for Python, but all I'd like to do is upload a lot of image files, including metadata, to a public S3 bucket. The images should go into various sub-directories inside the bucket.

With cURL, this is supposed to work:

curl -v -F "key=test/test.jpg" -F "file=@test.jpg" http://my-public-bucket.s3.amazonaws.com/

So I figure this should be doable with urllib, urllib2 and/or Python Requests alone. But how? I'm totally new to Amazon S3 ... and cURL.

Also, what's the best way to store some metadata along with the images? An additional file containing a JSON string?

Using boto (version 2.6.0), you'd do it like this:

import boto

connection = boto.connect_s3()
bucket = connection.get_bucket('mybucket')
key = bucket.new_key('myimage.jpg')
# Metadata has to be set before the upload; it is sent along with the contents.
key.set_metadata(...)
key.set_contents_from_filename('myimage.jpg')

Make sure you've got the credentials in the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.

That's it.

This works with Python Requests alone:

import requests
url = 'http://my-public-bucket.s3.amazonaws.com/'  # bucket endpoint from the question
with open('/path/test.txt', 'rb') as f:
    r = requests.post(url, files={'file': f}, data={'key': 'test/test.txt'})
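Since S3 has no real directories and the "sub-directories" are just slashes in the object key, the same POST can be repeated over a local folder tree. The following is only a sketch under assumptions: the helper names `iter_uploads` and `upload_tree` are made up for illustration, the bucket URL is the one from the question, and the bucket is assumed to accept anonymous POST uploads.

```python
import os

import requests

# Bucket endpoint taken from the question; replace with your own.
BUCKET_URL = 'http://my-public-bucket.s3.amazonaws.com/'


def iter_uploads(local_root, key_prefix=''):
    """Yield (local_path, s3_key) pairs for every file under local_root."""
    for dirpath, _dirnames, filenames in os.walk(local_root):
        for name in sorted(filenames):
            local_path = os.path.join(dirpath, name)
            rel = os.path.relpath(local_path, local_root)
            # S3 keys use forward slashes regardless of the local OS.
            yield local_path, key_prefix + rel.replace(os.sep, '/')


def upload_tree(local_root, key_prefix='', bucket_url=BUCKET_URL):
    """POST each file to the bucket; raises on the first HTTP error."""
    for local_path, key in iter_uploads(local_root, key_prefix):
        with open(local_path, 'rb') as f:
            r = requests.post(bucket_url, data={'key': key}, files={'file': f})
        r.raise_for_status()
```

Calling `upload_tree('photos', 'images/')` would, under these assumptions, store `photos/a/1.jpg` under the key `images/a/1.jpg`.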

Your cURL command translates roughly into the following:

import requests

url = 'http://my-public-bucket.s3.amazonaws.com/'
files = {
    'key': ('', 'test/test.jpg'),
    'file': open('test.jpg', 'rb'),
}

r = requests.post(url, files=files)

The general form of Requests' multipart upload syntax is found in this StackOverflow answer.
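As for the metadata part of the question: S3's POST upload form accepts fields whose names start with `x-amz-meta-`, and stores them as user-defined metadata on the object, so a separate JSON file may not be needed. Below is a rough sketch, not a tested recipe for this particular bucket. The helper names and the `photographer` field are hypothetical, and it assumes the bucket accepts anonymous POSTs without a policy document.

```python
import os

import requests

# Bucket endpoint taken from the question; replace with your own.
BUCKET_URL = 'http://my-public-bucket.s3.amazonaws.com/'


def build_form_fields(key, file_obj, filename, metadata):
    """Return multipart fields for an S3 POST upload, with the file part last."""
    # A (None, value) tuple makes Requests send a plain form field.
    fields = [('key', (None, key))]
    for name, value in metadata.items():
        # x-amz-meta-* fields become user-defined metadata on the object.
        fields.append(('x-amz-meta-' + name, (None, str(value))))
    # S3 expects the file to be the last field in the form.
    fields.append(('file', (filename, file_obj)))
    return fields


def upload_with_metadata(path, key, metadata, bucket_url=BUCKET_URL):
    """Upload one file together with its metadata; raises on HTTP errors."""
    with open(path, 'rb') as f:
        fields = build_form_fields(key, f, os.path.basename(path), metadata)
        r = requests.post(bucket_url, files=fields)
    r.raise_for_status()
    return r
```

For example, `upload_with_metadata('test.jpg', 'test/test.jpg', {'photographer': 'alice'})` would send the metadata in the same request as the image. On a later GET or HEAD of the object, S3 returns such metadata as `x-amz-meta-*` response headers.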

To upload to a signed URL with requests, I had to do this:

import requests

# upload_url is the signed URL generated on the server (see the note below)
with open('photo_1.jpg', 'rb') as content_file:
    content = content_file.read()
result = requests.put(url=upload_url, headers={}, data=content)

This is not ideal because it loads the entire file into memory, but it should get you past the initial hump.
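The memory issue can probably be avoided by handing Requests the open file object instead of its contents, since Requests streams file-like bodies rather than reading them up front. This is a minimal sketch with a hypothetical helper name (`put_file_streaming`) and an arbitrary timeout. It assumes the signed URL was created without any required headers, as in the note further down.

```python
import requests


def put_file_streaming(upload_url, path, timeout=60):
    """PUT a file to a signed URL without loading it fully into memory."""
    with open(path, 'rb') as f:
        # Passing the file object as data lets Requests stream the body.
        response = requests.put(upload_url, data=f, timeout=timeout)
    response.raise_for_status()
    return response
```

Usage would then be `put_file_streaming(upload_url, 'photo_1.jpg')`, with `upload_url` being the same signed URL as above.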

Also, when using curl, I had to use a different option:

curl -X PUT --upload-file photo_1.jpg <url>

Note: When I created the URL on my server with boto, I set headers=None so that headers would not be an issue.


 