How can I access S3 files in Python using URLs?
I want to write a Python script that will read and write files from S3 using their URLs, e.g. 's3://mybucket/file'. It would need to run locally and in the cloud without any code changes. Is there a way to do this?
Edit: There are some good suggestions here, but what I really want is something that allows me to do this:
myfile = open("s3://mybucket/file", "r")
and then use that file object like any other file object. That would be really cool. I might just write something like this for myself if it doesn't exist. I could build that abstraction layer on simples3 or boto.
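The abstraction layer described above can be sketched in a few lines. This is a minimal sketch, not a real implementation: the S3 fetch is stubbed out with a hypothetical in-memory dict so the snippet is self-contained, and a real version would call boto (or today, boto3) where the comment indicates.

```python
import io
import urllib.parse

# Hypothetical stand-in for S3 so the sketch runs without credentials;
# a real implementation would fetch the object from S3 here instead.
_FAKE_S3 = {("mybucket", "file"): b"hello world"}

def s3_open(url, mode="r"):
    """Open an s3://bucket/key URL and return a file-like object."""
    parsed = urllib.parse.urlparse(url)
    if parsed.scheme != "s3":
        raise ValueError("expected an s3:// URL, got %r" % url)
    bucket, key = parsed.netloc, parsed.path.lstrip("/")
    data = _FAKE_S3[(bucket, key)]   # real code: download from S3
    if "b" in mode:
        return io.BytesIO(data)
    return io.StringIO(data.decode("utf-8"))

myfile = s3_open("s3://mybucket/file", "r")
print(myfile.read())
```

The returned object supports `read`, `readline`, iteration, and the context-manager protocol, so it behaves like any other file object.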
For opening, it should be as simple as:
import urllib.request
myurl = "https://s3.amazonaws.com/skyl/fake.xyz"
myfile = urllib.request.urlopen(myurl)
This will work with S3 if the file is public.
To write a file using boto, it goes a little something like this:
from boto.s3.connection import S3Connection

# AWS_KEY / AWS_SECRET are your credentials; BUCKET is an existing bucket name.
conn = S3Connection(AWS_KEY, AWS_SECRET)
bucket = conn.get_bucket(BUCKET)
destination = bucket.new_key()
destination.name = filename                 # the key (path) within the bucket
destination.set_contents_from_file(myfile)
destination.make_public()                   # make the object world-readable
Let me know if this works for you :)
Here's how they do it in awscli:
def find_bucket_key(s3_path):
    """
    This is a helper function that given an s3 path such that the path is of
    the form: bucket/key
    It will return the bucket and the key represented by the s3 path
    """
    s3_components = s3_path.split('/')
    bucket = s3_components[0]
    s3_key = ""
    if len(s3_components) > 1:
        s3_key = '/'.join(s3_components[1:])
    return bucket, s3_key


def split_s3_bucket_key(s3_path):
    """Split s3 path into bucket and key prefix.

    This will also handle the s3:// prefix.

    :return: Tuple of ('bucketname', 'keyname')
    """
    if s3_path.startswith('s3://'):
        s3_path = s3_path[5:]
    return find_bucket_key(s3_path)
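To see exactly what these helpers return, here they are repeated standalone and run on a couple of inputs, including the bucket-only edge case (the example paths are made up for illustration):

```python
def find_bucket_key(s3_path):
    # Split a "bucket/key" path into its bucket and key parts.
    s3_components = s3_path.split('/')
    bucket = s3_components[0]
    s3_key = ""
    if len(s3_components) > 1:
        s3_key = '/'.join(s3_components[1:])
    return bucket, s3_key

def split_s3_bucket_key(s3_path):
    # Strip an optional s3:// prefix, then split into (bucket, key).
    if s3_path.startswith('s3://'):
        s3_path = s3_path[5:]
    return find_bucket_key(s3_path)

print(split_s3_bucket_key('s3://mybucket/path/to/file'))  # → ('mybucket', 'path/to/file')
print(split_s3_bucket_key('mybucket'))                    # → ('mybucket', '')
```

Note that a path with no key at all comes back with an empty string, not an error, which is convenient for bucket-level operations.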
which you could just use with code like this:
from awscli.customizations.s3.utils import split_s3_bucket_key
import boto3
client = boto3.client('s3')
bucket_name, key_name = split_s3_bucket_key(
    's3://example-bucket-name/path/to/example.txt')
response = client.get_object(Bucket=bucket_name, Key=key_name)
This doesn't address the goal of interacting with an S3 key as a file-like object, but it's a step in that direction.
I haven't seen anything that works directly with S3 URLs, but you could use an S3 access library (simples3 looks decent) and some simple string manipulation:
>>> url = "s3:/bucket/path/"
>>> _, path = url.split(":", 1)
>>> path = path.lstrip("/")
>>> bucket, path = path.split("/", 1)
>>> bucket
'bucket'
>>> path
'path/'
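If the URL uses the double-slash `s3://` form (rather than the single-slash form above), the standard library's `urllib.parse` can do the same split without any hand-rolled string manipulation:

```python
from urllib.parse import urlparse

parsed = urlparse("s3://bucket/path/")  # note the double slash
bucket = parsed.netloc                  # the bucket name
path = parsed.path.lstrip("/")          # the key, without the leading slash
print(bucket, path)
```

This works because `urlparse` treats whatever follows `//` as the network location for any scheme, so the bucket lands in `netloc` and the key in `path`.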
You can use the Boto Python API to access S3 from Python. It's a good library. After you install Boto, the following sample program will work for you:
>>> from boto.s3.connection import S3Connection
>>> from boto.s3.key import Key
>>> conn = S3Connection(AWS_KEY, AWS_SECRET)
>>> b = conn.get_bucket('yourbucket')
>>> k = Key(b)
>>> k.key = 'yourfile'
>>> k.set_contents_from_filename('yourfile.txt')
You can find more information here: http://boto.cloudhackers.com/s3_tut.html#storing-data
http://s3tools.org/s3cmd works pretty well and supports the s3:// form of the URL structure you want. It does the business on Linux and Windows. If you need a native API to call from within a Python program, then http://code.google.com/p/boto/ is a better choice.