How can I access S3 files in Python using URLs?

Question

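I want to write a Python script that reads and writes files on S3 using their URLs, e.g. 's3:/mybucket/file'. It needs to run both locally and in the cloud without any code changes. Is there a way to do this?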

Edit: There are some good suggestions here, but what I really want is something that lets me do this:

 myfile = open("s3://mybucket/file", "r")

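and then use that file object like any other file object. That would be really cool. If it doesn't exist, I might just write something like this myself. I could build the abstraction layer on top of simples3 or boto.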

Answer 1

For opening, it should be as simple as:

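# Python 3: urllib.URLopener no longer exists; urllib.request.urlopen replaces it
from urllib.request import urlopen
myurl = "https://s3.amazonaws.com/skyl/fake.xyz"
myfile = urlopen(myurl)  # read-only, file-like response object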

This will work with S3 if the file is public.

To write a file using boto, it goes a little something like this:

from boto.s3.connection import S3Connection
conn = S3Connection(AWS_KEY, AWS_SECRET)
bucket = conn.get_bucket(BUCKET)
destination = bucket.new_key()
destination.name = filename
destination.set_contents_from_file(myfile)
destination.make_public()

Let me know if this works for you :)
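The boto code above targets the legacy boto (version 2) package. For readers on boto3, which a later answer uses, the upload step might be sketched as follows. The helper name `upload_file_to_s3` is hypothetical, and `client` is assumed to be `boto3.client('s3')` or any object exposing the same `upload_fileobj` method. This sketch does not make the object public.

```python
def upload_file_to_s3(client, local_path, bucket, key):
    """Upload a local file to bucket/key.

    `client` is assumed to be boto3.client('s3') or a compatible object.
    """
    # Open in binary mode so the bytes are sent unchanged.
    with open(local_path, "rb") as fh:
        client.upload_fileobj(fh, bucket, key)
```

Passing the client in, rather than creating it inside the function, keeps credentials out of the helper and makes it easy to swap in a stub when running locally.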

Answer 2

Here's how they do it in awscli:

def find_bucket_key(s3_path):
    """
    This is a helper function that given an s3 path such that the path is of
    the form: bucket/key
    It will return the bucket and the key represented by the s3 path
    """
    s3_components = s3_path.split('/')
    bucket = s3_components[0]
    s3_key = ""
    if len(s3_components) > 1:
        s3_key = '/'.join(s3_components[1:])
    return bucket, s3_key


def split_s3_bucket_key(s3_path):
    """Split s3 path into bucket and key prefix.
    This will also handle the s3:// prefix.
    :return: Tuple of ('bucketname', 'keyname')
    """
    if s3_path.startswith('s3://'):
        s3_path = s3_path[5:]
    return find_bucket_key(s3_path)

You can use it with code like this:

from awscli.customizations.s3.utils import split_s3_bucket_key
import boto3
client = boto3.client('s3')
bucket_name, key_name = split_s3_bucket_key(
    's3://example-bucket-name/path/to/example.txt')
response = client.get_object(Bucket=bucket_name, Key=key_name)

This doesn't address the goal of interacting with an S3 key as a file-like object, but it's a step in that direction.
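One possible next step toward a file-like object, offered as a rough sketch and not as anything awscli provides: wrap the `get_object` response body in an `io.BytesIO`. The function name `open_s3_for_reading` is made up for illustration, and `client` is assumed to behave like `boto3.client('s3')`.

```python
import io


def open_s3_for_reading(url, client):
    """Download the object behind an s3:// URL and return it as a file-like object.

    `client` is assumed to behave like boto3.client('s3'): get_object()
    returns a dict whose 'Body' entry has a read() method.
    """
    if not url.startswith("s3://"):
        raise ValueError("expected an s3:// URL, got %r" % (url,))
    bucket, _, key = url[len("s3://"):].partition("/")
    response = client.get_object(Bucket=bucket, Key=key)
    # Reads the whole object into memory, so this only suits small files.
    return io.BytesIO(response["Body"].read())
```

With boto3 installed and credentials configured, a call such as `open_s3_for_reading('s3://example-bucket-name/path/to/example.txt', boto3.client('s3'))` would return an object supporting `read()`, `seek()` and iteration over lines. It is read-only with respect to S3: writes to the buffer are not sent back.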

Answer 3

I haven't seen anything that works directly with S3 URLs, but you could use an S3 access library (simples3 looks decent) and some simple string manipulation:

>>> url = "s3:/bucket/path/"
>>> _, path = url.split(":", 1)
>>> path = path.lstrip("/")
>>> bucket, path = path.split("/", 1)
>>> print(bucket)
bucket
>>> print(path)
path/
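The same string manipulation can be packaged as a small reusable function. This is only a sketch; the name `split_s3_url` is hypothetical. It uses `str.partition` so that a URL with no key (for example 's3://bucket') yields an empty key instead of failing on tuple unpacking, and it accepts one or two slashes after the scheme.

```python
def split_s3_url(url):
    """Return (bucket, key) for URLs such as 's3://bucket/key' or 's3:/bucket/key'."""
    scheme, sep, rest = url.partition(":")
    if scheme != "s3" or not sep:
        raise ValueError("expected an s3: URL, got %r" % (url,))
    # Drop the leading slashes, then split once at the first remaining slash.
    bucket, _, key = rest.lstrip("/").partition("/")
    if not bucket:
        raise ValueError("no bucket name in %r" % (url,))
    return bucket, key
```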

Answer 4

Try s3fs

The first example from the docs:

>>> import s3fs
>>> fs = s3fs.S3FileSystem(anon=True)
>>> fs.ls('my-bucket')
['my-file.txt']
>>> with fs.open('my-bucket/my-file.txt', 'rb') as f:
...     print(f.read())
b'Hello, world'
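Building on s3fs, a thin dispatcher could get close to the `open("s3://mybucket/file", "r")` call the question asks for, while still working on plain local paths. This is a minimal sketch: `open_any` is a hypothetical name, and creating `S3FileSystem()` with default arguments assumes credentials are available in the environment.

```python
def open_any(path, mode="rb"):
    """Open a local path with the built-in open(), or an s3:// URL via s3fs."""
    if path.startswith("s3://"):
        # Imported lazily so that local-only use does not require s3fs.
        import s3fs
        fs = s3fs.S3FileSystem()  # assumption: credentials come from the environment
        # s3fs paths take the 'bucket/key' form shown in the example above.
        return fs.open(path[len("s3://"):], mode)
    return open(path, mode)
```

The same script can then call `open_any(path)` whether `path` is a local file name or an `s3://` URL, so the code does not change between a local run and a cloud run.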

Answer 5

You can access S3 from Python using the Boto Python API. It's a good library. Once Boto is installed, the following sample program will work for you:

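>>> from boto.s3.key import Key
>>> # b is an existing boto bucket object, e.g. from conn.get_bucket(...)
>>> k = Key(b)
>>> k.key = 'yourfile'
>>> k.set_contents_from_filename('yourfile.txt')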

You can find more information here: http://boto.cloudhackers.com/s3_tut.html#storing-data

Answer 6

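http://s3tools.org/s3cmd works well and supports the s3:// form of the URL structure you want. It does the business on Linux and Windows. If you need to call a native API from within a Python program, then http://code.google.com/p/boto/ is a better choice.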
