
Counting keys in an S3 bucket

Using the boto3 library and the Python code below, I can iterate through an S3 bucket's prefixes, printing out each prefix name and key name as follows:

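import boto3

client = boto3.client('s3')

# Delimiter='/' groups keys into top-level "folders" (CommonPrefixes).
pfx_paginator = client.get_paginator('list_objects_v2')
pfx_iterator = pfx_paginator.paginate(Bucket='app_folders', Delimiter='/')
for prefix in pfx_iterator.search('CommonPrefixes'):
    if prefix is None:  # search() yields None for a page with no CommonPrefixes
        continue
    print(prefix['Prefix'])

    # List every key under this prefix; the paginator handles more than 1000 keys.
    key_paginator = client.get_paginator('list_objects_v2')
    key_iterator = key_paginator.paginate(Bucket='app_folders', Prefix=prefix['Prefix'])
    for key in key_iterator.search('Contents'):
        if key is None:  # search() yields None for a page with no Contents
            continue
        print(key['Key'])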

Inside the key loop, I can add a counter to count the number of keys (files), but this is an expensive operation. Is there a way to make one call, given a bucket name and a prefix, that returns the count of keys contained in that prefix (even if it is more than 1000)?

UPDATE: I found a post here that shows a way to do this with the AWS CLI as follows:

aws s3api list-objects --bucket BUCKETNAME --prefix "folder/subfolder/" --output json --query "[length(Contents[])]"

Is there a way to do something similar with the boto3 API?

You can do it using the MaxKeys=1000 parameter, so that each request returns up to 1000 keys (the maximum S3 allows per request). For your case:

pfx_iterator = pfx_paginator.paginate(Bucket='app_folders', Delimiter='/', MaxKeys=1000)

In general:

response = client.list_objects_v2(
    Bucket='string',
    Delimiter='string',
    EncodingType='url',
    MaxKeys=123,
    Prefix='string',
    ContinuationToken='string',
    FetchOwner=True|False,
    StartAfter='string',
    RequestPayer='requester'
)

With up to 1000 keys returned per request, this needs up to 1000 times fewer requests than listing one key per call :) Documentation here
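As far as I know, S3 has no list call that returns only a count, so a count for a prefix still means paging through the listing. A minimal sketch of that, assuming a boto3 S3 client is passed in; the function name `count_keys` is made up for illustration and is not part of boto3:

```python
def count_keys(client, bucket, prefix=""):
    """Count the keys under `prefix` by paging through list_objects_v2.

    `client` is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    Each page holds at most 1000 keys, so the paginator issues one
    request per 1000 keys and this function sums the page sizes.
    """
    paginator = client.get_paginator("list_objects_v2")
    pages = paginator.paginate(Bucket=bucket, Prefix=prefix)
    # A page with no matching keys has no "Contents" entry at all.
    return sum(len(page.get("Contents", [])) for page in pages)


# Usage (needs AWS credentials and network access):
#   import boto3
#   print(count_keys(boto3.client("s3"), "app_folders", "folder/subfolder/"))
```

This only counts the keys instead of printing them, so it costs one request per 1000 keys rather than one call in total.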

Using the AWS CLI, it is easy to count:

aws s3 ls <folder url> --recursive --summarize | grep <pattern>

For example:

aws s3 ls s3://abc/ --recursive --summarize | grep "Total Objects"
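For comparison, roughly the same summary (object count and total size) can be computed from Python. This is only a sketch under the assumption that a boto3 S3 client is passed in; `summarize_prefix` is a hypothetical helper name, not an AWS CLI or boto3 function:

```python
def summarize_prefix(client, bucket, prefix=""):
    """Return (object_count, total_bytes) for every key under `prefix`.

    `client` is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    paginator = client.get_paginator("list_objects_v2")
    count = 0
    total_bytes = 0
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is missing from a page that has no matching keys.
        for obj in page.get("Contents", []):
            count += 1
            total_bytes += obj["Size"]
    return count, total_bytes


# Usage (needs AWS credentials and network access):
#   import boto3
#   objects, size = summarize_prefix(boto3.client("s3"), "abc")
#   print("objects:", objects, "bytes:", size)
```

Like the CLI command, this walks the full listing, so the number of requests grows with the number of keys.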
