
Size of folder in s3 bucket

I am working with an Amazon S3 bucket, and I need to find the size of a folder inside the bucket through code. I can't find any method that returns the size of a folder directly. Is there any other way to achieve this?

EDIT: I'm aware that there is nothing called a folder in an S3 bucket, but I need to find the total size of all files that share a folder-like structure. That is, if the structure is like https://s3.amazonaws.com/****/uploads/storeeoll48jipuvjbqufcap3p6on6er2bwsufv5ojzqnbe01xvw0fy58x65.png, then I need the combined size of all files under https://s3.amazonaws.com/****/uploads/...

If you want to use boto in Python, here is a small script that you may try:

import boto

# Connect with your credentials (boto 2.x, the legacy SDK)
conn = boto.connect_s3('api_key', 'api_secret')
bucket = conn.get_bucket('bucketname')
keys = bucket.list('path')  # iterates every key under the given prefix
size = 0
for key in keys:
    size += key.size
print(size)

From the AwsConsoleApp.java AWS SDK sample:

// Sums the size of every object in every bucket. To restrict the count to
// one "folder", use the listObjects(bucketName, prefix) overload instead.
List<Bucket> buckets = s3.listBuckets();
long totalSize  = 0;
int  totalItems = 0;
for (Bucket bucket : buckets)
{
    ObjectListing objects = s3.listObjects(bucket.getName());
    do {
        for (S3ObjectSummary objectSummary : objects.getObjectSummaries()) {
            totalSize += objectSummary.getSize();
            totalItems++;
        }
        if (!objects.isTruncated()) {
            break; // last page: stop before requesting another batch
        }
        objects = s3.listNextBatchOfObjects(objects);
    } while (true);
}
System.out.println("You have " + buckets.size() + " Amazon S3 bucket(s), " +
                "containing " + totalItems + " objects with a total size of " + totalSize + " bytes.");

There is nothing called a "folder" in S3; it is a flat object store. Object keys (file names) may contain slashes (/), and various bucket explorers use them to present a folder/file structure.

To know the size of a "folder" in S3, you first have to list the keys of all individual objects that begin with that "folder" path as a prefix. If your bucket contains millions of objects, this can be a very costly operation.

Some S3 explorers do that automatically. I use CloudBerry Explorer for S3.
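
If you would rather script it than use an explorer, boto3's built-in paginator takes care of the repeated listing calls for you. A minimal sketch, assuming placeholder bucket and prefix names ('my-bucket', 'uploads/'):

import boto3

client = boto3.client('s3')
paginator = client.get_paginator('list_objects_v2')

total = 0
for page in paginator.paginate(Bucket='my-bucket', Prefix='uploads/'):
    for obj in page.get('Contents', []):  # 'Contents' is absent on empty pages
        total += obj['Size']
print(total, 'bytes')

The AWS CLI can do much the same from a shell with aws s3 ls s3://my-bucket/uploads/ --recursive --summarize.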

Folders don't really exist in S3.

An object with a key of subfolder/myfile.txt is displayed by client software as being in a subfolder folder. But it's only a display convention; the folder doesn't really exist. If you want to find the size of that 'folder' programmatically, loop through all the objects whose keys start with subfolder/, get their sizes, and add them up. Alternatively, check out S3 Browser, which gives you the size on right-click.

Here's how to do it with boto3:

import boto3

bucketName = '<bucketname>'
client = boto3.client('s3')

def get_all_objects_in_prefix(prefix):
    """Yield every object under the given key prefix, one page at a time."""
    lastkey = ''
    while True:
        response = client.list_objects(
            Bucket=bucketName,
            Prefix=prefix,
            Marker=lastkey,  # resume after the last key of the previous page
            MaxKeys=1000
        )
        if not response.get('Contents'):
            break
        for item in response['Contents']:
            yield item
        if not response.get('IsTruncated'):
            break  # last page reached; avoid one extra empty request
        lastkey = response['Contents'][-1]['Key']

def get_filesize_of_prefix(prefix):
    size = 0
    for item in get_all_objects_in_prefix(prefix):
        size += item['Size']

    return size
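
For example, a call might look like this (the 'uploads/' prefix is just an illustration):

# Hypothetical usage: sum everything under the 'uploads/' prefix
size_bytes = get_filesize_of_prefix('uploads/')
print('%d bytes (%.2f MB)' % (size_bytes, size_bytes / 1024.0 / 1024.0))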

Here is how I did it with boto3. Below is a function that returns the size in MB of a directory (key prefix) in a bucket:

from boto3 import client

s3_client = client('s3')

def get_s3_folder_size_mb(bucket, prefix):
    total_size = 0
    s3_result = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    for obj in s3_result.get('Contents', []):
        total_size += obj['Size']
    # Keep fetching pages while the previous listing was truncated
    while s3_result.get('IsTruncated'):
        continuation_key = s3_result['NextContinuationToken']
        s3_result = s3_client.list_objects_v2(
            Bucket=bucket, Prefix=prefix, ContinuationToken=continuation_key
        )
        for obj in s3_result.get('Contents', []):
            total_size += obj['Size']
    return total_size / 1024 / 1024
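
A possible call, with placeholder bucket and prefix names:

# Hypothetical usage
print(get_s3_folder_size_mb('my-bucket', 'uploads/'))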
