I am working with an Amazon S3 bucket, and I need to find the size of a folder inside the bucket through code. I can't find any method that returns the size of a folder directly. Is there any other way to achieve this?
EDIT: I'm aware that there is nothing called a folder in an S3 bucket. But I need to find the total size of all files whose keys look like a folder structure. That is, if the structure is like this, https://s3.amazonaws.com/****/uploads/storeeoll48jipuvjbqufcap3p6on6er2bwsufv5ojzqnbe01xvw0fy58x65.png
then I need to find the combined size of all files under https://s3.amazonaws.com/****/uploads/...
If you want to use boto in Python, here is a small script that you can try:
import boto

# connect with your credentials and list every key under the given path (prefix)
conn = boto.connect_s3('api_key', 'api_secret')
bucket = conn.get_bucket('bucketname')
keys = bucket.list('path')

# add up the size of each key (object) under the prefix
size = 0
for key in keys:
    size += key.size
print(size)
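Note that boto here is the legacy, pre-boto3 library, and the script prints the total in bytes; divide by 1024*1024 if you want the result in MB.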
From the AwsConsoleApp.java sample in the AWS SDK:
List<Bucket> buckets = s3.listBuckets();
long totalSize = 0;
int totalItems = 0;
for (Bucket bucket : buckets) {
    ObjectListing objects = s3.listObjects(bucket.getName());
    while (true) {
        for (S3ObjectSummary objectSummary : objects.getObjectSummaries()) {
            totalSize += objectSummary.getSize();
            totalItems++;
        }
        // only fetch the next batch while the listing is truncated,
        // so the final batch of keys is not skipped
        if (!objects.isTruncated()) {
            break;
        }
        objects = s3.listNextBatchOfObjects(objects);
    }
}
System.out.println("You have " + buckets.size() + " Amazon S3 bucket(s), " +
    "containing " + totalItems + " objects with a total size of " + totalSize + " bytes.");
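Note that this sample walks every bucket in the account. If you only want the size under one key prefix, you can pass a prefix to the listing call instead, by building a ListObjectsRequest and calling withPrefix(...) on it before handing it to s3.listObjects(...).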
There is nothing called a "folder" in S3; it is a flat object store. Object keys (file names) may contain slashes (/), and various bucket explorers use this to present a folder/file structure.
To know the size of a "folder" in S3, you first have to list the keys of all individual objects whose key starts with that "folder" path as a prefix. If your bucket contains millions of objects, this can be a costly operation.
Some S3 explorers do this automatically. I use CloudBerry Explorer for S3.
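As a rough sketch of that listing approach with boto3 (the bucket name and prefix below are placeholders, not from the question), the built-in paginator fetches keys 1000 at a time, so it also copes with very large buckets:

import boto3

s3 = boto3.client('s3')
paginator = s3.get_paginator('list_objects_v2')

# walk every page of keys under the prefix and accumulate their sizes
total_size = 0
for page in paginator.paginate(Bucket='my-bucket', Prefix='uploads/'):
    for obj in page.get('Contents', []):
        total_size += obj['Size']
print(total_size)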
Folders don't really exist in S3.
An object with a key of subfolder/myfile.txt is displayed by client software as being in a folder named subfolder. But that is only a display convention; the folder doesn't really exist. If you want to find the total size of that 'folder' programmatically, loop through all the objects whose keys start with subfolder/, get their sizes, and add them up, as in the sketch below. Alternatively, check out S3 Browser, which shows you the size on right-click.
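For example, a minimal sketch with the boto3 resource API (my-bucket and subfolder/ are hypothetical names); objects.filter handles the pagination for you:

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('my-bucket')  # hypothetical bucket name

# sum the size of every object whose key starts with the prefix
folder_size = sum(obj.size for obj in bucket.objects.filter(Prefix='subfolder/'))
print(folder_size)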
Here's how to do it with boto3:
import boto3

bucketName = '<bucketname>'
client = boto3.client('s3')

def get_all_objects_in_prefix(prefix):
    # page through the listing manually, using the last key of each
    # page as the Marker for the next request
    lastkey = ''
    while True:
        response = client.list_objects(
            Bucket=bucketName,
            Prefix=prefix,
            Marker=lastkey,
            MaxKeys=1000
        )
        if not response.get('Contents'):
            break
        lastkey = response['Contents'][-1]['Key']
        for item in response['Contents']:
            yield item

def get_filesize_of_prefix(prefix):
    # sum the Size of every object yielded for the prefix
    size = 0
    for item in get_all_objects_in_prefix(prefix):
        size += item['Size']
    return size
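For example, assuming an uploads/ prefix like the one in the question:

print(get_filesize_of_prefix('uploads/'))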
Here is how I did it with boto3. The function below returns the total size of a directory (key prefix) in the bucket, in MB:
import boto3

s3_client = boto3.client('s3')

def get_s3_folder_size_mb(bucket, prefix):
    # list_objects_v2 returns at most 1000 keys per call, so follow the
    # continuation token until the listing is no longer truncated
    size = 0
    s3_result = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    for key in s3_result.get('Contents', []):
        size += key['Size']
    while s3_result['IsTruncated']:
        continuation_key = s3_result['NextContinuationToken']
        s3_result = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix,
                                              ContinuationToken=continuation_key)
        for key in s3_result.get('Contents', []):
            size += key['Size']
    return size / 1024 / 1024
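For example (the bucket and prefix names here are placeholders):

print(get_s3_folder_size_mb('my-bucket', 'uploads/'))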