简体   繁体   中英

Can S3 ListObjectsV2 return the keys sorted newest to oldest?

I have AWS S3 buckets with hundreds of top-level prefixes (folders). Each prefix contains somewhere between five thousand and a few million files in each prefix - most growing at rate of 10-100k per year. 99% of the time, all I care about are the newest 1-2000 or so in each folder...

Using ListObjectV2 returns me 1000 files and that is the max (setting "MaxKeys" to a higher value still truncates the list at 1000). This would be reasonably fine, however (per the documentation) it's returning me the file list in ascending alphabetical order (which, given my keys/filenames have the date in them effectively results in a oldest->newest sort)... which is considerably less useful than if it returned me the NEWEST files (or reverse-alphabetical).

One option is to do a continuation allowing me to pull the entire prefix, then use the tail end of the entire array of keys as needed... but that would be (most importantly) slow for large 'folders'. A prefix with 2 million files would require 2,000 separate API calls, just to get the newest few-hundred filenames. (not to mention the costs incurred by pulling the entire bucket list even though I'm only really interested in the newest 1-2000 files.)

Is there a way to have the ListObjectV2 call (or any other s3 call) give me the list of the newest (or reverse-alphabetical) files? New files come in every few minutes - and the most important file is THE most recent file, so doing an S3 Inventory doesn't seem like it would do the trick.

(or, perhaps, a call that gives me filenames in a created-by date range...?)

Using javascript - but I'm sure every language has more-or-less the same features when it comes to trying to list objects from an S3 bucket.

Edit: weird idea: If AWS doesn't offer a 'sort' option on a basic API call for one of it's most popular services... Would it make sense to document all the filenames/keys in a dynamo table and query that instead?

No. The ListObjectsV2() will always return up to 1000 objects alphabetically in the requested Prefix.

You could use Amazon S3 Inventory , which can provide a daily or weekly CSV file listing all objects.

If you need real-time or fairly fast access to a list of all available objects, your other option would be to trigger an AWS Lambda function whenever objects are created/deleted. The Lambda function would store/update information in a database (eg DynamoDB) that can provide very fast access to the list of objects. You would need to code this solution.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM