
How do you efficiently move and partition files in S3 using boto3?

There are around 10k files in an S3 location, exported there by DynamoDB's PITR export-to-S3 option. These files aren't partitioned in any way and all sit in a single folder, which is a problem for my use case. I want to move these files within S3 and partition them in a random manner. For example, if I have 100 files, I want to move them in batches of 10 and create 10 partitions (partition=1/10 files, partition=2/10 files, ...). How do I do this efficiently using boto3?

  1. Get a list of all the files in the S3 bucket and store it in a list.
  2. Chunk the big list into smaller lists of size 10. Below is the code for that:
list_partition_size = 10
data_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 2, 33, 3, 3, 4, 4, 54, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
records_list_chunk = [data_list[i:i + list_partition_size] for i in
                      range(0, len(data_list), list_partition_size)]
print(records_list_chunk)
  3. Loop over the chunked list: each iteration yields a chunk of up to 10 keys, which you can pass to a method that moves those files into the corresponding partition prefix.
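The three steps above can be sketched end-to-end as below. This is a minimal sketch, not a drop-in solution: the bucket name and prefixes are placeholders you must replace, and since S3 has no native move, "moving" is implemented as copy + delete.

```python
def chunk(items, size):
    """Step 2: split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def list_keys(bucket, prefix=""):
    """Step 1: collect every object key under a prefix.

    list_objects_v2 returns at most 1000 keys per call, so with ~10k
    files the paginator is needed to see all of them.
    """
    import boto3  # imported here so chunk() stays usable without AWS set up
    s3 = boto3.client("s3")
    keys = []
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket, Prefix=prefix):
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys

def move_in_partitions(bucket, keys, size=10, dest_prefix="partitioned/"):
    """Step 3: copy each chunk under partition=N/ and delete the originals.

    S3 has no real move operation, so each file is copied to its new
    partition prefix and then deleted from the old location.
    """
    import boto3
    s3 = boto3.client("s3")
    for n, batch in enumerate(chunk(keys, size), start=1):
        for key in batch:
            filename = key.rsplit("/", 1)[-1]
            s3.copy_object(Bucket=bucket,
                           CopySource={"Bucket": bucket, "Key": key},
                           Key=f"{dest_prefix}partition={n}/{filename}")
            s3.delete_object(Bucket=bucket, Key=key)

# Usage (placeholder bucket/prefix names):
# keys = list_keys("my-export-bucket", "dynamodb-export/")
# move_in_partitions("my-export-bucket", keys, size=10)
```

With ~10k objects this makes two API calls per file; if speed matters, the copy/delete pairs can be run concurrently (e.g. with `concurrent.futures.ThreadPoolExecutor`), since each object is independent.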
