
Moving files based on filename with Amazon S3

I need to move Amazon S3 files, based on their names, into folders that are named the same. This is for a Python automation script on AWS. The file names increment by one.

For instance, one file would be called my_file_section_a, the next my_file_section_b1, then my_file_section_b2, and so forth. The folders would be called my_file_section_a, my_file_section_b1, and so forth.
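The intended mapping can be sketched like this (file names are the examples above; S3 has no real folders, so a key prefix ending in '/' acts as one):

```python
# Sketch of the intended mapping: each file goes into an S3 "folder"
# (key prefix) that has the same base name as the file.
file_names = ["my_file_section_a.txt", "my_file_section_b1.txt", "my_file_section_b2.txt"]

for name in file_names:
    folder = name.rsplit(".", 1)[0]    # base name without the extension
    target_key = f"{folder}/{name}"    # e.g. my_file_section_a/my_file_section_a.txt
    print(target_key)
```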

Here is the code:

from __future__ import print_function
import boto3
import time, urllib
import json

print("*"*80)
print("Initializing...")
print("*"*80)

s3 = boto3.client('s3')

def lambda_handler(event, context):

    source_bucket = event['Records'][0]['s3']['bucket']['name']
    object_key = event['Records'][0]['s3']['object']['key']
    target_bucket = 'myfilelambdadestination'
    copy_source = {'Bucket': source_bucket, 'Key': object_key}
    print("Source bucket: ", source_bucket)
    print("Target bucket: ", target_bucket)
    print("Log Stream name: ",context.log_stream_name)
    print("Log Group name: ",context.log_group_name)
    print("Request ID: ",context.aws_request_id)
    print("Mem. limits(MB) ", context.memory_limit_in_mb)
    try:
        print("Using waiter to wait for the object to persist in S3")
        waiter = s3.get_waiter('object_exists')
        waiter.wait(Bucket=source_bucket, Key=object_key)
        response = s3.copy_object(Bucket=target_bucket, Key=object_key, CopySource=copy_source)
        return response['ContentType']
    except Exception as err:
        print("Error - " + str(err))
        return str(err)

How do I move the files, based on their names, into folders that are named the same?

The new folder name is the file name without the extension. Depending on what object_key looks like, you may have to tweak this code:

object_key = 'my_file_section_b2.txt'
new_object_prefix = object_key.rsplit('/', 1)[-1].split('.')[0]
new_object_key = new_object_prefix + '/' + object_key
print(new_object_key)   # my_file_section_b2/my_file_section_b2.txt
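An alternative sketch using pathlib's PurePosixPath, which also handles keys that carry a prefix (the key below is hypothetical). Note that .stem strips only the last extension, so multi-dot names like .tar.gz would need extra handling:

```python
from pathlib import PurePosixPath

object_key = 'incoming/my_file_section_b2.txt'   # hypothetical key with a prefix
name = PurePosixPath(object_key).name            # 'my_file_section_b2.txt'
stem = PurePosixPath(object_key).stem            # 'my_file_section_b2'
new_object_key = f"{stem}/{name}"
print(new_object_key)   # my_file_section_b2/my_file_section_b2.txt
```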

Here is an AWS Lambda function that will copy an object to a directory of the same name, then delete the original object.

It only copies objects from the root of the bucket, to avoid a situation where the copied object triggers the Lambda function again, causing an infinite loop. (Amazon S3 is big, but not infinite.) If your target is a different bucket, this is not necessary.

import boto3
import urllib

def lambda_handler(event, context):
    
    # Get the bucket and object key from the Event
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'])
    
    # Only copy objects that were uploaded to the bucket root (to avoid an infinite loop)
    if '/' not in key:
        
        # Copy object
        s3_client = boto3.client('s3')
        s3_client.copy_object(
            Bucket=bucket,
            Key=f"{key}/{key}",
            CopySource={'Bucket': bucket, 'Key': key}
        )
        
        # Delete source object
        s3_client.delete_object(
            Bucket=bucket,
            Key=key
        )

The function requires an IAM Role with permission for GetObject, PutObject, and DeleteObject on the bucket (delete_object needs s3:DeleteObject, since the function removes the source object). After creating this AWS Lambda function, create an event notification on the Amazon S3 bucket to trigger the function when an object is created.
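As a rough sketch, the execution role's policy could look like the following. The bucket name is a placeholder, and the exact statement you need depends on your bucket layout:

```python
import json

# Hypothetical IAM policy for the Lambda execution role.
# "my-example-bucket" is a placeholder bucket name.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
            "Resource": "arn:aws:s3:::my-example-bucket/*"
        }
    ]
}
print(json.dumps(policy, indent=2))
```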
