
Python script that moves specific files between S3 buckets

So I'm still a rookie when it comes to coding in Python, but I was wondering if someone could be so kind as to help me with a problem.

A client I work for uses the eDiscovery system Venio. They have web, app, database, and Linux servers running on EC2 instances in AWS.

Right now, when customers upload docs to their server, they end up re-downloading the content to another drive, causing extra work for themselves. There is also a speed issue when it comes to serving up files on their system.

After setting up automated snapshots with a script in Lambda, I started thinking that storing their massive files in S3, behind CloudFront, might be a better way to go.

Does anyone know if there is a way to write a Python script that looks for keywords in a file (e.g. "Use", "Discard") and sorts the files into different buckets automatically?

Any advice would be immensely appreciated!

UPDATE:

So here is a script I started:

import boto3

# Creates S3 client
s3 = boto3.client('s3')

filename = 'file.txt'
bucket_name = 'responsive-bucket'

keyword_bucket = {
    'use': 'responsive-bucket',
    'discard': 'non-responsive-bucket',
}

Essentially, what I want is this: when a client uploads a file through the web API, a Python script is triggered that looks for the keywords Responsive or Non-Responsive. Once it recognizes those keywords, it PUTs the file into the correspondingly named bucket. The responsive files will stay in a standard S3 bucket and the non-responsive ones will go to an S3-IA bucket. After a set time, they are then transitioned to Glacier by a lifecycle rule.
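One way the "script triggers on upload" part could look, sketched under the assumption that uploads land in an S3 bucket configured to send object-created event notifications to a Lambda function. The names objects_from_event and lambda_handler are made up for illustration, and the keyword check itself is left as a placeholder comment.

```python
from urllib.parse import unquote_plus


def objects_from_event(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        # Keys arrive URL-encoded in the event (a space becomes '+').
        key = unquote_plus(s3_info["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def lambda_handler(event, context):
    """Entry point that Lambda calls for each notification."""
    uploaded = objects_from_event(event)
    for bucket, key in uploaded:
        # The keyword check and the copy to the destination bucket would go here.
        print("New upload: s3://%s/%s" % (bucket, key))
    return {"processed": len(uploaded)}
```

Decoding the key matters because a file name containing spaces or parentheses would otherwise not match the real object key in later S3 calls.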

Any help would be amazing!!!

If you can build a mapping of keywords to bucket names, you could use a dictionary. For example:

# Placeholder names; S3 bucket names may not contain underscores, so hyphens are used.
keyword_bucket = {
    'use': 'bucket-abc',
    'discard': 'bucket-xyz',
    'etc': 'bucket-whatever',
}

Then you open the file and search for your keywords. When a keyword matches, you use the dictionary above to look up the corresponding bucket where the file should go.
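A minimal sketch of that search-and-lookup step, assuming the files are text, are non-empty, and that the keyword appears within the first 64 KB. The helper names pick_bucket and route_object and the scan_bytes parameter are made up for illustration; s3 is expected to be a boto3 S3 client such as boto3.client('s3'), passed in so the matching logic can be tried without AWS access.

```python
import re


def pick_bucket(text, keyword_bucket):
    """Return the bucket for the first dictionary keyword found in text, or None."""
    for keyword, bucket in keyword_bucket.items():
        # \b keeps 'use' from matching inside words such as 'because'.
        if re.search(r"\b" + re.escape(keyword) + r"\b", text, re.IGNORECASE):
            return bucket
    return None


def route_object(s3, source_bucket, key, keyword_bucket, scan_bytes=65536):
    """Copy one object to the bucket chosen by its content, then delete the original."""
    # Read only the first scan_bytes bytes so large files are not fully downloaded.
    response = s3.get_object(
        Bucket=source_bucket, Key=key, Range="bytes=0-%d" % (scan_bytes - 1)
    )
    text = response["Body"].read().decode("utf-8", errors="ignore")

    target = pick_bucket(text, keyword_bucket)
    if target is None or target == source_bucket:
        return None  # no keyword found, or already in the right bucket

    # Server-side copy: the data does not pass through the machine running this.
    s3.copy_object(
        Bucket=target,
        Key=key,
        CopySource={"Bucket": source_bucket, "Key": key},
    )
    s3.delete_object(Bucket=source_bucket, Key=key)
    return target
```

Keywords are checked in dictionary order, so a file containing both "use" and "discard" goes to whichever keyword is listed first; since "use" is a common English word, a more distinctive marker is probably safer in practice. A single copy_object call handles objects up to 5 GB, and it also accepts a StorageClass argument such as 'STANDARD_IA', because storage class is set per object rather than per bucket.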


 