Extract only file names from an Amazon S3 bucket

I need to extract only the file names from an Amazon S3 bucket, without the extra three zeros after .csv. I'm currently doing it like this:

# remove the old list so every run starts with fresh names
rm -f ListOfFiles.txt

# get file names
aws s3 ls <bucket-address-directory-path> | awk '{print $4}' | sed 's/.csv000*/.csv/g' >> ListOfFiles.txt

I get all the file names, but there is a blank line at the top because the directory also contains a folder. I need neither the folder nor the blank line.

What is in S3

Archive
ABC.csv000
BCD.csv000
DEF.csv000

What I'm getting

<a blank line here>
ABC.csv
BCD.csv
DEF.csv

What I need

ABC.csv
BCD.csv
DEF.csv

Combine awk and sed into a single command, for example:

aws s3 ls <bucket-address-directory-path> | sed -nr 's/.* ([^ ]*\.csv)000.*/\1/p'

or

aws s3 ls <bucket-address-directory-path> | awk 'NF>3 { sub(/000$/,"", $4); print $4}'
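To see why the NF>3 filter drops the folder, here is a minimal sketch that runs the same awk program against a simulated listing instead of a real bucket. The function names strip_s3_listing and sample_listing are hypothetical, and the dates and sizes are made-up sample data; the assumption is that the folder shows up as a two-field "PRE Archive/" line, so it never reaches the print.

```shell
#!/bin/sh
# Hypothetical helper: reads `aws s3 ls` style output on stdin and prints
# only the object names, with a trailing "000" removed.
strip_s3_listing() {
    # Object lines have 4 fields (date, time, size, name); the folder line
    # ("PRE Archive/") has only 2, so NF > 3 skips it.
    awk 'NF > 3 { sub(/000$/, "", $4); print $4 }'
}

# Simulated listing (made-up dates and sizes) standing in for
# `aws s3 ls <bucket-address-directory-path>`.
sample_listing() {
    printf '%s\n' \
        '                           PRE Archive/' \
        '2024-01-01 10:00:00       1024 ABC.csv000' \
        '2024-01-01 10:00:01       2048 BCD.csv000' \
        '2024-01-01 10:00:02       4096 DEF.csv000'
}

# Prints ABC.csv, BCD.csv and DEF.csv, one per line, with no blank line.
sample_listing | strip_s3_listing
```

With a real bucket the helper would be fed from the CLI, for example: aws s3 ls <bucket-address-directory-path> | strip_s3_listing > ListOfFiles.txt. Because awk splits on whitespace, object names containing spaces would be cut at the first space.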

Alternatively, change "{print $4}" to "{if(NR>1){print $4}}" so that the first line of the listing (the folder entry) is skipped.
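As a rough sketch of the NR>1 approach, the snippet below applies it to a simulated listing. The names skip_first_entry and demo_listing are hypothetical and the dates and sizes are invented; it assumes there is exactly one folder entry and that it appears on the first line of the output.

```shell
#!/bin/sh
# Hypothetical helper: skips the first input line, prints the 4th field of
# the remaining lines, then strips the "000" that follows ".csv".
skip_first_entry() {
    awk 'NR > 1 { print $4 }' | sed 's/\.csv000$/.csv/'
}

# Simulated listing (made-up dates and sizes) with a single folder entry
# on the first line.
demo_listing() {
    printf '%s\n' \
        '                           PRE Archive/' \
        '2024-01-01 10:00:00       1024 ABC.csv000' \
        '2024-01-01 10:00:01       2048 BCD.csv000' \
        '2024-01-01 10:00:02       4096 DEF.csv000'
}

# Prints ABC.csv, BCD.csv and DEF.csv, one per line.
demo_listing | skip_first_entry
```

If the directory held no folder, this variant would drop the first file instead, and with several folders it would still print blank lines, so filtering on the field count is the more forgiving choice.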
