I am reading a file from S3 into pandas like this:
aws_credentials = {
"key": "xxxx",
"secret": "xxxx"
}
# Read data from S3
df_aln = pd.read_csv("s3://dir/ABC/fname_0521.csv", storage_options=aws_credentials, encoding='latin-1')
However, I have several files with the same shape and a similar naming convention, fname_mmyy. How do I read all the files that match this naming pattern and combine them into one pandas DataFrame? I'd prefer not to write a separate pd.read_csv call for each file.
According to this answer: https://stackoverflow.com/a/69568591/687896 , you can glob on S3. Your pattern would be something like fname_*.csv:
# get the list of CSV files (from the cited answer):
import s3fs
import pandas as pd

# reuse the credentials dict from your question
s3 = s3fs.S3FileSystem(**aws_credentials)
csvs = s3.glob('your/s3/path/to/fname_*.csv')

# read each file into pandas, then concatenate the DataFrames
dfs = []
for csv in csvs:
    # s3.glob returns paths without the scheme, so prepend "s3://"
    df = pd.read_csv("s3://" + csv, storage_options=aws_credentials, encoding='latin-1')
    dfs.append(df)
df = pd.concat(dfs, ignore_index=True)
That (or something along those lines) should work.
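The glob-and-concat pattern is the same regardless of where the files live; here is a minimal, self-contained sketch using local temporary files (the file names and column names are made up for illustration) that you can adapt back to S3:

```python
import glob
import os
import tempfile

import pandas as pd

# create a few small CSVs with the same shape, standing in for fname_mmyy.csv
tmpdir = tempfile.mkdtemp()
for mmyy in ("0421", "0521", "0621"):
    pd.DataFrame({"a": [1, 2], "b": [3, 4]}).to_csv(
        os.path.join(tmpdir, f"fname_{mmyy}.csv"), index=False
    )

# glob the pattern, read each match, and concatenate into one DataFrame
paths = sorted(glob.glob(os.path.join(tmpdir, "fname_*.csv")))
df = pd.concat((pd.read_csv(p) for p in paths), ignore_index=True)
print(df.shape)  # three 2-row files -> (6, 2)
```

Passing `ignore_index=True` to pd.concat rebuilds the row index, so you don't end up with duplicate index values repeated from each source file.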