
How to load multiple files in a folder from S3 into Python notebooks

I have a series of files in one folder on S3; their names look like this:

aac0202-2121-41.csv
aac0202-2121-42.csv
aac0202-2121-43.csv
aac0202-2121-44.csv
...aac0202-2121-70.csv

They all have the same columns; I am trying to read each one with read_csv and aggregate them together.

The result should be one large DataFrame combining files 41 through 70.

My current code looks like this. Is there a more efficient or better way to do this?

import pandas as pd

df = pd.DataFrame()
for number in range(41, 71):
    df = pd.concat([df, pd.read_csv('s3://ap/data/tm/aac0202-2121-%s.csv' % number)])
df

I want each file to appear only once in the concatenation, so the result just combines files 41, 42, 43, and so on up to 70.

Try:

import pandas as pd

# Read each file once and collect the resulting DataFrames in a list.
df_list = []

for number in range(41, 71):
    df = pd.read_csv('s3://ap/data/tm/aac0202-2121-%s.csv' % number)
    df_list.append(df)

# Concatenate a single time after the loop.
df_final = pd.concat(df_list)
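
This is faster than concatenating inside the loop: each pd.concat call copies everything accumulated so far, so the loop's total copying grows quadratically, whereas building a list and concatenating once copies the data a single time. A minimal sketch of the same idea as one expression, assuming the bucket path from the question (pandas reads s3:// URLs through the s3fs package, which must be installed):

import pandas as pd

# Generate the per-file DataFrames lazily and concatenate them once.
# ignore_index=True gives the combined frame a fresh 0..N-1 index
# instead of repeating each source file's row numbers.
df_final = pd.concat(
    (pd.read_csv('s3://ap/data/tm/aac0202-2121-%s.csv' % number)
     for number in range(41, 71)),
    ignore_index=True,
)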
