
Read multiple .csv files from a folder when part of the file name varies

I have a folder that contains a variable number of files, and each file has a variable string in the name. For example:

my_file V1.csv
my_file V2.csv
my_file something_else.csv

I would need to:

  1. Load all the files whose names start with "my_file"
  2. Concatenate all of them into a single dataframe

Right now I am doing it with a separate pd.read_csv call for each file, and then concatenating the results.

This is not optimal as every time the files in the source folder change, I need to modify the script.

Is it possible to automate this process, so that it works even if the source files change?

You can combine glob, pandas.concat and pandas.read_csv fairly easily. Assuming the CSV files are in the same folder as your script:

import glob

import pandas as pd

df = pd.concat([pd.read_csv(f) for f in glob.glob('my_file*.csv')])
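
Alternatively, you can loop over the folder with os.listdir and filter the file names yourself:

import os
import pandas as pd

directory = "."  # assumed: the folder that contains the CSV files
frames = []
for filename in os.listdir(directory):
    if filename.startswith("my_file") and filename.endswith(".csv"):
        frames.append(pd.read_csv(os.path.join(directory, filename)))
df = pd.concat(frames, ignore_index=True)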
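
If the files are not in the same folder as the script, the same idea can be wrapped in a small helper. This is only a sketch: the function name load_matching_csvs, its folder and prefix parameters, and the extra source_file column are illustrative choices that are not part of the question. It uses pathlib.Path.glob instead of glob.glob, sorts the matches so the row order is reproducible, and raises a clearer error when nothing matches (pd.concat raises a ValueError if given an empty list).

```python
from pathlib import Path

import pandas as pd


def load_matching_csvs(folder, prefix="my_file"):
    """Read every CSV in `folder` whose name starts with `prefix` into one DataFrame.

    Hypothetical helper: the name and parameters are assumptions for illustration.
    """
    # sorted() makes the row order reproducible; glob order is otherwise arbitrary.
    paths = sorted(Path(folder).glob(f"{prefix}*.csv"))
    if not paths:
        # pd.concat raises ValueError on an empty list, so fail with a clearer message.
        raise FileNotFoundError(f"no files matching {prefix}*.csv in {folder}")
    # Tag each row with its source file so rows can be traced after concatenation.
    frames = [pd.read_csv(p).assign(source_file=p.name) for p in paths]
    # ignore_index=True yields one continuous index instead of repeating each file's own.
    return pd.concat(frames, ignore_index=True)
```

Calling df = load_matching_csvs("path/to/folder") (with your own folder path) then picks up whatever matching files are present, so the script does not need to change when files are added or removed.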
