How to retrieve similarly named csv files and create dataframes with them

Question

I have multiple csv files with similar names in numeric order (nba_1, nba_2, etc.). They all have the same column names and dtypes. Instead of manually reading each one into its own dataframe (nba_1 = pd.read_csv('/nba_1.csv')), is there a way to write a for loop or something like it to read them in and name them? I think the basic framework would be something like:

for i in range(1, 6):
    nba_i = pd.read_csv('../nba_i.csv')

Beyond that, I do not know the particulars. Once I read them in, I will be performing the same actions on each of them (deleting and formatting the same columns), so I would also want to iterate through them there.

Thank you in advance for your help.

Answer 1

I think your real question is how to get all the files into a dataframe.
Use pathlib, part of the standard library, to work with your files.
- Python 3's pathlib Module: Taming the File System
Since your csv files all share the same format, as you stated in the question, it would be more efficient to combine them into a single dataframe and then clean the data all at once.
- Cleaning each dataframe separately and then combining them is less efficient.

To get a single, combined dataframe

from pathlib import Path
import pandas as pd

p = Path(r'c:\some_path_to_files')  # set your path
files = p.glob('nba*.csv')  # find your files

# It was stated, all the files are the same format, so create one dataframe
df = pd.concat([pd.read_csv(file) for file in files])

[pd.read_csv(file) for file in files] is a list comprehension, which creates one dataframe from each file.
pd.concat combines all the dataframes in that list into a single dataframe.
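Two details are worth considering on top of the snippet above: glob() does not promise any particular file order, and pd.concat keeps each file's original row index unless told otherwise. A minimal sketch of one way to handle both, assuming the files are named nba_<number>.csv (the temporary folder, the sample columns, and the "source" column are made up for illustration):

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample files so the sketch runs anywhere; point p at your own folder instead.
p = Path(tempfile.mkdtemp())
for i in range(1, 4):
    sample = pd.DataFrame({"player": [f"p{i}a", f"p{i}b"], "pts": [i, i * 10]})
    sample.to_csv(p / f"nba_{i}.csv", index=False)

# glob() makes no ordering promise, so sort by the numeric suffix (nba_2 before nba_10).
files = sorted(p.glob("nba_*.csv"), key=lambda f: int(f.stem.split("_")[1]))

# Record which file each row came from, then stack everything with a fresh 0..n-1 index.
frames = [pd.read_csv(f).assign(source=f.stem) for f in files]
df = pd.concat(frames, ignore_index=True)
```

The "source" column is optional, but it makes it possible to filter the combined dataframe back down to a single file later.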

To get separate dataframes:

Create a dict of dataframes.
Each key of the dict will be a file name without its extension (file.stem).

p = Path(r'c:\some_path_to_files')  # set your path
files = p.glob('nba*.csv')  # find your files

df_dict = dict()
for file in files:
    df_dict[file.stem] = pd.read_csv(file)
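If you would rather keep the range-based loop from the question, the missing piece is an f-string: in '../nba_i.csv' the i is just a literal letter, and nba_i on the left-hand side is a single variable that gets overwritten each pass. A rough sketch of the same idea with a dict, assuming files nba_1.csv to nba_5.csv exist (the temporary folder and the sample column here are stand-ins):

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical folder with five small sample files; use your own path instead.
data_dir = Path(tempfile.mkdtemp())
for i in range(1, 6):
    pd.DataFrame({"pts": [i]}).to_csv(data_dir / f"nba_{i}.csv", index=False)

df_dict = {}
for i in range(1, 6):
    name = f"nba_{i}"  # the f-string substitutes the current loop value for {i}
    df_dict[name] = pd.read_csv(data_dir / f"{name}.csv")
```

Unlike the glob version, this fails with FileNotFoundError if one of the numbered files is missing, which may or may not be what you want.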

Using `df_dict`:

df_dict.keys()  # to show you all the keys

df_dict[filename]  # to access a specific dataframe

# after cleaning the individual dataframes in df_dict, they can be combined
df_final = pd.concat(list(df_dict.values()))
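Since the question mentions applying the same clean-up to every dataframe, one possible way to do that with the dict is to put the steps in a function and apply it to each value. This is only a sketch: the clean function, the column names, and the stand-in df_dict below are invented for illustration and are not taken from the question.

```python
import pandas as pd

# Stand-in for a df_dict built as shown earlier; the columns are made up.
df_dict = {
    "nba_1": pd.DataFrame({"Player ": ["A"], "pts": ["10"], "notes": ["x"]}),
    "nba_2": pd.DataFrame({"Player ": ["B"], "pts": ["7"], "notes": ["y"]}),
}


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning steps: drop a column, tidy the names, fix a dtype."""
    out = df.drop(columns=["notes"])  # drop() returns a new dataframe
    out.columns = out.columns.str.strip().str.lower()
    out["pts"] = out["pts"].astype(int)
    return out


# Apply the same function to every dataframe, keeping the same keys.
cleaned = {name: clean(df) for name, df in df_dict.items()}

# Combine the cleaned dataframes into one, with a fresh row index.
df_final = pd.concat(list(cleaned.values()), ignore_index=True)
```

Keeping the steps in one function means any change to the cleaning logic is made once and applies to every file.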

Answer 2

The Dask library, which is built on top of pandas, has a method for loading multiple csv files into a single dataframe at once.

solution1
1 2019-09-11 16:53:59

solution2
0 2019-09-11 17:17:39
