Creating differently named data frames in a for loop - Python?

I am creating a data frame from the data I have for each year (1971-2017). I have a for loop that reads the files, but everything ends up in a single df. How can I make it create a separate df for each year? Below is what I currently have.

for years in range(1971, 2017):
    df = pd.read_csv('gene_%4.4d.txt' % years, sep='|', header=None, names=['PubMed ID', 'Title', 'Abstract', 'Affiliations', 'Pub Year', 'Pub Month', 'Pub Day', 'Journal'])

You are overwriting the df variable each time you read in a new file. To avoid this, I'd suggest initializing a list outside of the loop, and storing each new DataFrame in it:

import pandas as pd

all_dfs = []

# range() excludes its end value, so stop at 2018 to include the 2017 file
for year in range(1971, 2018):
    df = pd.read_csv('gene_%4.4d.txt' % year, sep='|', header=None,
                     names=['PubMed ID', 'Title', 'Abstract', 'Affiliations',
                            'Pub Year', 'Pub Month', 'Pub Day', 'Journal'])
    all_dfs.append(df)

Now all_dfs is a list holding one DataFrame per year, in file order, so all_dfs[0] is the 1971 data. A common next step is to combine them all into one large DataFrame, e.g. pd.concat(all_dfs).
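
If you would rather look each year up by name than by list position, a dict keyed by year is one option. The following is only a sketch: the helper names load_yearly_frames and combine_frames, the COLUMNS constant, the directory parameter, and the 'File Year' / 'Row' index labels are made up for illustration and do not come from the question.

```python
from pathlib import Path

import pandas as pd

# Column names copied from the question.
COLUMNS = ['PubMed ID', 'Title', 'Abstract', 'Affiliations',
           'Pub Year', 'Pub Month', 'Pub Day', 'Journal']


def load_yearly_frames(first_year, last_year, directory='.'):
    """Return {year: DataFrame}, reading gene_<year>.txt for each year.

    last_year is inclusive. The function name and the directory
    parameter are hypothetical, added for this example.
    """
    frames = {}
    for year in range(first_year, last_year + 1):
        path = Path(directory) / ('gene_%4.4d.txt' % year)
        frames[year] = pd.read_csv(path, sep='|', header=None, names=COLUMNS)
    return frames


def combine_frames(frames):
    """Stack the per-year frames into one DataFrame.

    Passing a dict to pd.concat uses its keys as the outer index level,
    so each row stays tagged with the year of the file it came from.
    """
    return pd.concat(frames, names=['File Year', 'Row'])
```

With this, frames = load_yearly_frames(1971, 2017) gives you frames[1995] for the 1995 data, and combine_frames(frames).loc[1995] should select the same rows from the combined DataFrame.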
