I want to create a DataFrame for each year of data I have (1971-2017). My for loop reads every file, but I end up with a single DataFrame. How can I make it create a separate DataFrame for each year? Below is what I currently have.
for years in range(1971,2017):
    df = pd.read_csv('gene_%4.4d.txt'%years, sep='|', header=None, names=['PubMed ID','Title','Abstract','Affiliations','Pub Year','Pub Month','Pub Day','Journal'])
You are overwriting the df variable each time you read in a new file. To avoid this, I'd suggest initializing a list outside of the loop and storing each new DataFrame in it:
import pandas as pd

all_dfs = []
# range() excludes its end value; 2018 is used here assuming a gene_2017.txt file exists
for years in range(1971, 2018):
    df = pd.read_csv('gene_%4.4d.txt' % years, sep='|', header=None, names=['PubMed ID', 'Title', 'Abstract', 'Affiliations', 'Pub Year', 'Pub Month', 'Pub Day', 'Journal'])
    all_dfs.append(df)
Now all_dfs is a list of all the DataFrames. (A common thing to do next is to combine them all into one large DataFrame, e.g. pd.concat(all_dfs).)
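If you want to look up a DataFrame by its year rather than by list position, one option is a dictionary keyed by year. The following is only a sketch: the helper name load_by_year, its parameters, and the COLUMNS constant are made up for illustration, and it assumes the files sit in the current working directory and follow the gene_YYYY.txt naming from the question.

```python
import pandas as pd

# Column names taken from the question.
COLUMNS = ['PubMed ID', 'Title', 'Abstract', 'Affiliations',
           'Pub Year', 'Pub Month', 'Pub Day', 'Journal']


def load_by_year(first_year, last_year, pattern='gene_%4.4d.txt'):
    """Hypothetical helper: return a dict mapping each year to its own DataFrame."""
    dfs_by_year = {}
    # range() excludes its end value, so add 1 to include last_year itself.
    for year in range(first_year, last_year + 1):
        dfs_by_year[year] = pd.read_csv(pattern % year, sep='|',
                                        header=None, names=COLUMNS)
    return dfs_by_year


# Example usage (requires the gene_1971.txt ... gene_2017.txt files to exist):
# dfs_by_year = load_by_year(1971, 2017)
# df_1999 = dfs_by_year[1999]
```

A dictionary avoids having to remember that index 0 means 1971. If you later want one large DataFrame after all, pd.concat also accepts a dictionary and uses its keys as the outer level of the resulting index.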