Saving footer (last few rows) of csv as separate file using Pandas in Python

I have a csv file with extra rows at the very end (the last 9 rows) that are important but do not fit the schema at all and need to be processed differently. They just contain the number of clicks for different sites. I want to split these last few rows off from the original csv and save them as a separate file.

So far, I can get the main rows out using pandas by skipping the footer. If the number of rows were consistent, I could save the footer the same way by passing a fixed range to skiprows (for example, the first 2000 rows), but the row count varies from file to file.

The code to save all the main rows is as follows:

import os
import pandas as pd

# skipfooter is only supported by the Python parser engine; naming it avoids the fallback warning
reader = pd.read_csv(os.path.join(DATA_DIR, file), encoding='utf8', header=0, skipfooter=9, index_col=0, engine='python')
trimmed_file_name = 'trimmed_{}'.format(file)
# os.path.join inserts the correct separator, so no manual backslash handling is needed
full_path = os.path.join(DATA_DIR, trimmed_file_name)
print(full_path)
# mode='a' appends if the file already exists; use mode='w' to overwrite instead
reader.to_csv(full_path, mode='a')

So how do I just get out those last 9 rows without 'skiprows'? Any ideas? The footer is consistently the last 9 rows if that helps.

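After reading in the first dataframe, we know how many regular rows there are: len(reader). That count does not include the header line, so skip one extra line and read what remains without a header:

footer = pd.read_csv(os.path.join(DATA_DIR, file), encoding='utf8', skiprows=len(reader) + 1, header=None)

This assumes the file has no blank lines or multi-line quoted fields before the footer (skiprows counts physical lines, while len(reader) counts parsed rows) and that no footer row has more fields than the first footer row.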
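
Since the footer does not fit the schema anyway, another option is to split the file on raw lines before pandas is involved at all. The following is only a sketch under a few assumptions: split_footer and the file names in the usage note are hypothetical, the file is small enough to read into memory, the footer is exactly the last n_footer physical lines (so no trailing blank lines), and no quoted field contains an embedded newline.

```python
def split_footer(src_path, body_path, footer_path, n_footer=9, encoding="utf8"):
    """Copy the last n_footer lines of src_path to footer_path and the rest to body_path.

    Returns the number of lines written to body_path (header line included).
    """
    # newline="" leaves the original line endings untouched on read and write.
    with open(src_path, encoding=encoding, newline="") as src:
        lines = src.readlines()

    # Guard against files shorter than the footer: everything becomes footer.
    split_at = max(len(lines) - n_footer, 0)

    with open(body_path, "w", encoding=encoding, newline="") as body:
        body.writelines(lines[:split_at])
    with open(footer_path, "w", encoding=encoding, newline="") as footer:
        footer.writelines(lines[split_at:])

    return split_at
```

It could be called as split_footer('data.csv', 'trimmed_data.csv', 'footer_data.csv'), after which the trimmed file can be read with a plain pd.read_csv call, with no skipfooter and therefore no need for the Python engine.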
