
DataFrame iterrows() and .to_csv: Writing row by row

I'm using the following script to:

  • Apply a function to a column in each row of a DataFrame
  • Write the returns from that function into two new columns of a DataFrame
  • Continuously write the DataFrame into a *.csv

I'd like to learn whether there's a better way to run the following computation:

# df is a DataFrame with 500 rows and 20 columns

for index, row in df.iterrows():
    df.loc[index, 'words'], df.loc[index, 'count'] = transcribe(df.loc[index, 'text'])
    df.to_csv('out.csv', encoding='utf-8', index=False)

Currently, on every iteration (for each row) the script rewrites the full df to the *.csv, including the "words" and "count" values computed so far. I'd like to know whether it is also possible to write line by line instead, i.e. to output only the completed rows to the csv.

Thanks!

I don't see why you would want to do this row by row instead of writing the whole DataFrame at the end, but here is a solution to your question: write slices of the DataFrame (i.e. the current row) in append mode, adding the header for the first row only:

is_first_row = True
for index, row in df.iterrows():
    df.loc[index, 'words'], df.loc[index, 'count'] = transcribe(df.loc[index, 'text'])
    df.loc[index:index].to_csv('out.csv', encoding='utf-8', index=False, mode='a', header=is_first_row)
    is_first_row = False
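
For comparison, when interruption is not a concern, the write-once alternative mentioned above might look like the sketch below. The transcribe function here is only a hypothetical stand-in (it splits on whitespace and counts the words), the two-row df is sample data, and the temporary output path is an assumption made so the snippet runs anywhere:

```python
import os
import tempfile

import pandas as pd


def transcribe(text):
    # Hypothetical stand-in for the real function: returns (words, count).
    words = text.split()
    return " ".join(words), len(words)


# Sample data (assumption); the real df has 500 rows and 20 columns.
df = pd.DataFrame({"text": ["hello world", "one two three"]})

# Compute every (words, count) pair first, then attach both columns.
results = [transcribe(t) for t in df["text"]]
words, counts = zip(*results)
df["words"] = list(words)
df["count"] = list(counts)

# Write the finished DataFrame a single time.
out_path = os.path.join(tempfile.mkdtemp(), "out.csv")
df.to_csv(out_path, encoding="utf-8", index=False)
```

This avoids the per-row .loc assignments and repeated file writes, at the cost of losing all results if the script stops before the final to_csv call.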


Update, based on a comment that the script could be interrupted:
In this case you may want to decide whether to write the header by checking whether the file is still empty, i.e. whether the position in the file opened in append mode is 0:

# newline='' is what the pandas docs recommend for file objects passed to to_csv
with open('out.csv', encoding='utf-8', mode='a', newline='') as f:
    for index, row in df.iterrows():
        df.loc[index, 'words'], df.loc[index, 'count'] = transcribe(df.loc[index, 'text'])
        df.loc[index:index].to_csv(f, index=False, header=f.tell()==0)
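
Note that a restarted script would still begin again at the first row and append duplicates. One possible way around that, sketched below under several assumptions, is to count the rows already on disk and skip them. The process helper, its stop_after parameter (used only to simulate an interruption), the stand-in transcribe and the temporary path are all hypothetical, and the sketch assumes the row order of df is unchanged between runs and that the last line on disk was written completely:

```python
import os
import tempfile

import pandas as pd


def transcribe(text):
    # Hypothetical stand-in for the real function: returns (words, count).
    words = text.split()
    return " ".join(words), len(words)


def process(df, path, stop_after=None):
    # Count data rows already on disk so a restart can skip them.
    done = 0
    if os.path.exists(path) and os.path.getsize(path) > 0:
        done = len(pd.read_csv(path, encoding="utf-8"))
    with open(path, mode="a", encoding="utf-8", newline="") as f:
        for n, index in enumerate(df.index[done:]):
            if stop_after is not None and n >= stop_after:
                break  # simulate an interruption
            words, count = transcribe(df.loc[index, "text"])
            # Build a one-row frame with the two new columns filled in.
            row = df.loc[[index]].assign(words=words, count=count)
            # Write the header only while the file is still empty.
            row.to_csv(f, index=False, header=f.tell() == 0)
            f.flush()  # push each finished row to the OS right away


df = pd.DataFrame({"text": ["a b", "c d e", "f"]})
path = os.path.join(tempfile.mkdtemp(), "out.csv")
process(df, path, stop_after=2)  # first run stops after two rows
process(df, path)                # second run resumes at the third row
result = pd.read_csv(path, encoding="utf-8")
```

Unlike the loop above, this sketch does not write the new values back into df; it only builds the row that goes to the file.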


 