简体   繁体   中英

Exporting Pandas output for multiple CSV files

I have many CSV files under subdirectories in one folder. They all contain tweets and other metadata. I am interested in removing most of these metadata and keeping the tweets themselves and their time. I used glob to read the files, and the removing part seems to be working fine. However, I am not sure how to save the output so that all files are saved and with their original file name.

import pandas as pd
import glob
path = r'D:\tweets'
myfiles= glob.glob(r'D:\tweets\**\*.csv', recursive=True)
for f in myfiles:
    df = pd.read_csv(f)
df = df.drop(["name", "id","conversation_id","created_at","date"], axis=1)
df = df[df["language"].str.contains("bn|ca|ckbu|id||zh")==False]
df.to_csv("output_filename.csv", index=False, encoding='utf8')

If you do it this way, it will overwrite the same file:

for f in myfiles:
    df = pd.read_csv(f)
    df = df.drop(["name", "id","conversation_id","created_at","date"], axis=1)
    df = df[df["language"].str.contains("bn|ca|ckbu|id||zh")==False]
    df.to_csv(f, index=False, encoding='utf8')

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM