
How to read a file name and add it as a new column in a CSV file using Python pandas?

As the title says, I am on Ubuntu and need to get the names of the files in a folder called Sample_csv_files. Every file name has the same format and differs only in the ID, e.g.:

agent_op_023jlafa45459a390-.csv
agent_op_3rjfigr837yw749jh-.csv
agent_op_f78jlajk7h6559a39-.csv

I need to extract those IDs and add them in a column called new_column. For example, for the file agent_op_023jlafa45459a390-.csv, new_column should contain only the ID, e.g.:

x  | y | new_column
abc|xyz| 023jlafa45459a390

for every row of the CSV file. I need to do the same for the rest of the files. I hope the description above is clear.

Can anyone help me solve this? This is what I have tried so far:

df1 = pd.read_csv('/home/user/Downloads/Sample_csv_files/agent_op_023jlafa45459a390-.csv')
df1['filename'] = "agent_op_023jlafa45459a390-.csv"
df1['filename'] = df1['filename'].map(lambda x: x.lstrip('agent-output').rstrip('-.csv'))
df2 = []
df3 = df1['filename'].append(df2)
print(df1.head(10))
df1.to_csv("/home/user/Downloads/sample_work.csv", index=False)

You can use glob.glob() to get a list of all the CSV files, extract the ID from each file name, and add it as a new column. Each file can then be updated in place as follows:

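from glob import glob
import os.path
import pandas as pd

PREFIX = 'agent_op_'
SUFFIX = '-.csv'

for filename in glob('my/source/folder/agent_op_*-.csv'):
    # str.lstrip()/rstrip() remove sets of characters, not substrings,
    # so slice the fixed prefix and suffix off the base name instead.
    run_id = os.path.basename(filename)[len(PREFIX):-len(SUFFIX)]
    df = pd.read_csv(filename)
    df['run_id'] = run_id  # the same ID is assigned to every row
    df.to_csv(filename, index=False)  # overwrites the original file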
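
The attempt in the question writes to a separate output file, so a variant that leaves the source files untouched and stacks everything into one DataFrame may also be useful. The following is only a sketch under some assumptions: the file names always follow the agent_op_<id>-.csv pattern shown above, and the helper names extract_run_id and combine_with_ids are hypothetical, not part of pandas.

```python
import re
from pathlib import Path

import pandas as pd

# Assumed naming scheme, taken from the example files: agent_op_<id>-.csv
ID_PATTERN = re.compile(r"^agent_op_(?P<id>.+)-\.csv$")


def extract_run_id(path):
    """Return the ID part of an agent_op_<id>-.csv file name (hypothetical helper)."""
    match = ID_PATTERN.match(Path(path).name)
    if match is None:
        raise ValueError(f"unexpected file name: {path}")
    return match.group("id")


def combine_with_ids(folder, column="new_column"):
    """Read every matching CSV in `folder` and stack them, tagging each row with its file's ID."""
    frames = []
    for path in sorted(Path(folder).glob("agent_op_*-.csv")):
        df = pd.read_csv(path)
        df[column] = extract_run_id(path)  # same ID for every row of this file
        frames.append(df)
    # Note: pd.concat raises ValueError if no files matched.
    return pd.concat(frames, ignore_index=True)
```

With the paths from the question, this could be used as combine_with_ids('/home/user/Downloads/Sample_csv_files').to_csv('/home/user/Downloads/sample_work.csv', index=False). Matching the whole file name with a regular expression avoids the character-set behaviour of lstrip() and rstrip(): for a made-up name such as agent_op_apple-.csv, chaining lstrip('agent_op_') and rstrip('-.csv') would return 'le' rather than 'apple'.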


 