How to read a file name and append the name to a new column in a csv file using python pandas?

Question

As the title says, I am on Ubuntu and need to extract an ID from the name of each file in a folder called Sample_csv_files. The file names all follow the same format and differ only in the ID, e.g.

agent_op_023jlafa45459a390-.csv
agent_op_3rjfigr837yw749jh-.csv
agent_op_f78jlajk7h6559a39-.csv

I need to extract those IDs and add them in a new column called new_column. For example, for the file agent_op_023jlafa45459a390-.csv, new_column should be populated with the ID alone, e.g.

x  | y | new_column
abc|xyz| 023jlafa45459a390

for every row of the CSV file. I need to do the same for the rest of the files. I hope the description above is clear.

Can anyone help me solve this? Here is my attempt so far:

df1 = pd.read_csv('/home/user/Downloads/Sample_csv_files/agent_op_023jlafa45459a390-.csv')
df1['filename'] = "agent_op_023jlafa45459a390-.csv"
df1['filename'] = df1['filename'].map(lambda x: x.lstrip('agent-output').rstrip('-.csv'))
df2 = []
df3 = df1['filename'].append(df2)
print(df1.head(10))
df1.to_csv("/home/user/Downloads/sample_work.csv", index=False)

Answer 1

You can use glob.glob() to get a list of all the CSV files, then extract the ID from each file name and add it as a new column. Each file can then be updated in place as follows:

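from glob import glob
import os.path
import pandas as pd

PREFIX = 'agent_op_'
SUFFIX = '-.csv'

for filename in glob('my/source/folder/agent_op_*-.csv'):
    # Slice off the fixed prefix and suffix. lstrip()/rstrip() remove a set of
    # characters rather than a substring, so they could eat into the ID itself.
    run_id = os.path.basename(filename)[len(PREFIX):-len(SUFFIX)]
    df = pd.read_csv(filename)
    df['run_id'] = run_id             # same ID on every row
    df.to_csv(filename, index=False)  # overwrites the original file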
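
One thing to watch when extracting the ID: str.lstrip() and str.rstrip() treat their argument as a set of characters, not as a prefix or suffix, so an ID that happens to start or end with one of those characters gets trimmed too. The sketch below shows the effect and one possible alternative using a regular expression. The helper name extract_id and the sample file name agent_op_a1b2c-.csv are made up for illustration, and the pattern assumes every file is named agent_op_<id>-.csv as in the question.

```python
import re
from pathlib import Path

# Assumed naming scheme, taken from the example files: agent_op_<id>-.csv
ID_PATTERN = re.compile(r"^agent_op_(?P<id>.+)-\.csv$")


def extract_id(path):
    """Return the ID embedded in the file name, or None if the name does not match."""
    match = ID_PATTERN.match(Path(path).name)
    return match.group("id") if match else None


# Hypothetical file name whose ID starts with 'a' and ends with 'c'.
name = "agent_op_a1b2c-.csv"

# lstrip/rstrip strip character sets, so they cut into the ID itself.
stripped = name.lstrip("agent_op_").rstrip("-.csv")
print(stripped)          # 1b2

# The regular expression only removes the literal prefix and suffix.
print(extract_id(name))  # a1b2c
```

The returned string can be assigned directly to a column, for example df['new_column'] = extract_id(filename). Returning None for a non-matching name lets the caller skip unexpected files instead of writing a wrong ID.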

solution1
1 ACCEPTED 2021-02-02 17:43:23
