How to read a file name and append the name to a new column in a CSV file using Python pandas?
As the title says, I am on Ubuntu. I need to get the names of the files in a folder called Sample_csv_files. All of the files share the same name format except for the ID, e.g.:
agent_op_023jlafa45459a390-.csv
agent_op_3rjfigr837yw749jh-.csv
agent_op_f78jlajk7h6559a39-.csv
I need to extract those IDs and add them in a new column, new_column. For example, for the file agent_op_023jlafa45459a390-.csv, new_column should be populated with the ID alone, e.g.:
x   | y   | new_column
abc | xyz | 023jlafa45459a390
This applies to every row of the CSV file, and I need to do the same for the rest of the files. I hope the description above is clear. Can anyone help me solve this?
Here is what I have tried so far:
df1 = pd.read_csv('/home/user/Downloads/Sample_csv_files/agent_op_023jlafa45459a390-.csv')
df1['filename'] = "agent_op_023jlafa45459a390-.csv"
df1['filename'] = df1['filename'].map(lambda x: x.lstrip('agent-output').rstrip('-.csv'))
df2 = []
df3 = df1['filename'].append(df2)
print(df1.head(10))
df1.to_csv("/home/user/Downloads/sample_work.csv", index=False)
You can use glob.glob() to get a list of all of the CSV files, then extract the ID from each filename and add it as a new column. Each file can then be updated as follows:
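from glob import glob
import os.path

import pandas as pd

PREFIX = 'agent_op_'
SUFFIX = '-.csv'

for filename in glob('my/source/folder/agent_op_*-.csv'):
    # str.lstrip()/rstrip() remove character sets, not substrings, so slice instead
    name = os.path.basename(filename)
    run_id = name[len(PREFIX):-len(SUFFIX)]
    df = pd.read_csv(filename)
    df['run_id'] = run_id  # the same ID is written to every row
    df.to_csv(filename, index=False)  # overwrites the original file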
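One caveat on trimming the prefix and suffix: str.lstrip() and str.rstrip() treat their argument as a set of characters, not as a literal prefix or suffix, so an ID that begins or ends with one of those characters would be trimmed as well. Below is a minimal sketch of a stricter extraction using a regular expression. The helper name extract_run_id and the sample name agent_op_abc123-.csv are hypothetical and only serve to illustrate the difference.

```python
import re

# Hypothetical helper: pull the ID out of names shaped like "agent_op_<id>-.csv".
_NAME_RE = re.compile(r"^agent_op_(?P<run_id>.+)-\.csv$")


def extract_run_id(basename: str) -> str:
    """Return the ID part of the file name, or raise if the name does not match."""
    match = _NAME_RE.match(basename)
    if match is None:
        raise ValueError(f"unexpected file name: {basename!r}")
    return match.group("run_id")


# Made-up file name whose ID starts with 'a', a character that also occurs in "agent_op_".
sample = "agent_op_abc123-.csv"

naive = sample.lstrip("agent_op_").rstrip("-.csv")  # character-set stripping
strict = extract_run_id(sample)                     # literal prefix/suffix match

print(naive)   # bc123
print(strict)  # abc123
```

In the loop, this would be called as extract_run_id(os.path.basename(filename)). On Python 3.9 or later, str.removeprefix() and str.removesuffix() are another way to remove a literal prefix and suffix.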