Add the file name as a column when combining multiple .csv files into a single dataframe

Question

I have 50 .csv files with over 188k rows combined, and I need to add the file name to each row so that I can track which file it came from. I have included the code I am using below, which combines the files into a single df.

df = pd.DataFrame()
for file in files:
    if file.endswith('.csv'):
        df=df.append(pd.read_csv(file), ignore_index=True)
df.head()

Answer 1

You're almost there. Instead of appending the result of read_csv() directly, store it in a variable and add a new column containing the file name:

for file in files:
    if file.endswith('.csv'):
        df_new = pd.read_csv(file)
        df_new['from_file'] = file
        df = df.append(df_new, ignore_index=True)
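
Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on a recent pandas the loop above raises an AttributeError. A minimal sketch of the same idea using pd.concat follows; the helper name combine_csvs and the two throwaway demo files are assumptions made for illustration, not part of the original question.

```python
import os
import tempfile

import pandas as pd


def combine_csvs(files):
    """Read every .csv path in `files` and tag each row with its source file."""
    frames = []
    for file in files:
        if file.endswith('.csv'):
            df_new = pd.read_csv(file)
            df_new['from_file'] = file  # same column name as in the answer
            frames.append(df_new)
    # A single concat at the end avoids re-copying the growing frame each iteration.
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()


# Small demo with two throwaway files (hypothetical names and data).
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, rows in [('a.csv', 'x\n1\n2\n'), ('b.csv', 'x\n3\n')]:
        path = os.path.join(tmp, name)
        with open(path, 'w') as fh:
            fh.write(rows)
        paths.append(path)
    df = combine_csvs(paths)
```

Collecting the frames in a list and concatenating once is also likely to be faster than appending inside the loop when there are 50 files.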

Also, if your file variable is actually the full path to the file, you can use os.path.basename(file), which returns only the file name, without the path.
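
As a quick illustration of what os.path.basename does, here is a small sketch; the path is made up for the example, and os.path.splitext is an optional extra step in case you also want to drop the extension.

```python
import os

# Hypothetical path, for illustration only.
file = os.path.join('data', 'exports', 'sales_01.csv')

file_name = os.path.basename(file)       # 'sales_01.csv'
stem, ext = os.path.splitext(file_name)  # ('sales_01', '.csv')
```

In the loop, you would then assign os.path.basename(file) to the from_file column instead of file.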

Solution 1
0 2022-01-27 23:48:01