How to append multiple CSV files and add an additional column indicating the file name in Python?

Question

I have over 20 CSV files in a single folder. All files have the same structure; they just represent different days.

Example:

Day01.csv

Day02.csv

Day03.csv

Day04.csv (and so on...)

The files contain just two numeric columns: x and y. I would like to append all of these CSV files into one large file and add a column for the file name (day). I adapted similar examples to produce the following code, but it puts each y in a separate column (Y1, Y2, Y3, Y4, and so on). I would simply like the appended file to have three columns: x, y, file name. How can I modify the code to append properly?

I have tried the code from this example: Read multiple csv files and Add filename as new column in pandas

import pandas as pd
import os
os.chdir('C:....path to my folder')
files = os.listdir()
df = pd.concat([pd.read_csv(fp).assign(New=os.path.basename(fp)) for fp in files])

However, this code does not append all Y values under one column (all other aspects seem to work). Can someone help with the code so that all Y values are under a single column?

Answer 1

The following should work: it creates the filename column before appending each dataframe to the list.

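import os
import pandas as pd

file_list = []
for file in os.listdir():
    # Skip anything in the folder that is not a CSV file.
    if file.endswith('.csv'):
        # sep=";" assumes semicolon-delimited files; adjust to match your data.
        df = pd.read_csv(file, sep=";")
        df['filename'] = file
        file_list.append(df)

# Stack all frames row-wise and renumber the index from 0.
all_days = pd.concat(file_list, ignore_index=True)
# index=False keeps the output to the three columns x, y, filename.
all_days.to_csv("all.txt", index=False)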
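One possible reason the question's code spreads y across Y1, Y2, ... is that pd.concat aligns frames by column name, so files whose headers differ land in different columns. The sketch below forces the same column names on every file. It assumes each file has exactly two columns, one header row, and a comma delimiter; the function name combine_days and the column name day are made up for illustration.

```python
from pathlib import Path

import pandas as pd


def combine_days(folder, pattern="Day*.csv", sep=","):
    """Stack every matching CSV into one frame with columns x, y, day.

    Assumes each file has exactly two columns plus one header row;
    the header is replaced so all files share the same column names.
    """
    frames = []
    for path in sorted(Path(folder).glob(pattern)):
        # header=0 discards the file's own header; names= supplies ours.
        df = pd.read_csv(path, sep=sep, header=0, names=["x", "y"])
        df["day"] = path.stem  # file name without extension, e.g. "Day01"
        frames.append(df)
    # pd.concat raises ValueError if no file matched the pattern.
    return pd.concat(frames, ignore_index=True)
```

Calling combine_days(".") in the data folder and then .to_csv("all.txt", index=False) on the result would write the combined file. If the files use semicolons, as Answer 1 assumes, pass sep=";".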

Answer 2

Python is great at these simple tasks, almost too good to be true…

# Stand-in for real file contents: three tab-separated rows starting at n.
fake_files = lambda n: '\n'.join('%d\t%d' % (i, i + 1) for i in range(n, n + 3))

file_name = 'fake_me%s.csv'

with open('my_new.csv', 'wt') as new:
    for number in range(3):  # with real files, loop over os.listdir() instead
        # with open(number) as to_add:
        #     rows = to_add.readlines()
        rows_fake = fake_files(number * 2).split('\n')
        # Prefix every row with the name of the file it came from.
        adjusted_rows = [file_name % number + '\t' + row for row in rows_fake]
        new.write('\n'.join(adjusted_rows) + '\n')

With adjustments to your specific I/O and naming, this is all you need. You can copy the code, run it, and study how it works.
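With real files in place of the fake rows, the same idea might look like the following sketch using the standard csv module. It assumes comma-delimited files with one header row each; the function name append_with_filename and the output column name file are hypothetical.

```python
import csv
from pathlib import Path


def append_with_filename(folder, out_path, pattern="Day*.csv"):
    """Write all rows of the matching files to out_path, adding the file name."""
    out_path = Path(out_path)
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["x", "y", "file"])
        for path in sorted(Path(folder).glob(pattern)):
            if path.resolve() == out_path.resolve():
                continue  # never read the output file back in
            with open(path, newline="") as src:
                reader = csv.reader(src)
                next(reader, None)  # skip this file's header row
                for row in reader:
                    if row:  # ignore blank lines
                        writer.writerow(row + [path.name])
```

This version never loads a whole file into memory, which may matter if the daily files are large.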

Answer 1: score 6, accepted, 2019-01-13 21:40:57
Answer 2: score 0, 2019-01-13 21:14:46
