Merging multiple CSV files with Python and Pandas

Question

I have the following code:

import glob
import pandas as pd

# Backslashes are escaped: "\f" in a plain string literal is a form feed, not a path separator
allFiles = glob.glob("C:\\*.csv")
frame = pd.DataFrame()
list_ = []
for file_ in allFiles:
    print(file_)
    df = pd.read_csv(file_, index_col=None, header=0)
    list_.append(df)
    frame = pd.concat(list_, sort=False)
print(list_)
frame.to_csv("C:\\f.csv")

This combines multiple CSV files into a single CSV.

However, it also adds a row-number column.

Input:

a.csv

a   b   c   d
1   2   3   4

b.csv

a   b   c   d
551 55  55  55
551 55  55  55

Result: f.csv

    a   b   c   d
0   1   2   3   4
0   551 55  55  55
1   551 55  55  55

How can I modify the code so that the row numbers do not appear in the output file?

Answer 1

Change frame.to_csv("C:\\f.csv") to frame.to_csv("C:\\f.csv", index=False). By default, to_csv writes the DataFrame index as the first column; index=False suppresses it.

See: pandas.DataFrame.to_csv
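To see the difference without touching the disk, here is a minimal sketch. The io.StringIO objects stand in for a.csv and b.csv from the question, and the variable names are illustrative assumptions, not part of the original code.

```python
import io
import pandas as pd

# In-memory stand-ins for a.csv and b.csv from the question (illustrative only).
a_csv = io.StringIO("a,b,c,d\n1,2,3,4\n")
b_csv = io.StringIO("a,b,c,d\n551,55,55,55\n551,55,55,55\n")

frames = [pd.read_csv(f) for f in (a_csv, b_csv)]
frame = pd.concat(frames, sort=False)

# With no path argument, to_csv returns the CSV text as a string.
with_index = frame.to_csv()                # index written as an unnamed first column
without_index = frame.to_csv(index=False)  # only the data columns are written

print(with_index)
print(without_index)
```

Passing ignore_index=True to pd.concat would renumber the rows 0, 1, 2, but the index column would still be written unless index=False is also passed to to_csv.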

Answer 2

You don't have to use pandas for this simple task. pandas parses each file and converts the data to NumPy structures, which you don't need here. You can do it with plain text-file manipulation:

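import glob

# Backslashes are escaped: "\f" in a plain string literal is a form feed, not a path separator
allFiles = glob.glob("C:\\*.csv")
first = True
with open("C:\\f.csv", "w") as fw:
    for filename in allFiles:
        print(filename)
        with open(filename, "r") as f:
            if not first:
                f.readline()  # skip the header of every file after the first
            first = False
            fw.writelines(f)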
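A self-contained variant of the same idea, offered as a sketch rather than a definitive implementation. The function name merge_csv_text and the temporary-directory setup are hypothetical, and the code assumes every input file starts with the same single-line header. Two edge cases are handled here that the snippet above leaves open: the output file is skipped if the glob pattern matches it (for example on a second run), and a newline is added when an input file does not end with one.

```python
import glob
import os
import tempfile


def merge_csv_text(pattern, out_path):
    """Concatenate the files matching pattern into out_path, keeping one header."""
    out_abs = os.path.abspath(out_path)
    # Sort for a deterministic order; never read the output file as an input.
    inputs = sorted(p for p in glob.glob(pattern) if os.path.abspath(p) != out_abs)
    with open(out_path, "w", newline="") as fw:
        for i, filename in enumerate(inputs):
            last = ""
            with open(filename, "r", newline="") as f:
                if i > 0:
                    f.readline()  # skip the header of every file after the first
                for line in f:
                    fw.write(line)
                    last = line
            if last and not last.endswith("\n"):
                fw.write("\n")  # keep the next file's rows on their own line
    return inputs


# Demo with throwaway files shaped like the question's a.csv and b.csv.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "a.csv"), "w", newline="") as f:
        f.write("a,b,c,d\n1,2,3,4")  # no trailing newline, on purpose
    with open(os.path.join(tmp, "b.csv"), "w", newline="") as f:
        f.write("a,b,c,d\n551,55,55,55\n551,55,55,55\n")
    out = os.path.join(tmp, "f.csv")
    merge_csv_text(os.path.join(tmp, "*.csv"), out)
    with open(out, newline="") as f:
        merged = f.read()

print(merged)
```

Because the files are copied line by line as text, nothing is parsed or type-converted, which is why no index column can appear in the output.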


Answer 1
2 2018-06-27 13:38:12

Answer 2
1 Accepted 2018-06-27 13:41:13
