Concatenating multiple .xls files with an unknown number of columns

Question

Using Python, I want to merge all .xls files in a directory into one data frame and save that as a new concatenated .xls file. The .xls files will have an unknown number of columns and no consistent headers.

I've used other suggestions on this forum and ended up with this:

import os
import pandas as pd

path = os.getcwd()
files = os.listdir(path)

files_xls = [f for f in files if f[-3:] == 'xls']

df = pd.DataFrame()

for f in files_xls:
    # I don't understand what to put in the parentheses here.
    data = pd.read_excel(f for f in files_xls)
    df = df.append(data)
    df

I'm getting these errors:

File "<ipython-input-17-bb67a423cf40>", line 14, in <module>
  data = pd.read_excel(f for f in files_xls)

File "C:\Users\xxxx\Anaconda2\lib\site-packages\pandas\io\excel.py", line 170, in read_excel
  io = ExcelFile(io, engine=engine)

File "C:\Users\xxxx\Anaconda2\lib\site-packages\pandas\io\excel.py", line 229, in __init__
  raise ValueError('Must explicitly set engine if not passing in'

ValueError: Must explicitly set engine if not passing in buffer or path for io.

Answer 1

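Try this. Pass the loop variable f to pd.read_excel (one file name per iteration, not a generator over the whole list), collect the frames in a list, and concatenate once at the end: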

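frames = []

for f in files_xls:
    data = pd.read_excel(f)  # read one workbook per iteration
    frames.append(data)      # list.append works in place and returns None, so do not reassign

# Stack all frames vertically; columns missing from a file are filled with NaN
mydf = pd.concat(frames, axis=0)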
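
Since the files have an unknown number of columns and no consistent headers, here is a rough sketch of how the concatenation behaves in that case. It is only an illustration under some assumptions: the helper names read_all_xls and concat_frames are made up for this example, header=None is my guess at how to treat files without usable headers, and the three small frames stand in for real workbooks. Reading legacy .xls files also typically needs the xlrd package installed.

```python
import glob
import os

import pandas as pd


def read_all_xls(directory):
    """Read every .xls file in `directory` (hypothetical helper).

    header=None keeps the first row as data and labels columns 0, 1, 2, ...
    which is one way to cope with files that have no consistent headers.
    """
    paths = sorted(glob.glob(os.path.join(directory, "*.xls")))
    return [pd.read_excel(p, header=None) for p in paths]


def concat_frames(frames):
    """Stack frames vertically; columns a frame lacks are filled with NaN."""
    return pd.concat(frames, axis=0, ignore_index=True, sort=False)


# Stand-ins for frames read from three files with different column counts
# (made-up data, used here so the sketch runs without any Excel files).
a = pd.DataFrame([[1, 2], [3, 4]])
b = pd.DataFrame([[5, 6, 7]])
c = pd.DataFrame([[8]])

merged = concat_frames([a, b, c])
print(merged.shape)
```

With real files, something like merged = concat_frames(read_all_xls(os.getcwd())) should give the combined frame. For saving, merged.to_excel("combined.xlsx", index=False, header=False) is probably the safer route, since recent pandas versions no longer write the old .xls format; the output file name here is just a placeholder.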

Solution 1
1 Accepted 2017-01-05 16:58:11
