
Concatenating multiple .xls files with unknown number of columns

Using Python, I want to merge all .xls files in a directory into one data frame and save that as a new concatenated .xls file. The .xls files will have an unknown number of columns and no consistent headers.

I've used other suggestions on this forum and ended up with this:

import os
import pandas as pd

path = os.getcwd()
files = os.listdir(path)

files_xls = [f for f in files if f[-3:] == 'xls']

df = pd.DataFrame()

for f in files_xls:
    data = pd.read_excel(f for f in files_xls)  # I don't understand what to add in the parentheses here.
    df = df.append(data)
    df

I'm getting these errors:

File "<ipython-input-17-bb67a423cf40>", line 14, in <module>
  data = pd.read_excel(f for f in files_xls)

File "C:\Users\xxxx\Anaconda2\lib\site-packages\pandas\io\excel.py", line 170, in read_excel
  io = ExcelFile(io, engine=engine)

File "C:\Users\xxxx\Anaconda2\lib\site-packages\pandas\io\excel.py", line 229, in __init__
  raise ValueError('Must explicitly set engine if not passing in'

ValueError: Must explicitly set engine if not passing in buffer or path for io.

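Try this. pd.read_excel expects a single path (or buffer), not a generator over the whole file list, so pass it one file name per loop iteration and concatenate everything at the end: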

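frames = []

for f in files_xls:
    data = pd.read_excel(f)  # one file name per call
    # list.append works in place and returns None, so do not reassign the list
    frames.append(data)

# Stack all frames vertically; columns absent from a file are filled with NaN
mydf = pd.concat(frames, axis=0)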
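
For the full task in the question (read every .xls in a directory, stack them, save the result), something along these lines might work. It is only a sketch under a few assumptions that are not in the original post: the helper names `read_all_xls`, `stack_frames` and `merge_directory` are made up for illustration, `header=None` is used so that columns are matched by position rather than by the inconsistent header text, and the output is written as .xlsx because recent pandas versions no longer ship a writer for the old .xls format. Reading .xls files also requires the xlrd package to be installed.

```python
import glob
import os

import pandas as pd


def read_all_xls(directory, header=None):
    """Read every .xls file in `directory` into a list of DataFrames.

    header=None is an assumption: every row (including any header row) is
    treated as data, so columns get positional labels 0, 1, 2, ...
    """
    # "*.xls" does not match ".xlsx" files, same as the f[-3:] == 'xls' check
    paths = sorted(glob.glob(os.path.join(directory, "*.xls")))
    return [pd.read_excel(p, header=header) for p in paths]


def stack_frames(frames):
    """Stack frames vertically; columns a frame lacks are filled with NaN."""
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, axis=0, ignore_index=True)


def merge_directory(directory, out_path):
    """Hypothetical end-to-end helper: read, stack, and save as one file."""
    merged = stack_frames(read_all_xls(directory))
    # Writing .xlsx needs an Excel writer such as openpyxl to be installed
    merged.to_excel(out_path, index=False, header=False)
    return merged
```

It could be called as `merge_directory(os.getcwd(), "merged.xlsx")`. Because the columns are positional, files with different widths simply produce a frame as wide as the widest file, with NaN in the cells the narrower files do not have.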


 