Concatenating multiple .xls files with an unknown number of columns
Using Python, I want to merge all .xls files in a directory into one data frame and save that as a new concatenated .xls file. The .xls files will have an unknown number of columns and no consistent headers.
I've used other suggestions on this forum and ended up with this:
import os
import pandas as pd
path = os.getcwd()
files = os.listdir(path)
files_xls = [f for f in files if f[-3:] == 'xls']
df = pd.DataFrame()
for f in files_xls:
    data = pd.read_excel(f for f in files_xls)  # I don't understand what to add
                                                # in the parentheses here.
    df = df.append(data)
df
I'm getting these errors:
  File "<ipython-input-17-bb67a423cf40>", line 14, in <module>
    data = pd.read_excel(f for f in files_xls)
  File "C:\Users\xxxx\Anaconda2\lib\site-packages\pandas\io\excel.py", line 170, in read_excel
    io = ExcelFile(io, engine=engine)
  File "C:\Users\xxxx\Anaconda2\lib\site-packages\pandas\io\excel.py", line 229, in __init__
    raise ValueError('Must explicitly set engine if not passing in'
ValueError: Must explicitly set engine if not passing in buffer or path for io.
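Try this: pass the loop variable f (a single file name) to pd.read_excel, collect the frames in a list, and concatenate them once at the end.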
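df = []  # a plain list that collects one DataFrame per file
for f in files_xls:
    data = pd.read_excel(f)  # f is a single file name, not a generator
    df.append(data)  # list.append works in place and returns None, so do not reassign df
mydf = pd.concat(df, axis=0)  # stack all frames vertically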
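For completeness, here is a minimal end-to-end sketch of the same idea that also covers the "no consistent headers" part and saves the result. The function names stack_frames and merge_xls and the output name merged.xlsx are hypothetical, chosen only for illustration. It assumes that reading with header=None is acceptable, so every row is treated as data and columns are matched by position, and that xlrd (for reading .xls) and openpyxl (for writing .xlsx) are installed. It writes .xlsx rather than .xls because, as far as I know, recent pandas releases no longer write the legacy .xls format; DataFrame.append was also removed in pandas 2.0, which is another reason to prefer pd.concat.

```python
import glob
import os

import pandas as pd


def stack_frames(frames):
    """Stack DataFrames vertically; narrower frames are padded with NaN."""
    if not frames:
        # pd.concat raises on an empty list, so return an empty frame instead
        return pd.DataFrame()
    return pd.concat(frames, axis=0, ignore_index=True)


def merge_xls(directory, output_name="merged.xlsx"):
    """Read every .xls file in `directory`, stack them, and save the result."""
    # The pattern "*.xls" does not match "merged.xlsx", so the output file
    # is not picked up again on a second run.
    paths = sorted(glob.glob(os.path.join(directory, "*.xls")))
    # header=None keeps the first row as data and labels columns 0, 1, 2, ...
    frames = [pd.read_excel(p, header=None) for p in paths]
    merged = stack_frames(frames)
    merged.to_excel(os.path.join(directory, output_name),
                    index=False, header=False)
    return merged
```

Calling merge_xls(os.getcwd()) would process the current directory, as in the question. Because the columns are labelled by position, a file with fewer columns simply gets empty cells in the columns it lacks.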