Concat Excel files and worksheets into one using Python

Question

I have many Excel files in a directory, and all of them have the same header row. Some of these Excel files have multiple worksheets, which again have the same headers. I'm trying to loop through the Excel files in the directory and, for each one, check whether it has multiple worksheets, so that I can concatenate those worksheets along with the rest of the Excel files.

This is what I tried:

import pandas as pd
import os
import ntpath
import glob

dir_path = os.path.dirname(os.path.realpath(__file__))
os.chdir(dir_path)

for excel_names in glob.glob('*.xlsx'):
    # read them in
    i=0
    df = pd.read_excel(excel_names[i], sheet_name=None, ignore_index=True)
    cdf = pd.concat(df.values())
    cdf.to_excel("c.xlsx", header=False, index=False)
    excels = [pd.ExcelFile(name) for name in excel_names]

    # turn them into dataframes
    frames = [x.parse(x.sheet_names[0], header=None,index_col=None) for x in excels]

    # delete the first row for all frames except the first
    # i.e. remove the header row -- assumes it's the first
    frames[1:] = [df[1:] for df in frames[1:]]

    # concatenate them..
    combined = pd.concat(frames)

    # write it out
    combined.to_excel("c.xlsx", header=False, index=False)
    i+=1

But then I get the error below. Any advice?

"concat excel.py", line 12, in <module>
    df = pd.read_excel(excel_names[i], sheet_name=None, ignore_index=True)
  File "/usr/local/lib/python2.7/site-packages/pandas/util/_decorators.py", line 188, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/pandas/util/_decorators.py", line 188, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/pandas/io/excel.py", line 350, in read_excel
    io = ExcelFile(io, engine=engine)
  File "/usr/local/lib/python2.7/site-packages/pandas/io/excel.py", line 653, in __init__
    self._reader = self._engines[engine](self._io)
  File "/usr/local/lib/python2.7/site-packages/pandas/io/excel.py", line 424, in __init__
    self.book = xlrd.open_workbook(filepath_or_buffer)
  File "/usr/local/lib/python2.7/site-packages/xlrd/__init__.py", line 111, in open_workbook
    with open(filename, "rb") as f:
IOError: [Errno 2] No such file or directory: 'G'

Answer 1

Your for statement sets excel_names to each filename in turn (so a better variable name would be excel_name):

for excel_names in glob.glob('*.xlsx'):

But inside the loop your code does

df = pd.read_excel(excel_names[i], sheet_name=None, ignore_index=True)

where you are clearly expecting excel_names to be a list from which you extract one element. But it isn't a list, it's a string. So you are getting the first character of the first filename, which is the 'G' in the error message.
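A minimal sketch of one way to restructure the loop, under some assumptions: the function names concat_sheets and combine_excel_files are made up for illustration, the output file is assumed to live in the same directory as the inputs, and reading .xlsx files on a recent pandas is assumed to have an engine such as openpyxl installed. The filename from glob is passed straight to pd.read_excel, and ignore_index is moved to pd.concat, where it is a documented parameter.

```python
import glob
import os

import pandas as pd


def concat_sheets(sheets):
    """Stack all sheets of one workbook (a dict of sheet name -> DataFrame)."""
    return pd.concat(list(sheets.values()), ignore_index=True)


def combine_excel_files(directory, output_name="c.xlsx"):
    """Concatenate every sheet of every .xlsx file in `directory` into one file.

    Hypothetical helper: assumes all sheets share the same header row.
    """
    frames = []
    for excel_name in sorted(glob.glob(os.path.join(directory, "*.xlsx"))):
        # Skip the output of a previous run so it is not merged into itself.
        if os.path.basename(excel_name) == output_name:
            continue
        # excel_name is already a complete filename (a str); do not index it.
        # sheet_name=None makes read_excel return a dict with one DataFrame per sheet.
        sheets = pd.read_excel(excel_name, sheet_name=None)
        frames.append(concat_sheets(sheets))
    if not frames:
        raise ValueError("no .xlsx files found in " + directory)
    combined = pd.concat(frames, ignore_index=True)
    # The header row is written once; the index column is left out.
    combined.to_excel(os.path.join(directory, output_name), index=False)
    return combined
```

Because read_excel parses the header row of each sheet into column names, there is no need to strip the first row from every frame by hand; calling combine_excel_files(dir_path) would then replace the body of the original loop.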

Answer 1 was accepted on 2019-03-13 13:51:51.
