How can I use Python to combine data from multiple Excel files into one Excel file?

Here is my code so far:

import glob
import pandas as pd
import numpy as np
import openpyxl

log = r'G:\Data\Hotels\hotel.txt'  # text file with my long list of hotels
file = open(log, 'r')
hotels = []
line = file.readlines()
for a in line:
    hotels.append(a.rstrip('\n'))


for hotel in hotels :
    path = "G:\\Data\\Hotels\\"+hotel+"\\"+hotel+" - Meetings"
    file = hotel+"_Action_Log.xlsx" 
    print(file)

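So far, all this code does is print the names of all the hotel files (as strings, I guess?). Now I want to copy and paste the contents of those files into one "Master" Excel file. I only need one sheet from each Excel file, and I don't need the headers (they sit in row 5 because of complicated formatting in the first 4 rows).

What are my next steps? I am new to Python.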

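Based on your description of the problem, I assume you are trying to open and append multiple files that share the same format and structure (i.e., the same columns, in the same order).

In other words, you want to do something like this:

Excel worksheet 1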

Col1 Col2
a    b

Excel worksheet 2

Col1 Col2
c    d

Merged (appended) Excel worksheet

Col1 Col2
a    b
c    d
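The append shown in the worksheets above can be reproduced in pandas with pd.concat(). This is only a minimal sketch: the two in-memory DataFrames (`sheet1`, `sheet2`) are made-up stand-ins for the worksheets, not data read from real files.

```python
import pandas as pd

# Two small DataFrames standing in for "Excel worksheet 1" and "Excel worksheet 2".
sheet1 = pd.DataFrame({"Col1": ["a"], "Col2": ["b"]})
sheet2 = pd.DataFrame({"Col1": ["c"], "Col2": ["d"]})

# Stack the rows of both frames; ignore_index=True renumbers the rows from 0.
merged = pd.concat([sheet1, sheet2], ignore_index=True)
print(merged)
```

Because both frames have the same columns in the same order, the result has the same two columns and one row per input row.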

If my assumption about your problem is correct, you could try the following:

import pandas as pd  # reading and writing .xlsx files also requires openpyxl to be installed

# This is your code (with a raw string for the path, and the file closed after reading)
log = r'G:\Data\Hotels\hotel.txt'  # text file with my long list of hotels
with open(log, 'r') as file:
    hotels = [line.rstrip('\n') for line in file]

# We'll use this list to keep track of all your filepaths
filepaths = []

# I merged your 'path' and 'file' vars into a single variable ('fp')
for hotel in hotels:
    # path = "G:\\Data\\Hotels\\"+hotel+"\\"+hotel+" - Meetings"
    # file = hotel+"_Action_Log.xlsx"
    fp = "G:\\Data\\Hotels\\"+hotel+"\\"+hotel+" - Meetings\\"+hotel+"_Action_Log.xlsx"
    # print(file)
    filepaths.append(fp)

# This list stores all of your worksheets (as dataframes)
worksheets = []

# Open all of your Excel worksheets as Pandas dataframes and store them in 'worksheets' to concatenate later
for filepath in filepaths:
    # You may need to adjust `skiprows`; right now it skips (does not read) the first row of each
    # worksheet (typically the header row). For the layout in the question (header in row 5) that
    # would be skiprows=5. header=None stops pandas from using the next row as column names.
    df = pd.read_excel(filepath, skiprows=1, header=None)
    worksheets.append(df)

# Append all worksheets together
merged = pd.concat(worksheets, ignore_index=True)

# Change 'header' to True if you want to write out column headers; index=False leaves out the row index
merged.to_excel('G:\\Data\\Hotels\\merged.xlsx', header=False, index=False)
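The question describes four formatting rows followed by a header in row 5, so the interaction between `skiprows` and `header` is worth checking on a small sample. The sketch below uses pd.read_csv() on an in-memory string as a stand-in for pd.read_excel(), purely so it runs without a workbook on disk; both functions accept `skiprows` and `header`. The sample contents are invented for illustration.

```python
import io

import pandas as pd

# Invented layout: 4 formatting rows, the header in row 5, then the data rows.
raw = (
    "Action log\n"
    "Prepared for demo\n"
    "Region North\n"
    "Internal use\n"
    "Col1,Col2\n"
    "a,b\n"
    "c,d\n"
)

# skiprows=5 drops the 4 formatting rows and the header row.
# header=None tells pandas that what remains is data, so no row is consumed as column names.
df = pd.read_csv(io.StringIO(raw), skiprows=5, header=None)
print(df)
```

With pd.read_excel() the same two arguments apply, and `sheet_name` can be passed as well to pick the one sheet you need from each workbook.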

You can learn more about the pd.concat() method in the pandas documentation.
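As a side note, the list-reading and path-building steps could also be wrapped in small helpers. This is a sketch under assumptions: `read_hotel_names` and `action_log_path` are hypothetical names, the folder layout is the one given in the question, and unlike the original code it strips all surrounding whitespace and skips blank lines in the hotel list.

```python
from pathlib import PureWindowsPath


def read_hotel_names(list_path):
    """Return the non-empty lines of the hotel list, stripped of whitespace."""
    # The with block closes the file even if reading fails.
    with open(list_path, "r") as handle:
        return [line.strip() for line in handle if line.strip()]


def action_log_path(base_dir, hotel):
    """Build the path to one hotel's action log under base_dir."""
    # PureWindowsPath joins the parts with backslashes on any operating system.
    return (
        PureWindowsPath(base_dir)
        / hotel
        / f"{hotel} - Meetings"
        / f"{hotel}_Action_Log.xlsx"
    )


# "Example Hotel" is a placeholder name, not one from the question.
print(action_log_path("G:/Data/Hotels", "Example Hotel"))
```

Building the path from parts avoids hand-written "\\" separators, which is where the " - Meetings" folder name is easy to mistype.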
