Adding a new column while reading hundreds of Excel files, listed by path in a reference table, into one pandas DataFrame

I have hundreds, possibly thousands, of small Excel files (with brackets) that I want to read into one pandas DataFrame.

Before I merge them, I need to flag which category each file comes from.

Here is my reference table, df:

    Dataframe_name    Path                                  Sheet
45  finance_auditing  Finance - Accounting/TopSites-Fin...  Aggregated_Data_for_Time_Period
46  finance_lending   Finance - Banking/TopSites-...        Aggregated_Data_for_Time_Period

What I did: the Dataframe_name column is filled in manually, but what I want is to fill it from the reference table.

import pandas as pd

# Manual approach: read each file by hand and tag it with a hard-coded name
finance_auditing  = pd.read_excel('Finance - Accounting/TopSites-Fin... ','Aggregated_Data_for_Time_Period')
finance_lending   = pd.read_excel('Finance - Banking/TopSites-... ','Aggregated_Data_for_Time_Period')
finance_auditing['Dataframe_name'] = 'finance_auditing'
finance_lending['Dataframe_name'] = 'finance_lending'

# Reading everything from the reference table works, but the Dataframe_name flag is lost
dF_all = pd.concat([pd.read_excel(path, sheet_name=sheet)
                    for path, sheet in zip(df.Path, df.Sheet)])

The problem is that I have hundreds of files to read and need to append them all.

This is fairly simple: you can assign the flag dynamically on each iteration:

pd.concat([pd.read_excel(path, sheet_name=sheet).assign(df_name=name)
                             for name, path, sheet in df.to_numpy()])
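
Note that the one-liner above unpacks each row of df.to_numpy() positionally, so it relies on df having exactly the three columns Dataframe_name, Path and Sheet, in that order. Below is a minimal sketch of the same idea that selects the columns by name instead. The function name read_all_with_flag, the reader parameter, the fake_reader stand-in and the placeholder paths are all hypothetical, introduced only so the example runs without real Excel files; with real data you would leave reader at its default, pd.read_excel.

```python
import pandas as pd


def read_all_with_flag(ref, reader=pd.read_excel):
    """Read every file listed in `ref` and tag its rows with the Dataframe_name."""
    frames = []
    # Select columns by name so extra or reordered columns in `ref`
    # do not break the three-way unpacking.
    rows = ref[["Dataframe_name", "Path", "Sheet"]].itertuples(index=False, name=None)
    for name, path, sheet in rows:
        frame = reader(path, sheet_name=sheet)
        # assign() returns a copy with the flag column added.
        frames.append(frame.assign(Dataframe_name=name))
    # ignore_index gives the combined frame a fresh 0..n-1 index.
    return pd.concat(frames, ignore_index=True)


# Hypothetical stand-in for pd.read_excel so the sketch runs without files.
def fake_reader(path, sheet_name=None):
    return pd.DataFrame({"site": [f"{path}-a", f"{path}-b"], "visits": [1, 2]})


# Placeholder reference table; the paths are NOT the real ones from the question.
ref = pd.DataFrame({
    "Dataframe_name": ["finance_auditing", "finance_lending"],
    "Path": ["auditing.xlsx", "lending.xlsx"],
    "Sheet": ["Aggregated_Data_for_Time_Period"] * 2,
})

combined = read_all_with_flag(ref, reader=fake_reader)
print(combined)
```

With the real reference table, the call would simply be read_all_with_flag(df). Collecting the frames in a list and calling pd.concat once at the end avoids repeatedly copying a growing DataFrame, which matters when there are hundreds of files.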
