TypeError: cannot concatenate object of type '<class 'collections.OrderedDict'>'; only Series and DataFrame objs are valid
I have written some simple code that combines all the Excel files in a directory. The files all share the same format and column names.
Each Excel file is an .xlsx with three sheets called GSM, UMTS, and LTE, and the sheet names are the same in every file. I need to copy the GSM data, the UMTS data, and the LTE data from every file into the matching sheet of a new workbook, and drop duplicates.
I also need to either change the color of the columns or keep the same style as the source (text style, etc.).
Here's my code:
import pandas as pd
import os
basepath = r'C:\Users\mwx825326\PycharmProjects\MyExcelCombine\myCDD Combine'
files = list(filter(lambda x: '.xlsx' in x, os.listdir(basepath)))
alldf = pd.DataFrame()
for f in files:
    df = pd.read_excel(f"{basepath}/{f}",encoding='latin-1', sheet_name=None)
    alldf = pd.concat([alldf,df]).drop_duplicates(keep=False)
alldf.to_excel("1- CDD Total12.xlsx")
And this is my error:
Traceback (most recent call last):
  File "C:/Users/mwx825326/PycharmProjects/MyExcelCombine/CombineTool.py", line 9, in <module>
    alldf = pd.concat([alldf,df]).drop_duplicates(keep=False)
  File "C:\Users\mwx825326\PycharmProjects\MyExcelCombine\venv\lib\site-packages\pandas\core\reshape\concat.py", line 255, in concat
    sort=sort,
  File "C:\Users\mwx825326\PycharmProjects\MyExcelCombine\venv\lib\site-packages\pandas\core\reshape\concat.py", line 332, in __init__
    raise TypeError(msg)
TypeError: cannot concatenate object of type '<class 'collections.OrderedDict'>'; only Series and DataFrame objs are valid
Process finished with exit code 1
And this is what my sheets look like:
mydir = (os.getcwd()).replace('\\', '/') + '/'
gsm_cdd_total = pd.read_excel(r'' + mydir + '1- CDD Total.xlsx' ,sheet_name='GSM')
umts_cdd_total = pd.read_excel(r'' + mydir + '1- CDD Total.xlsx' ,sheet_name='UMTS')
lte_cdd_total = pd.read_excel(r'' + mydir + '1- CDD Total.xlsx' ,sheet_name='LTE')
gsm_generate = pd.read_excel(r'' + mydir + 'GUL CDD20191008021501.xlsx' ,sheet_name='GSM')
umts_generate = pd.read_excel(r'' + mydir + 'GUL CDD20191008021501.xlsx' ,sheet_name='UMTS')
lte_generate = pd.read_excel(r'' + mydir + 'GUL CDD20191008021501.xlsx' ,sheet_name='LTE')
My .xlsx files each have three main sheets, and every sheet has its own data.
Does anyone know how to update the data for each sheet and how to solve this problem?
When you run read_excel with sheet_name=None, the result is a dictionary (sheet_name: DataFrame), and pd.concat does not accept a dictionary as one of the objects to concatenate.
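The failure can be reproduced without any Excel files. The sketch below uses a hand-built dictionary as a stand-in for what read_excel returns with sheet_name=None. The sheet names come from the question, while the "cell" column and its values are made up for illustration.

```python
import pandas as pd

# Stand-in for pd.read_excel(path, sheet_name=None): sheet name -> DataFrame.
# The "cell" column and its values are invented for this example.
sheets = {
    "GSM": pd.DataFrame({"cell": ["G1", "G2"]}),
    "UMTS": pd.DataFrame({"cell": ["U1"]}),
    "LTE": pd.DataFrame({"cell": ["L1", "L2", "L3"]}),
}

alldf = pd.DataFrame()

# Passing the whole dictionary as one item reproduces the TypeError.
try:
    pd.concat([alldf, sheets])
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Passing the individual DataFrames works.
combined = pd.concat(list(sheets.values()), ignore_index=True)
print(len(combined))
```

On recent pandas versions the message names dict rather than collections.OrderedDict, but the cause is the same.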
So loop over the DataFrames in that dictionary and concatenate each one, something like:
for f in files:
    # Here the result is a dictionary of DataFrames
    dct = pd.read_excel(f"{basepath}/{f}",encoding='latin-1', sheet_name=None)
    # Process each DataFrame from this dictionary
    for df in dct.values():
        alldf = pd.concat([alldf,df]).drop_duplicates(keep=False)
Another possibility: if each of your Excel files has only a single sheet to read from, you can run your original code without the sheet_name parameter. Its default value is 0, meaning read only the first sheet and return a DataFrame.
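Since the question asks for GSM, UMTS, and LTE to stay on separate sheets, one option is to group the DataFrames by sheet name before concatenating. This is a minimal sketch, not a definitive implementation: combine_by_sheet and combine_folder are hypothetical helper names, the sample data is invented, and encoding is left out because newer pandas versions do not accept it in read_excel. It uses drop_duplicates() with the default keep='first', which keeps one copy of each repeated row, whereas keep=False discards every copy.

```python
import os

import pandas as pd


def combine_by_sheet(workbooks):
    """Merge a list of {sheet name: DataFrame} dicts into one DataFrame per sheet name."""
    parts_by_sheet = {}
    for book in workbooks:
        for sheet_name, df in book.items():
            parts_by_sheet.setdefault(sheet_name, []).append(df)
    return {
        name: pd.concat(parts, ignore_index=True)
        .drop_duplicates()
        .reset_index(drop=True)
        for name, parts in parts_by_sheet.items()
    }


def combine_folder(basepath, out_path):
    """Hypothetical helper: read every .xlsx in basepath and write one combined workbook."""
    files = [f for f in os.listdir(basepath) if f.endswith(".xlsx")]
    workbooks = [
        pd.read_excel(os.path.join(basepath, f), sheet_name=None) for f in files
    ]
    combined = combine_by_sheet(workbooks)
    # One output sheet per input sheet name (GSM, UMTS, LTE in the question).
    with pd.ExcelWriter(out_path) as writer:
        for name, df in combined.items():
            df.to_excel(writer, sheet_name=name, index=False)


# In-memory check with made-up data, so no Excel files are needed.
book_a = {
    "GSM": pd.DataFrame({"cell": ["G1", "G2"]}),
    "LTE": pd.DataFrame({"cell": ["L1"]}),
}
book_b = {
    "GSM": pd.DataFrame({"cell": ["G2", "G3"]}),
    "LTE": pd.DataFrame({"cell": ["L1"]}),
}
result = combine_by_sheet([book_a, book_b])
print(result["GSM"]["cell"].tolist())
```

Note that pandas reads and writes cell values only, so colors and text styles from the source workbooks are not carried over by this approach.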