Problem manipulating the DataFrames in a pd.read_excel dictionary

Question

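Hi everyone, I have an Excel file with multiple worksheets, which I load into a dictionary object with pd.read_excel. Each worksheet contains some of the same column names:

For example, sheet1: ['IDx','IDy'], sheet2: ['IDx','IDy']

I usually use pd.concat to load everything into a single DataFrame. That way, columns with the same name are merged correctly.

But this approach is not reliable when one of the ID column names contains some unexpected extra whitespace, even though the column is meant to be merged by pd.concat.

Like: sheet1: ['IDx','IDy'], sheet2: ['IDx','IDy']

To fix this, I want to iterate over the excel dict, strip() the whitespace from the column names of each DataFrame, and then pd.concat the results.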

import pandas as pd

# Raw string so the backslash in the Windows path is taken literally
excel = pd.read_excel(r'D:\ExcelFile.xlsx', sheet_name=None)

new_excel = {}

for name, sheet in excel.items():
    sheet = sheet.columns.str.strip()
    #print(sheet)
    new_excel[name] = sheet
    
print(new_excel)

Output:

{'Sheet1': Index([      'IDx',         'IDy', ..... ]}

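At this point I am stuck. I can't do anything with the new_excel dictionary. It seems I am accessing each DataFrame incorrectly and only getting the Index object back. I can't figure this out.

When trying to concatenate new_excel: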

TypeError: cannot concatenate object of type '<class 'pandas.core.indexes.base.Index'>'; only Series and DataFrame objs are valid

Thanks in advance!

Answer 1

Does this help?

new_excel = {}

for name, sheet in excel.items():
    new_columns = [str(col).strip() for col in sheet.columns]
    sheet.columns = new_columns
    new_excel[name] = sheet
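
For context, here is a minimal sketch of the whole flow through to pd.concat. The in-memory `excel` dict is made-up stand-in data for what pd.read_excel(..., sheet_name=None) returns, so the sheet contents are assumptions. The point it illustrates is that the cleaned names are assigned back to `sheet.columns`, so each dict value stays a DataFrame and pd.concat accepts it. Note that `str(col)` also turns any integer column names into strings.

```python
import pandas as pd

# Stand-in for pd.read_excel(path, sheet_name=None): a dict of DataFrames.
# The sheet contents are invented for illustration.
excel = {
    "Sheet1": pd.DataFrame({"IDx": [1, 2], "IDy": [10, 20]}),
    "Sheet2": pd.DataFrame({"IDx ": [3], " IDy": [30]}),  # stray whitespace
}

new_excel = {}
for name, sheet in excel.items():
    # Assign the cleaned names back to .columns; do not rebind `sheet`
    # to the Index that sheet.columns.str.strip() returns.
    sheet.columns = [str(col).strip() for col in sheet.columns]
    new_excel[name] = sheet

# The values are DataFrames, so concat works and the columns line up.
combined = pd.concat(list(new_excel.values()), ignore_index=True)
print(combined)
```

Passing the dict itself to pd.concat would also work; in that case the sheet names become the outer level of the resulting index.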

Answer 2

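No, unfortunately not. My column names contain integers (AttributeError: 'int' object has no attribute 'strip'). When I tried to change the type of the columns, I ran into KeyErrors and a messed-up index. I tried to work around the problem with another for loop, but so far without success. I'm not very good at this.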

for name, sheet in excel.items():
    for i in sheet: 
        if type(i) != int: 
            new_columns = [col.strip() for col in sheet.columns]
      
        else:  
            new_columns = sheet.columns
    print(new_columns)    
    sheet.columns = new_columns
    new_excel[name] = sheet

I still get AttributeError: 'int' object has no attribute 'strip'

The error is clear, but the solution is not clear to me.
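
One possible way around the integer column names, as a minimal sketch: strip only the names that are strings and leave everything else untouched. DataFrame.rename accepts a function for `columns`, so the check can be made per name. The sample sheets are made up, and `clean_name` is a hypothetical helper, not something from the thread.

```python
import pandas as pd

def clean_name(col):
    """Strip whitespace from string column names; return other types unchanged."""
    return col.strip() if isinstance(col, str) else col

# Invented stand-in for the dict returned by pd.read_excel(..., sheet_name=None).
excel = {
    "Sheet1": pd.DataFrame({"IDx": [1], 2020: [5.0]}),
    "Sheet2": pd.DataFrame({" IDx ": [2], 2020: [6.0]}),
}

# rename() returns a new DataFrame, so the original sheets are not modified.
new_excel = {name: sheet.rename(columns=clean_name) for name, sheet in excel.items()}

combined = pd.concat(list(new_excel.values()), ignore_index=True)
print(combined.columns.tolist())
```

The loop above fails because the `type(i) != int` test looks at one column name at a time, while the list comprehension inside it still calls strip() on every name in sheet.columns, including the integers. Moving the type check inside the per-name function avoids that.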

Answer 1
0 accepted 2021-01-28 11:33:28

Answer 2
0 2021-01-28 19:40:34
