
How to fix AttributeError: 'int' object has no attribute 'strip' while loading an Excel file in pandas

I am trying to load an Excel file in pandas, but I get the following error:

AttributeError: 'int' object has no attribute 'strip'

For reference, here is the header row and one data row from the Excel file:

 Row ID Order ID    Order Date  Ship Date   Ship Mode   Customer ID Customer Name   Segment Country/Region  City    State   Postal Code Region  Product ID  Category    Sub-Category    Product Name    Sales   Quantity    Discount    Profit

    1   CA-2018-152156  08-11-2018  11-11-2018  Second Class    CG-12520    Claire Gute Consumer    United States   Henderson   Kentucky    42420   NorthEAST   FUR-BO-10001798 Finance Bookcases   Bush Somerset Collection Bookcase   261.96  2   0   41.9136

Here is the full code:

import pandas as pd

local_path= '../../data/RetailStore.xlsx'
out_path= '../../out/hyperstore.csv'

def load_retail_data(local_path,sheet_name):
    return pd.read_excel(
        local_path,
        header=4,
        sheet_name=sheet_name,
        parse_dates=True
    )

def clean_headers(data_frame:pd.DataFrame) -> pd.DataFrame:
    data_frame=data_frame.rename(columns=lambda x:x.strip())
    data_frame=data_frame.rename(columns=lambda x:x.replace('\n',' '))
    data_frame=data_frame.rename(columns=lambda x:x.replace("'",' '))
    data_frame=data_frame.rename(columns=lambda x:x.replace('  ',' '))
    return data_frame

def filter_ship_mode(df):
    # ColumnsStore is defined elsewhere in the project; it is not shown in the question.
    return df[(df[ColumnsStore.ship_mode]!= 'Standard Class') & (df[ColumnsStore.ship_mode]!='Second Class')]


def calc_retail_data(local_path,sheet_name):
    retail_data=load_retail_data(local_path,sheet_name)
    retail_clean_headers=clean_headers(retail_data)
    retail_filtered=filter_ship_mode(retail_clean_headers)
    return retail_filtered


if __name__=="__main__":
    df_retail_data=calc_retail_data(local_path,'Orders')
    df_retail_data.to_csv(out_path,index=False)

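You can cast the column headers to strings before cleaning them:

def clean_headers(data_frame:pd.DataFrame) -> pd.DataFrame:
    # Cast every column label to str so that .strip() and .replace() are safe to call.
    data_frame.columns = data_frame.columns.astype(str)
    data_frame=data_frame.rename(columns=lambda x:x.strip())
    ...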
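
To see why the cast helps, here is a minimal sketch using a small in-memory DataFrame. The frame and its column names (including the integer header 2018) are made up for illustration and are not taken from RetailStore.xlsx. The assumption is that at least one header cell in the sheet holds a number, in which case pandas keeps that label as an int and x.strip() fails on it.

```python
import pandas as pd

# Hypothetical frame: one header is the integer 2018, as can happen when
# an Excel header cell holds a number rather than text.
frame = pd.DataFrame([[1, 2, 3]], columns=[" Row ID ", "Ship\nMode", 2018])

try:
    frame.rename(columns=lambda x: x.strip())
    rename_failed = False
except AttributeError:
    # Raised for the int label: 'int' object has no attribute 'strip'
    rename_failed = True

# Cast every label to str first; after that the string methods are safe.
frame.columns = frame.columns.astype(str)
cleaned = frame.rename(columns=lambda x: x.strip().replace("\n", " "))
cleaned_columns = list(cleaned.columns)
```

Note that after the cast the numeric header becomes the string "2018", so later lookups must use the string form.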

Or simply skip any header that is not a string (such as an int):

def clean_headers(data_frame:pd.DataFrame) -> pd.DataFrame:
    data_frame=data_frame.rename(columns=lambda x: x.strip() if isinstance(x, str) else x)
    # Apply the same isinstance check in the other header-cleaning lambdas.
    ...
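
As one possible way to apply that check everywhere at once, the four rename calls could be folded into a single helper. This is only a sketch: clean_label is a hypothetical name, the sample frame is invented, and collapsing every run of whitespace is a slightly stronger rule than the original single replace('  ',' ').

```python
import pandas as pd

def clean_label(label):
    """Tidy one header label; non-string labels are returned unchanged."""
    if not isinstance(label, str):
        return label
    label = label.strip().replace("\n", " ").replace("'", " ")
    # Collapse any run of whitespace into a single space.
    return " ".join(label.split())

def clean_headers(data_frame: pd.DataFrame) -> pd.DataFrame:
    # rename() calls clean_label once per column label.
    return data_frame.rename(columns=clean_label)

# Invented sample data with one integer header.
frame = pd.DataFrame([[1, 2, 3]], columns=[" Order  ID", "Customer's\nName", 42420])
result_columns = list(clean_headers(frame).columns)
```

With this approach the integer header keeps its original type, which matters if other code looks the column up by number.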

