How to fix AttributeError: 'int' object has no attribute 'strip' while loading an Excel file in pandas
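I am trying to load an Excel file in pandas, but I get the following error:
AttributeError: 'int' object has no attribute 'strip'
For context, here is the header row from the Excel file together with the first data row: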
Row ID | Order ID | Order Date | Ship Date | Ship Mode | Customer ID | Customer Name | Segment | Country/Region | City | State | Postal Code | Region | Product ID | Category | Sub-Category | Product Name | Sales | Quantity | Discount | Profit
1 | CA-2018-152156 | 08-11-2018 | 11-11-2018 | Second Class | CG-12520 | Claire Gute | Consumer | United States | Henderson | Kentucky | 42420 | NorthEAST | FUR-BO-10001798 | Finance | Bookcases | Bush Somerset Collection Bookcase | 261.96 | 2 | 0 | 41.9136
Here is the full code:
import pandas as pd

# ColumnsStore is defined elsewhere in the project (not shown here).

local_path = '../../data/RetailStore.xlsx'
out_path = '../../out/hyperstore.csv'

def load_retail_data(local_path, sheet_name):
    # header=4 tells pandas to use the fifth row of the sheet as column names
    return pd.read_excel(
        local_path,
        header=4,
        sheet_name=sheet_name,
        parse_dates=True
    )

def clean_headers(data_frame: pd.DataFrame) -> pd.DataFrame:
    # Each lambda assumes the column label is a string; this is where the error is raised
    data_frame = data_frame.rename(columns=lambda x: x.strip())
    data_frame = data_frame.rename(columns=lambda x: x.replace('\n', ' '))
    data_frame = data_frame.rename(columns=lambda x: x.replace("'", ' '))
    data_frame = data_frame.rename(columns=lambda x: x.replace('  ', ' '))
    return data_frame

def filter_ship_mode(df):
    # Keep only rows whose ship mode is neither 'Standard Class' nor 'Second Class'
    return df[(df[ColumnsStore.ship_mode] != 'Standard Class') & (df[ColumnsStore.ship_mode] != 'Second Class')]

def calc_retail_data(local_path, sheet_name):
    retail_data = load_retail_data(local_path, sheet_name)
    retail_clean_headers = clean_headers(retail_data)
    retail_filtered = filter_ship_mode(retail_clean_headers)
    return retail_filtered

if __name__ == "__main__":
    df_retail_data = calc_retail_data(local_path, 'Orders')
    df_retail_data.to_csv(out_path, index=False)
You can convert the column headers to strings first:
def clean_headers(data_frame: pd.DataFrame) -> pd.DataFrame:
    data_frame.columns = data_frame.columns.astype(str)
    data_frame = data_frame.rename(columns=lambda x: x.strip())
    ...
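To see the cast in action without the original workbook, here is a minimal sketch. The small frame and its labels (including the int label 2018) are made up for illustration and only mimic a sheet where one header cell was numeric; the function also copies its input, which the snippet above does not do.

```python
import pandas as pd

# Hypothetical frame: one header cell was numeric, so pandas keeps it as an
# int label rather than a string.
df = pd.DataFrame([[1, 2, 3]], columns=[" Order ID ", 2018, "Ship\nMode"])


def clean_headers(data_frame: pd.DataFrame) -> pd.DataFrame:
    # Copy first so the caller's frame is left untouched.
    data_frame = data_frame.copy()
    # Cast every label to str so the string methods below are always available.
    data_frame.columns = data_frame.columns.astype(str)
    return data_frame.rename(columns=lambda x: x.strip().replace("\n", " "))


cleaned = clean_headers(df)
print(list(cleaned.columns))
```

Note that after the cast the numeric header becomes the string '2018', so any later lookup of that column has to use the string form.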
Or simply skip the headers that are not strings (such as ints):
def clean_headers(data_frame: pd.DataFrame) -> pd.DataFrame:
    data_frame = data_frame.rename(columns=lambda x: x.strip() if isinstance(x, str) else x)
    # Do the same thing in the other header-cleaning steps.
    ...
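One way to avoid repeating the isinstance check in every lambda is to put all the cleaning steps in a single helper. This is only a sketch: clean_label and the sample frame are hypothetical names and data, not part of the original code.

```python
import pandas as pd


def clean_label(label):
    # Non-string labels (for example an int header) are returned unchanged.
    if not isinstance(label, str):
        return label
    # Same cleaning steps as in the question, chained on string labels only.
    return label.strip().replace("\n", " ").replace("'", " ")


# Hypothetical frame with one string label and one int label.
frame = pd.DataFrame([[1, 2]], columns=[" Ship\nMode ", 42420])
result = frame.rename(columns=clean_label)
print(list(result.columns))
```

With this variant the int label keeps its original type, which matters if other code still refers to that column by its numeric name.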