
Pandas falsely converting strings to floats

I'm using a csv file from Excel to create a pandas data frame. Recently, I've encountered several ValueError messages regarding the dtypes of each column in the dataframe.

This is the most recent exception raised:

ValueError: could not convert string to float: 'OH'

After checking the data frame's dtypes attribute, I can see that this particular column, addr_state, is an object, not a float.

I've pasted all my code below for clarification:

import pandas as pd
from sklearn.preprocessing import StandardScaler

work_path = 'C:\\Users\\Projects\\loans.csv'
unfiltered_y_df = pd.read_csv(work_path, low_memory=False, encoding='latin-1')
print(unfiltered_y_df.dtypes)
filtered_y_df = unfiltered_y_df.loc[unfiltered_y_df['loan_status'].isin(['Fully Paid', 'Charged Off', 'Default'])]

# Every column, including string columns such as addr_state, is passed to the scaler here
X = StandardScaler().fit_transform(filtered_y_df[[column for column in filtered_y_df]])
Y = filtered_y_df['loan_status']

Also, is it possible to explicitly write out the dtypes for each column? Right now I feel like that's the only way to solve this. Thanks in advance!

So two issues here I think:

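  1. To print out the type of each column, use the dtypes attribute (ftypes was deprecated in pandas 0.25 and removed in 1.0). To set the types explicitly, pass a dict to the dtype argument of pd.read_csv:

    i.e. unfiltered_y_df.dtypes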

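  2. You say 'addr_state' is an object, not a float. Well, that is the problem: StandardScaler() only works on numeric data, so it tries to coerce your state 'OH' to a float and can't, hence the error. Your list comprehension passes every column to the scaler, so select only the numeric columns before scaling.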
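
Both points can be sketched roughly as below. This is only an illustration, not your actual data: the inline CSV and the loan_amnt and int_rate columns are made up to stand in for loans.csv, and the dtype mapping is just one way of spelling the types out. select_dtypes(include="number") drops the string columns before the frame reaches StandardScaler.

```python
import io

import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for loans.csv; the real file will have other columns.
csv_text = """loan_amnt,int_rate,addr_state,loan_status
1000,10.5,OH,Fully Paid
2000,12.0,CA,Charged Off
3000,13.5,NY,Current
4000,15.0,TX,Default
"""

# dtype= states the column types explicitly instead of relying on inference.
df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={
        "loan_amnt": "float64",
        "int_rate": "float64",
        "addr_state": str,
        "loan_status": str,
    },
)

# Same row filter as in the question.
filtered = df[df["loan_status"].isin(["Fully Paid", "Charged Off", "Default"])]

# Keep only numeric columns; addr_state and loan_status are excluded.
numeric = filtered.select_dtypes(include="number")

X = StandardScaler().fit_transform(numeric)
Y = filtered["loan_status"]

print(list(numeric.columns))
print(X.shape)
```

If you do want addr_state as a feature, it would have to be encoded as numbers first (for example with scikit-learn's OneHotEncoder) rather than passed to the scaler as text.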
