pandas: error converting string to float

Question

I'm using a CSV file exported from Excel to create a pandas data frame. Recently, I've encountered several ValueError messages regarding the dtypes of the columns in the data frame.

This is the most recent exception raised:

ValueError: could not convert string to float: 'OH'

After checking the data frame's dtypes attribute, I can see that this particular column, addr_state, is an object, not a float.

I've pasted all my code below for clarification:

import pandas as pd
from sklearn.preprocessing import StandardScaler

work_path = 'C:\\Users\\Projects\\loans.csv'
unfiltered_y_df = pd.read_csv(work_path, low_memory=False, encoding='latin-1')
print(unfiltered_y_df.dtypes)
filtered_y_df = unfiltered_y_df.loc[unfiltered_y_df['loan_status'].isin(['Fully Paid', 'Charged Off', 'Default'])]

X = StandardScaler().fit_transform(filtered_y_df[[column for column in filtered_y_df]])
Y = filtered_y_df['loan_status']

Also, is it possible to explicitly write out the dtypes for each column? Right now I feel like that's the only way to solve this. Thanks in advance!

Answer 1

I think there are two issues here:

1. To print the type of each column, use the dtypes attribute, e.g. unfiltered_y_df.dtypes. (Older pandas versions also offered ftypes, but it was deprecated in pandas 0.25 and removed in 1.0.)
2. You say 'addr_state' is an object, not a float. That is exactly the problem: StandardScaler() only works on numeric data, so it tries to coerce your state 'OH' to a float and can't, hence the error.
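
As a rough illustration of both points, here is a minimal sketch that declares dtypes explicitly when loading and then keeps only the numeric columns before scaling. The inline CSV and the column names loan_amnt and int_rate are assumptions made up for this example; only addr_state and loan_status come from the question. To keep the sketch free of extra dependencies, the scaling is done by hand with pandas, using the zero-mean, unit-variance formula that StandardScaler applies by default.

```python
import io

import pandas as pd

# Hypothetical three-row stand-in for loans.csv; loan_amnt and int_rate
# are invented column names used only for illustration.
csv_text = """loan_amnt,int_rate,addr_state,loan_status
5000,10.5,OH,Fully Paid
12000,13.0,CA,Charged Off
8000,7.5,NY,Current
"""

# Declare the dtype of each column explicitly instead of relying on inference.
df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={
        "loan_amnt": "float64",
        "int_rate": "float64",
        "addr_state": str,
        "loan_status": str,
    },
)

# Same row filter as in the question.
filtered = df.loc[df["loan_status"].isin(["Fully Paid", "Charged Off", "Default"])]

# Keep only numeric columns; text columns such as addr_state are dropped here.
numeric = filtered.select_dtypes(include="number")

# Standardize each column: subtract the mean, divide by the population
# standard deviation (ddof=0).
X = (numeric - numeric.mean()) / numeric.std(ddof=0)
Y = filtered["loan_status"]

print(numeric.dtypes)
print(X)
```

With a frame like numeric, StandardScaler().fit_transform(numeric) should no longer hit the string-to-float error, because no text columns remain. If a column such as addr_state is needed as a feature, it would have to be encoded as numbers first (for example with pd.get_dummies) rather than passed through as text.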

Answer 1 posted 2017-06-27 03:38:45
