pandas to_numeric downcast='signed' returning float64

I have a dataset where pandas.read_csv() correctly cast some continuous numeric columns (features/variables) from object to float64 (or int64 or uint8), but not others.

So I then tried to convert the columns that should have been cast to a continuous numeric type, specifically int64, using the following pandas.to_numeric() call with the downcast parameter specified, yet I still get a float64 result.

df.wc = pd.to_numeric(df.wc, errors='coerce', downcast='signed') 
# call to convert object to int64 vs float64 

Is there a typical issue with a column's values that causes that parameter to be ignored when attempting to cast an object column to the most specific continuous numeric type?

According to the documentation:

... downcast that resulting data to the smallest numerical dtype possible according ...

According to my experiments, downcasting to an integer dtype works when every value is a whole number, for example:

pd.to_numeric(pd.Series([1.0, 2.0]), downcast='unsigned')
0    1
1    2
dtype: uint8

However, the downcast to an integer dtype does not happen when the values have a fractional part, for example:

pd.to_numeric(pd.Series([1.1, 2.1]), downcast='unsigned')
0    1.1
1    2.1
dtype: float64
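
The difference between the two cases above appears to come down to whether every value is a whole number. A minimal sketch that checks this directly (the variable names are illustrative and not from the original post):

```python
import pandas as pd

whole = pd.Series([1.0, 2.0])       # floats that are all whole numbers
fractional = pd.Series([1.1, 2.1])  # floats with a fractional part

# The dtype only changes when the values survive the conversion unchanged.
whole_result = pd.to_numeric(whole, downcast='signed')
fractional_result = pd.to_numeric(fractional, downcast='signed')

print(whole_result.dtype)       # int8
print(fractional_result.dtype)  # float64
```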

If you want int64 values in the result, you can apply pd.Series.astype, which truncates the fractional part:

pd.Series([1.1, 2.1]).astype(int)
0    1
1    2
dtype: int64
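
Because astype discards the fractional part rather than rounding, it may be worth rounding first when that matters. A small sketch with made-up values, using 'int64' explicitly so the result does not depend on the platform's default integer size:

```python
import pandas as pd

values = pd.Series([1.1, 2.9, -1.7])  # illustrative values only

truncated = values.astype('int64')        # drops the fractional part (toward zero)
rounded = values.round().astype('int64')  # rounds to nearest, then converts

print(truncated.tolist())  # [1, 2, -1]
print(rounded.tolist())    # [1, 3, -2]
```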

When using

pandas.to_numeric(df[some_column], errors='coerce', downcast='integer')

it seems that any value in some_column that cannot be downcast keeps the whole column from being downcast. With errors='coerce', non-numeric values become NaN, which an integer dtype cannot hold, so the column stays float64.
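
A minimal sketch of that behaviour, using a hypothetical Series with one bad entry (the data and names are assumptions for illustration):

```python
import pandas as pd

raw = pd.Series(['1', '2', 'oops'])  # hypothetical column with one non-numeric entry

# 'oops' is coerced to NaN, so the integer downcast cannot be applied.
coerced = pd.to_numeric(raw, errors='coerce', downcast='integer')
print(coerced.dtype)         # float64
print(coerced.isna().sum())  # 1

# With only numeric strings, the same call does downcast.
clean = pd.to_numeric(pd.Series(['1', '2', '3']), errors='coerce', downcast='integer')
print(clean.dtype)           # int8
```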

One workaround is to separate the removal of non-numeric values from the downcast to signed or integer:

df[some_column]=pd.to_numeric(df[some_column], errors='coerce')
df.dropna(subset = [some_column], inplace = True)
df[some_column]=pd.to_numeric(df[some_column], downcast='integer')

The first line sets non-numeric values to NaN. The second line drops those rows in place. The third line downcasts the remaining values to the smallest integer dtype that fits.
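
The three steps can be tried end to end on a small made-up DataFrame. The column name wc is borrowed from the question; the data and the label column are assumptions for illustration:

```python
import pandas as pd

# Hypothetical data: 'wc' mimics a column that read_csv left as object.
df = pd.DataFrame({'wc': ['10', '25', 'n/a', '31'], 'label': list('abcd')})

df['wc'] = pd.to_numeric(df['wc'], errors='coerce')     # 'n/a' becomes NaN, dtype float64
df.dropna(subset=['wc'], inplace=True)                  # drop the row that holds NaN
df['wc'] = pd.to_numeric(df['wc'], downcast='integer')  # no NaN left, so the downcast applies

print(df['wc'].dtype)     # int8
print(df['wc'].tolist())  # [10, 25, 31]
```

Note that this drops whole rows, so it only suits cases where rows with a non-numeric value can be discarded.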
