How to define default missing values in a Pandas DataFrame

Question

I want to read a dataframe with given data types and missing values, but the following code fails. I have no idea why this happens!

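from io import StringIO
import pandas as pd

# Raw strings keep the backslash in \N literal (a plain "\N" is a syntax error on Python 3).
myText = StringIO(r"""1,2
3,\N
5,6""")

myDf = pd.read_csv(myText, header=None, names=["a1", "a2"], na_values=[r"\N"], dtype={"a1": "int", "a2": "int"})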

I got the error message:

ValueError: Integer column has NA values in column 1

If I remove the dtype option dtype={"a1":"int", "a2":"int"}, it works fine. Do integer columns not allow missing values?

Answer 1

Integer columns don't allow missing values; float columns do. If you need integers, you'll have to use a sentinel for the missing entries, such as 0 or 99999999 (not recommended). Otherwise, use a type like float64 that allows out-of-band values like NaN.
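
As a rough sketch of the float64 suggestion, the column with the missing value can be read as float64 while the complete column stays an integer. The second read goes beyond what this answer describes: it assumes a reasonably recent pandas release, where the nullable "Int64" extension dtype (capital I) can hold integers alongside missing values. The names csv_text, df_float and df_nullable are made up for this illustration.

```python
from io import StringIO

import pandas as pd

# Same data as in the question; r"..." keeps the backslash in \N literal.
csv_text = r"""1,2
3,\N
5,6"""

# Option 1: let the column with the missing value be float64, so NaN fits.
df_float = pd.read_csv(
    StringIO(csv_text),
    header=None,
    names=["a1", "a2"],
    na_values=[r"\N"],
    dtype={"a1": "int64", "a2": "float64"},
)

# Option 2 (assumes a recent pandas): the nullable "Int64" extension dtype
# stores integers and marks the missing entry as <NA>.
df_nullable = pd.read_csv(
    StringIO(csv_text),
    header=None,
    names=["a1", "a2"],
    na_values=[r"\N"],
    dtype={"a1": "int64", "a2": "Int64"},
)

print(df_float.dtypes)
print(df_nullable.dtypes)
```

With float64 the values in a2 are stored as floating-point numbers; the nullable dtype keeps them as integers, at the cost of depending on a newer pandas feature.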

Solution 1
3 Accepted 2017-02-04 19:16:18
