Use None instead of np.nan for null values in pandas DataFrame
I have a pandas DataFrame with mixed data types. I would like to replace all null values with None (instead of the default np.nan). For some reason, this appears to be nearly impossible.
In reality my DataFrame is read in from a csv, but here is a simple DataFrame with mixed data types to illustrate my problem.
import numpy as np
import pandas as pd
df = pd.DataFrame(index=[0], columns=range(5))
df.iloc[0] = [1, 'two', np.nan, 3, 4]
I can't do:
>>> df.fillna(None)
ValueError: must specify a fill method or value
nor:
>>> df[df.isnull()] = None
TypeError: Cannot do inplace boolean setting on mixed-types with a non np.nan value
nor:
>>> df.replace(np.nan, None)
TypeError: cannot replace [nan] with method pad on a DataFrame
I used to have a DataFrame with only string values, so I could do:
>>> df[df == ""] = None
which worked. But now that I have mixed datatypes, it's a no go.
For various reasons related to my code, it would be helpful to be able to use None as my null value. Is there a way I can set the null values to None? Or do I just have to go back through my other code and make sure I'm using np.isnan or pd.isnull everywhere?
Use pd.DataFrame.where
It uses the df value when the condition is met, otherwise uses None:
df.where(df.notnull(), None)
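As a minimal sketch of the call above on the question's example row: the frame is built here with dtype=object (an assumption, so that every column can hold a Python None), and the name cleaned is only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the one-row example from the question. dtype=object is an
# assumption made here so that every column is able to store None.
df = pd.DataFrame([[1, "two", np.nan, 3, 4]], dtype=object)

# Keep each cell where the mask is True; substitute None where it is False.
cleaned = df.where(df.notnull(), None)

print(cleaned.iloc[0].tolist())
```

The non-null cells pass through unchanged, so only the missing entry is affected.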
Expanding on the accepted answer: when you also need to catch NaN values within numeric dtype columns, you may need to change the dtype to object first:
df.astype(object).where(df.notna(), None)
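Here is a small sketch of why the cast matters. A float64 column cannot store None, so the frame is converted to object before masking. The frame and the column names x and y are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical mixed frame: "x" is a float column, "y" holds strings.
df = pd.DataFrame({"x": [1.0, np.nan], "y": ["a", None]})

# Cast to object first so the numeric column can hold None, then mask
# using the null positions of the original frame.
cleaned = df.astype(object).where(df.notna(), None)

print(cleaned.dtypes.tolist())
print(cleaned["x"].tolist())
```

Note that the result has object dtype throughout, so numeric operations on these columns will be slower than on the original float columns.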
as per the original reply by @BENNY