Replacing values in DataFrame with None

When creating a Pandas DataFrame with None values, they are converted to NaN:

> df = pd.DataFrame({'a': [0, None, 2]})
> df

      a
0   0.0
1   NaN
2   2.0

The same thing happens if I set a value to None by index:

> df = pd.DataFrame({'a': [0, 1, 2]})
> df["a"].iloc[1] = None
> df

      a
0   0.0
1   NaN
2   2.0

However, if I do a replace, weird things start happening:

> df = pd.DataFrame({'a': [0, 1, 2, 3]})
> df["a"].replace(1, "foo")

    a
0   0
1   'foo'
2   2
3   3

> df["a"].replace(2, None)

    a
0   0
1   1
2   1
3   3

What is going on here?

According to the docstring:

When ``value=None`` and `to_replace` is a scalar, list or
tuple, `replace` uses the method parameter (default 'pad') to do the
replacement. So this is why the 'a' values are being replaced by 10
in rows 1 and 2 and 'b' in row 4 in this case.
The command ``s.replace('a', None)`` is actually equivalent to
``s.replace(to_replace='a', value=None, method='pad')``

If you want to actually replace with None, pass a dict:

>>> s = pd.Series([10, 'a', 'a', 'b', 'a'])

When one uses a dict as the `to_replace` value, it is like the
value(s) in the dict are equal to the `value` parameter.
``s.replace({'a': None})`` is equivalent to
``s.replace(to_replace={'a': None}, value=None, method=None)``:

>>> s.replace({'a': None})
0      10
1    None
2    None
3       b
4    None
dtype: object
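
As a quick check of the dict form, here is a minimal sketch using the same Series as the docstring. The names `replaced` and `missing` are mine, not from the docs. It tests for missing values with `isna()` instead of comparing against None directly, so it does not rely on how a particular pandas version treats an explicit `value=None`.

```python
import pandas as pd

s = pd.Series([10, 'a', 'a', 'b', 'a'])

# Dict form: each key is a value to look for, each dict value is its replacement.
replaced = s.replace({'a': None})

# isna() is True for both None and NaN, so this only checks *where* values went missing.
missing = replaced.isna()

print(replaced)
print(missing.tolist())
```

The positions that held 'a' should come back as missing, while 10 and 'b' are left untouched.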

By contrast, s.replace('a', None) passes a scalar `to_replace` with value=None, so it is treated as s.replace(to_replace='a', value=None, method='pad'). Each 'a' is filled from the value before it: rows 1 and 2 become 10, and row 4 becomes 'b':

  s.replace('a', None)
    0    10
    1    10
    2    10
    3     b
    4     b
    dtype: object
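
To make the 'pad' behaviour less mysterious, here is a rough sketch that reproduces the same result by hand: blank out the matching values, then forward-fill. This is only an illustration of what padding means, not how `replace` is implemented internally, and the names `blanked` and `padded` are made up for the example.

```python
import pandas as pd

s = pd.Series([10, 'a', 'a', 'b', 'a'])

# Step 1: mask() puts NaN wherever the condition is True, i.e. on every 'a'.
blanked = s.mask(s == 'a')

# Step 2: forward-fill each gap from the last value that precedes it.
padded = blanked.ffill()

print(padded.tolist())  # [10, 10, 10, 'b', 'b']
```

The same two steps applied to the column from the question (`[0, 1, 2, 3]`, masking the value 2) give 0, 1, 1, 3, which is the "weird" output shown there.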
