
Pandas - Replace values in a DataFrame based on a Boolean DataFrame

I'm using Pandas v0.20.2 and I have a DataFrame like the following:

import pandas as pd

df = pd.DataFrame(dict(a=[0, 1], b=[3, 4], c=[6, 7]),
                  index=['spam', 'ham'])
#       a  b  c
# spam  0  3  6
# ham   1  4  7

I also have another DataFrame that serves as a mask:

mask = pd.DataFrame(dict(a=[True,False], b=[True,True]), 
                index=['spam', 'ham'])
#           a     b
# spam   True  True
# ham   False  True

I want to set the values in df to 999 wherever the corresponding entry in mask is True.

I thought that the following would work:

df[mask] = 999

But it doesn't. I get the error below:

ValueError                                Traceback (most recent call last)
<ipython-input-65-503f937859ab> in <module>()
----> 1 df[mask] = 999

/home/gbra/anaconda3/envs/outer_disk/lib/python2.7/site-packages/pandas/core/frame.pyc in __setitem__(self, key, value)
   2326             self._setitem_array(key, value)
   2327         elif isinstance(key, DataFrame):
-> 2328             self._setitem_frame(key, value)
   2329         else:
   2330             # set column

/home/gbra/anaconda3/envs/outer_disk/lib/python2.7/site-packages/pandas/core/frame.pyc in _setitem_frame(self, key, value)
   2364         self._check_inplace_setting(value)
   2365         self._check_setitem_copy()
-> 2366         self._where(-key, value, inplace=True)
   2367 
   2368     def _ensure_valid_index(self, value):

/home/gbra/anaconda3/envs/outer_disk/lib/python2.7/site-packages/pandas/core/generic.pyc in _where(self, cond, other, inplace, axis, level, try_cast, raise_on_error)
   5096             for dt in cond.dtypes:
   5097                 if not is_bool_dtype(dt):
-> 5098                     raise ValueError(msg.format(dtype=dt))
   5099 
   5100         cond = cond.astype(bool, copy=False)

ValueError: Boolean array expected for the condition, not float64

I would appreciate any help on this.
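
The float64 in the error message most likely comes from label alignment: mask has no column c, so stretching it to the shape of df leaves that column filled with NaN, which is not a boolean dtype. A minimal sketch of that effect, using reindex as a stand-in for the internal alignment (the names aligned, missing_columns and non_bool_columns are illustrative and not part of the original post):

```python
import pandas as pd

df = pd.DataFrame(dict(a=[0, 1], b=[3, 4], c=[6, 7]), index=["spam", "ham"])
mask = pd.DataFrame(dict(a=[True, False], b=[True, True]), index=["spam", "ham"])

# Stretch the mask over df's columns without a fill value. Column 'c' has no
# counterpart in the mask, so it is filled with NaN and is no longer boolean.
aligned = mask.reindex(columns=df.columns)

# Columns of df that the mask does not cover.
missing_columns = [col for col in df.columns if col not in mask.columns]

# Columns of the stretched mask whose dtype is not bool.
non_bool_columns = [col for col in aligned.columns if aligned[col].dtype != bool]

print(missing_columns)
print(non_bool_columns)
```

Both lists point at the same column, which suggests that filling the gap with False before indexing is what the condition check needs.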

You can reindex the mask so that it has the same shape as df, filling the missing entries with False, and then use df.mask:

df.mask(mask.reindex(index=df.index, columns=df.columns, fill_value=False), 999)
Out: 
        a    b  c
spam  999  999  6
ham     1  999  7
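
As a self-contained sketch of the approach above (the names full_mask and result are assumptions introduced here), note that df.mask returns a new DataFrame and leaves df itself untouched:

```python
import pandas as pd

df = pd.DataFrame(dict(a=[0, 1], b=[3, 4], c=[6, 7]), index=["spam", "ham"])
mask = pd.DataFrame(dict(a=[True, False], b=[True, True]), index=["spam", "ham"])

# Expand the mask to df's labels; anything the mask does not cover is False.
full_mask = mask.reindex(index=df.index, columns=df.columns, fill_value=False)

# mask() replaces values where the condition is True and returns a new frame.
result = df.mask(full_mask, 999)

print(result)
print(df)  # the original frame still holds its old values
```

Assign the result back (df = df.mask(...)) if the replacement should persist.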

At that point, regular boolean indexing should also work:

df[mask.reindex(index=df.index, columns=df.columns, fill_value=False)] = 999
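
If this is needed in more than one place, the in-place variant could be wrapped in a small helper. This is only a sketch: set_where_true is a hypothetical name, not a pandas function, and it assumes that labels missing from the mask should be treated as False.

```python
import pandas as pd


def set_where_true(frame, partial_mask, value):
    """Set frame to value, in place, wherever partial_mask is True.

    Hypothetical helper: labels absent from partial_mask count as False.
    """
    full_mask = partial_mask.reindex(
        index=frame.index, columns=frame.columns, fill_value=False
    )
    # Assigning through a boolean DataFrame modifies frame itself.
    frame[full_mask] = value
    return frame


df = pd.DataFrame(dict(a=[0, 1], b=[3, 4], c=[6, 7]), index=["spam", "ham"])
mask = pd.DataFrame(dict(a=[True, False], b=[True, True]), index=["spam", "ham"])

set_where_true(df, mask, 999)
print(df)
```

Unlike df.mask, this changes df directly, so no reassignment is needed.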

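This should do the job on the pandas version from the question; note that recent pandas releases reject a DataFrame indexer passed to .iloc, so there it raises an error instead: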

df = pd.DataFrame(dict(a=[0,1], b=[3,4], c=[6,7]), 
              index=['spam', 'ham'])
mask = pd.DataFrame(dict(a=[True,False], b=[True,True]), 
                index=['spam', 'ham'])
df.iloc[mask] = 999

Then df is:

        a    b  c
spam  999  999  6
ham     1  999  7

Another solution, which does not require modifying mask:

df[mask.columns] = df[mask.columns].mask(mask, 999)
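
A runnable sketch of this column-subset variant follows. It assumes that every column of mask also exists in df and that both frames share the same index; the explicit check and the name cols are additions for illustration.

```python
import pandas as pd

df = pd.DataFrame(dict(a=[0, 1], b=[3, 4], c=[6, 7]), index=["spam", "ham"])
mask = pd.DataFrame(dict(a=[True, False], b=[True, True]), index=["spam", "ham"])

# Assumption: the mask only refers to columns that df actually has.
cols = mask.columns
if not cols.isin(df.columns).all():
    raise KeyError("mask has columns that df lacks")

# Only the columns covered by the mask are touched; 'c' is left as it is.
df[cols] = df[cols].mask(mask, 999)

print(df)
```

Because only the masked columns are selected, the mask already has the same shape as the selection and no reindexing is required.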
