Pandas turn last N columns into NA based on another dataframe
I have the following dataframes:
import numpy as np
import pandas as pd

df1 = pd.DataFrame(data={'col1': ['a', 'd', 'g', 'j'],
                         'col2': ['b', 'c', 'i', np.nan],
                         'col3': ['c', 'f', 'i', np.nan],
                         'col4': ['x', np.nan, np.nan, np.nan]},
                   index=pd.Series(['ind1', 'ind2', 'ind3', 'ind4'], name='index'))
index | col1 | col2 | col3 | col4 |
---|---|---|---|---|
ind1 | a | b | c | x |
ind2 | d | c | f | NaN |
ind3 | g | i | i | NaN |
ind4 | j | NaN | NaN | NaN |
df2 = pd.Series(data=[True, False, True, False],
index=pd.Series(['ind1', 'ind2', 'ind3', 'ind4']))
ind1 | True |
ind2 | False |
ind3 | True |
ind4 | False |
How do I make the last 2 values for each row in df1 into NA, based on the boolean values of df2?
In this case, since ind1 and ind3 are True, the rows with the same index labels in df1 would be affected:
index | col1 | col2 | col3 | col4 |
---|---|---|---|---|
ind1 | a | b | NaN | NaN |
ind2 | d | c | f | NaN |
ind3 | g | i | NaN | NaN |
ind4 | j | NaN | NaN | NaN |
A possible solution, based on pandas.DataFrame.mask:
df1[['col3', 'col4']] = df1[['col3', 'col4']].mask(df2)
Output:
col1 col2 col3 col4
index
ind1 a b NaN NaN
ind2 d c f NaN
ind3 g i NaN NaN
ind4 j NaN NaN NaN
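If the number of trailing columns should not be hard-coded, the same mask idea can be written against the last N column labels. This is only a minimal sketch: `N` and `last_cols` are names introduced here for illustration, and it assumes df2 is a boolean Series whose index labels match those of df1.

```python
import numpy as np
import pandas as pd

# Same data as in the question.
df1 = pd.DataFrame(
    data={'col1': ['a', 'd', 'g', 'j'],
          'col2': ['b', 'c', 'i', np.nan],
          'col3': ['c', 'f', 'i', np.nan],
          'col4': ['x', np.nan, np.nan, np.nan]},
    index=pd.Series(['ind1', 'ind2', 'ind3', 'ind4'], name='index'))
df2 = pd.Series(data=[True, False, True, False],
                index=pd.Series(['ind1', 'ind2', 'ind3', 'ind4']))

N = 2                          # number of trailing columns to blank out (assumed name)
last_cols = df1.columns[-N:]   # labels of the last N columns

# mask() replaces values with NaN in the rows where df2 is True.
df1[last_cols] = df1[last_cols].mask(df2)

print(df1)
```

With N = 2 this selects col3 and col4, so it should behave like the hard-coded version above.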
You can use boolean indexing:
N = 2
df1.iloc[df2, -N:] = np.nan
NB. What you call df2 is actually a Series; s or ser might be more appropriate as a name.
Output:
col1 col2 col3 col4
index
ind1 a b NaN NaN
ind2 d c f NaN
ind3 g i NaN NaN
ind4 j NaN NaN NaN
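A label-based variant of the same boolean indexing may be worth considering when the boolean Series is not guaranteed to be in the same row order as df1. The sketch below uses `.loc`, which aligns a boolean Series on index labels; the name `ser` and the shuffled row order are assumptions made here purely for illustration.

```python
import numpy as np
import pandas as pd

# Same data as in the question.
df1 = pd.DataFrame(
    data={'col1': ['a', 'd', 'g', 'j'],
          'col2': ['b', 'c', 'i', np.nan],
          'col3': ['c', 'f', 'i', np.nan],
          'col4': ['x', np.nan, np.nan, np.nan]},
    index=pd.Series(['ind1', 'ind2', 'ind3', 'ind4'], name='index'))

# Same flags as df2, but deliberately listed in a different order
# (hypothetical, to show that .loc matches rows by label, not position).
ser = pd.Series(data=[True, True, False, False],
                index=['ind3', 'ind1', 'ind4', 'ind2'])

N = 2
# Rows are picked by the boolean Series (aligned on labels),
# columns by the labels of the last N columns.
df1.loc[ser, df1.columns[-N:]] = np.nan

print(df1)
```

Here `.loc` needs every index label of df1 to be present in `ser`; if some labels are missing, pandas raises an error rather than guessing.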