
pandas - mask dataframe by column name

Starting from this simple dataframe df:

col1,col2
1,3
2,1
3,8

I would like to apply a boolean mask based on the name of the column. I know that it is easy for values:

mask = df <= 1

df = df[mask]

which returns:

mask:

    col1   col2
0   True  False
1  False   True
2  False  False

df:

   col1  col2
0     1   NaN
1   NaN     1
2   NaN   NaN

as expected. Now I would like to obtain a boolean mask based on the column name, something like:

mask = df == df['col1']

which should return:

mask:

    col1   col2
0   True  False
1   True  False
2   True  False

EDIT:

This seems weird, but I need this kind of mask to later filter seaborn heatmaps by column.

As noted in the comments, situations where you would need a "mask" like that seem rare (and chances are, you are not in one of them). Consequently, there is probably no nice "built-in" solution for them in Pandas.

Nonetheless, you can achieve what you need using a hack like the following, for example:

mask = (df == df) & (df.columns == 'col1')

Update: As noted in the comments, if your data frame contains nulls, the mask computed this way will always be False at the corresponding locations. If this is a problem, the safer option is:

mask = ((df == df) | df.isnull()) & (df.columns == 'col1')
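
To make the null issue concrete, here is a minimal sketch of both variants on a small frame with one missing value. The frame `df_nan` is made up for illustration and is not part of the original question; the column name `col1` follows the sample data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value in col1 (illustration only).
df_nan = pd.DataFrame({"col1": [1.0, np.nan, 3.0], "col2": [3.0, 1.0, 8.0]})

# One flag per column: True for the column we want, False for the others.
is_col1 = df_nan.columns == "col1"

# NaN != NaN, so (df == df) is False wherever the data is null.
naive_mask = (df_nan == df_nan) & is_col1

# OR-ing with isnull() turns those positions back to True.
safe_mask = ((df_nan == df_nan) | df_nan.isnull()) & is_col1

print(naive_mask)
print(safe_mask)
```

In the naive mask the row holding the NaN comes out False even in col1, while the safer variant keeps the whole col1 column True.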

You could transpose your dataframe, then compare it with the column, and then transpose back. A bit weird, but a working example:

import pandas as pd
from io import StringIO

data = """
col1,col2
1,3
2,1
3,8
"""

df = pd.read_csv(StringIO(data))
mask = (df.T == df['col1']).T

In [176]: df
Out[176]:
   col1  col2
0     1     3
1     2     1
2     3     8


In [178]: mask
Out[178]:
   col1   col2
0  True  False
1  True  False
2  True  False

EDIT

I found another answer for that: you could use the isin method:

In [41]: df.isin(df.col1)
Out[41]:
   col1   col2
0  True  False
1  True  False
2  True  False
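
Both the transpose comparison and isin(df.col1) compare cell values against col1, not column labels. A small sketch, assuming a made-up frame `df_same` in which col2 repeats the values of col1, shows where that goes wrong:

```python
import pandas as pd

# Hypothetical frame where col2 holds the same values as col1.
df_same = pd.DataFrame({"col1": [1, 2, 3], "col2": [1, 2, 3]})

# Compare each row of the transposed frame with the col1 values.
transpose_mask = (df_same.T == df_same["col1"]).T

# isin with a Series matches values position by position on the index.
isin_mask = df_same.isin(df_same["col1"])

# Both masks mark col2 as True, although only col1 was asked for.
print(transpose_mask)
print(isin_mask)
```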

EDIT2

As @DSM showed in the comments, these two approaches do not work correctly (for example, when another column holds the same values as col1). So you should use @KT.'s method. But... let's play more with transpose:

df.col2 = df.col1

In [149]: df
Out[149]:
   col1  col2
0     1     1
1     2     2
2     3     3

In [147]: df.isin(df.T[df.columns == 'col1'].T)
Out[147]:
   col1   col2
0  True  False
1  True  False
2  True  False
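
If the goal is only a mask that depends on the column label, it can also be built directly, without comparing any values. The following is a sketch rather than something taken from the answers above; the helper name `column_mask` is hypothetical.

```python
import numpy as np
import pandas as pd


def column_mask(df, names):
    """Return a boolean frame that is True in the columns listed in names.

    Hypothetical helper for illustration; it looks only at column labels,
    so nulls and duplicated values in the data do not affect the result.
    """
    selected = df.columns.isin(names)         # one flag per column
    values = np.tile(selected, (len(df), 1))  # repeat the flags for every row
    return pd.DataFrame(values, index=df.index, columns=df.columns)


df = pd.DataFrame({"col1": [1, 2, 3], "col2": [1, 2, 3]})
mask = column_mask(df, ["col1"])
print(mask)

# Keep only the masked cells; everything else becomes NaN.
print(df.where(mask))
```

For the heatmap use case, note that seaborn.heatmap hides the cells where its mask argument is True, so the mask would probably need to be inverted (~mask) to display only col1.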
