
Finding columns which are unique to a row in Pandas dataframe

I have a dataframe with the structure below. I want to get the column numbers which are unique to a particular row.

1 1 0 1 1 1 0 0 0
0 1 0 1 0 0 0 0 0
0 1 0 0 1 0 0 0 0
1 0 0 0 1 0 0 0 1
0 0 0 0 0 0 1 1 0
1 0 0 0 1 0 0 0 0

In the above example I should get coln6, coln7, coln8 and coln9, since each of these columns has a value in only one row. I should also be able to group the columns: coln7 and coln8 should be grouped together, as they are unique to the same row. Is there an efficient solution in Python for this?

Here is my first approach:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([
    1,1,0,1,1,1,0,0,0,
    0,1,0,1,0,0,0,0,0,
    0,1,0,0,1,0,0,0,0,
    1,0,0,0,1,0,0,0,1,
    0,0,0,0,0,0,1,1,0,
    1,0,0,0,1,0,0,0,0]).reshape(6,9))

# A column sum of 1 means exactly one row has a 1 in that column.
print(df.sum(axis=0).apply(lambda x: x == 1))

Output:

0    False
1    False
2    False
3    False
4    False
5     True
6     True
7     True
8     True
dtype: bool

You can call sum on the df, compare the result against 1, and use that to mask the columns:

In [19]:
df.columns[df.sum(axis=0) == 1]

Out[19]:
Int64Index([5, 6, 7, 8], dtype='int64')
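
The mask above finds the unique columns but does not group them by row, which the question also asks for. One possible way to do that is sketched below. It assumes the frame holds only 0s and 1s, so a column sum of 1 means exactly one row has a value there; the names unique_cols, owner_row and groups are just illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([
    1,1,0,1,1,1,0,0,0,
    0,1,0,1,0,0,0,0,0,
    0,1,0,0,1,0,0,0,0,
    1,0,0,0,1,0,0,0,1,
    0,0,0,0,0,0,1,1,0,
    1,0,0,0,1,0,0,0,0]).reshape(6, 9))

# Columns with a 1 in exactly one row (assumes the data is strictly 0/1).
unique_cols = df.columns[df.sum(axis=0) == 1]

# For each of those columns, idxmax gives the label of the row holding the 1.
owner_row = df[unique_cols].idxmax(axis=0)

# Collect the column labels under the row they are unique to.
groups = {}
for col, row in owner_row.items():
    groups.setdefault(row, []).append(col)

print(groups)
```

For the sample data this puts column 5 under row 0, column 8 under row 3, and columns 6 and 7 together under row 4. These are zero-based labels, so they correspond to coln6, coln9, coln7 and coln8 in the question's numbering. If the data could contain values other than 0 and 1, comparing (df != 0).sum(axis=0) against 1 would be a safer mask.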
