How to effectively select pandas DataFrame columns that have only 1 unique value?
I'm aware of DataFrame.nunique() and Series.nunique().
I think you need DataFrame.nunique to build a boolean mask, then select the columns by loc with boolean indexing:
import pandas as pd

df = pd.DataFrame({'A':list('abcdef'),
'B':[4,5,4,5,5,4],
'C':[7,8,9,4,2,3],
'D':[1] * 6,
'E':[5,3,6,9,2,4],
'F':list('aaaaaa')})
print (df)
A B C D E F
0 a 4 7 1 5 a
1 b 5 8 1 3 a
2 c 4 9 1 6 a
3 d 5 4 1 9 a
4 e 5 2 1 2 a
5 f 4 3 1 4 a
# keep only the columns whose number of unique values is exactly 1
df = df.loc[:, df.nunique() == 1]
# alternatives
# df = df.loc[:, df.apply(lambda x: x.nunique()) == 1]
# df = df.loc[:, df.apply(lambda x: len(x.unique())) == 1]
print (df)
D F
0 1 a
1 1 a
2 1 a
3 1 a
4 1 a
5 1 a
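One caveat worth keeping in mind: nunique() ignores NaN by default (dropna=True), so a column with one value plus missing entries still counts as 1 unique value, while an all-NaN column counts as 0. A minimal sketch of the difference, using a made-up frame `demo` that is not part of the question:

```python
import numpy as np
import pandas as pd

# Hypothetical data, chosen only to illustrate how NaN is counted
demo = pd.DataFrame({'X': [1, 1, np.nan],            # one value plus a NaN
                     'Y': [np.nan, np.nan, np.nan],  # only NaN
                     'Z': [1, 2, 3]})                # all distinct

# Default dropna=True: NaN is not counted as a value
default_counts = demo.nunique()
# dropna=False: NaN is counted as a value of its own
strict_counts = demo.nunique(dropna=False)

constant_default = demo.columns[default_counts == 1].tolist()
constant_strict = demo.columns[strict_counts == 1].tolist()
print(constant_default)
print(constant_strict)
```

Which setting is appropriate depends on whether missing values should count as a distinct value in your data.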
Use DataFrame.nunique() to count distinct observations over the requested axis, then index df.columns with the resulting boolean mask to get the matching column names:
df = pd.DataFrame({'A': list('abcdef'),
'B': [4, 5, 4, 5, 5, 4],
'C': [7, 8, 9, 4, 2, 3],
'D': [1] * 6,
'E': [5, 3, 6, 9, 2, 4],
'F': list('aaaaaa')})
print(df)
df.columns[df.nunique() <= 1]
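That expression returns an Index of column names rather than a DataFrame. As a rough sketch, the names can then be used either to keep or to drop those columns. The variable names `constant_cols`, `only_constant` and `without_constant` are mine, not from the answer:

```python
import pandas as pd

df = pd.DataFrame({'A': list('abcdef'),
                   'B': [4, 5, 4, 5, 5, 4],
                   'C': [7, 8, 9, 4, 2, 3],
                   'D': [1] * 6,
                   'E': [5, 3, 6, 9, 2, 4],
                   'F': list('aaaaaa')})

# Names of columns with at most one distinct value
# (<= 1 also catches columns with 0 unique values, e.g. all-NaN columns)
constant_cols = df.columns[df.nunique() <= 1]
print(constant_cols.tolist())

# Keep only those columns ...
only_constant = df[constant_cols]
# ... or drop them and keep everything else
without_constant = df.drop(columns=constant_cols)
print(without_constant.columns.tolist())
```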