
Summary data for pandas dataframe

describe() doesn't do exactly what I'd like, so I'm rolling my own version.

The following works fine apart from the final metric, 'Num Unique Values', which returns numbers that are not correct. Am I using apply incorrectly?

pd.DataFrame({
    'Max': d.max(),
    'Min': d.min(),
    'Count': d.count(axis=0),
    'Count Null': d.isnull().sum(),
    'Count Zero': d[d == 0].count(),
    'Num Unique Values': d.apply(lambda x: x.nunique())
})

It works fine for me:

print(df.apply(lambda x: x.nunique()))

Sample:

df = pd.DataFrame({'A':[1,2,2,1],
                   'B':[4,5,6,4],
                   'C':[7,8,9,1],
                   'D':[1,3,5,9]})

print(df)
   A  B  C  D
0  1  4  7  1
1  2  5  8  3
2  2  6  9  5
3  1  4  1  9

print(df.apply(lambda x: x.nunique()))
A    2
B    3
C    4
D    4
dtype: int64
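
As a side note, the same per-column counts can probably be obtained without apply, since DataFrame has its own nunique() method. A minimal sketch, reusing the sample frame above (the names via_method and via_apply are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 1],
                   'B': [4, 5, 6, 4],
                   'C': [7, 8, 9, 1],
                   'D': [1, 3, 5, 9]})

# DataFrame.nunique() counts distinct values per column (axis=0 by default)
via_method = df.nunique()

# The apply-based version from the question, for comparison
via_apply = df.apply(lambda x: x.nunique())

print(via_method)
print(via_apply)
```

If both printouts match on your data as well, the apply call itself is unlikely to be the source of the wrong numbers.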

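Another solution (note that unique() keeps NaN as a value, while nunique() ignores NaN by default, so the two can differ on columns with missing data):

print(df.apply(lambda x: len(x.unique())))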
A    2
B    3
C    4
D    4
dtype: int64
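
One possible reason for "incorrect" unique counts is missing data: nunique() skips NaN unless dropna=False is passed. The sketch below rebuilds the summary table from the question on a small made-up frame to show the difference. The data and the extra 'Num Unique (incl. NaN)' column are assumptions for illustration, not part of the original question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column A has one NaN and two zeros
d = pd.DataFrame({'A': [1, 0, 0, np.nan],
                  'B': [4, 5, 6, 4]})

summary = pd.DataFrame({
    'Max': d.max(),
    'Min': d.min(),
    'Count': d.count(),                 # non-null values per column
    'Count Null': d.isnull().sum(),
    'Count Zero': (d == 0).sum(),       # same result as d[d == 0].count()
    'Num Unique Values': d.nunique(),   # NaN is excluded by default
    'Num Unique (incl. NaN)': d.nunique(dropna=False),
})

print(summary)
```

Here column A has two distinct non-null values, but three if NaN is counted, which is also what len(x.unique()) would report.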
