How can I count the number of unique values in each column of a pandas DataFrame?
Say I have a data frame named df that looks like this:
1  2    3  4
a  yes  f  c
b  no   f  e
c  yes  d  h
I want output that shows the number of unique values in each of the four columns. The output would look something like this:
Column  # of Unique Values
1       3
2       2
3       2
4       3
I don't need to know what the unique values are, just how many there are within each column.
I have played around with something like this:
df[all_cols].value_counts()
Here all_cols is a list of all the columns in the data frame. But this counts how many times each value appears within the column, not how many distinct values there are.
Any advice/suggestions would be a great help. Thanks
You could apply Series.nunique to each column:
>>> df.apply(pd.Series.nunique)
1 3
2 2
3 2
4 3
dtype: int64
Or you could do a groupby/nunique on the unstacked version of the frame:
>>> df.unstack().groupby(level=0).nunique()
1 3
2 2
3 2
4 3
dtype: int64
Both of these produce a Series, which you could then use to build a frame with whatever column names you wanted.
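As a minimal sketch of that last step, the Series can be turned into a two-column frame. The sample frame below is reconstructed from the question, and the header names "Column" and "# of Unique Values" are simply taken from the desired output there; any other names would work the same way.

```python
import pandas as pd

# Sample frame reconstructed from the question (column labels 1 to 4).
df = pd.DataFrame({
    1: ["a", "b", "c"],
    2: ["yes", "no", "yes"],
    3: ["f", "f", "d"],
    4: ["c", "e", "h"],
})

# Series indexed by column label, holding the number of distinct values.
counts = df.apply(pd.Series.nunique)

# Name the index, then move it into a regular column next to the counts.
summary = counts.rename_axis("Column").reset_index(name="# of Unique Values")
print(summary)
```

rename_axis labels the index so that reset_index turns it into a named column instead of the default "index".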
You could also call nunique directly on the frame with df.nunique() (available in pandas 0.20.0 and later):
>>> df.nunique()
1 3
2 2
3 2
4 3
dtype: int64
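One detail worth knowing: by default nunique ignores missing values. The sketch below uses a small made-up frame (df_nan, with hypothetical columns x and y that are not part of the question) to show how the dropna parameter changes the count.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value in column "x".
df_nan = pd.DataFrame({"x": ["a", "a", np.nan], "y": [1, 2, 3]})

# Default behaviour: NaN is not counted as a distinct value.
default_counts = df_nan.nunique()

# With dropna=False, NaN is counted as one more distinct value.
counts_with_nan = df_nan.nunique(dropna=False)

print(default_counts)
print(counts_with_nan)
```

If the columns may contain NaN and you want it treated as its own category, pass dropna=False; otherwise the default is usually what you want.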