I have this:
import pandas as pd

df = pd.DataFrame(dict(person=['andy', 'rubin', 'ciara', 'jack'],
                       item=['a', 'b', 'a', 'c'],
                       group=['c1', 'c2', 'c3', 'c1'],
                       age=[23, 24, 19, 49]))
df:
   age group item person
0   23    c1    a   andy
1   24    c2    b  rubin
2   19    c3    a  ciara
3   49    c1    c   jack
What I want is the number of unique items in each column. I know I can do something like:
len(df.person.unique())
for every column.
Is there a way to do this in one go for all columns?
I tried to do:
for column in df.columns:
    print(len(df.column.unique()))
but I know this is not right.
How can I accomplish this?
You want pd.Series.nunique:
df.apply(pd.Series.nunique)
age       4
group     3
item      3
person    4
dtype: int64
You can use:
for column in df:
    print(len(df[column].unique()))
4
3
3
4
Or:
for column in df:
    print(df[column].nunique())
4
3
3
4
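For context on why the loop in the question fails: `df.column` is attribute access, so pandas looks for a column literally named "column" rather than using the value of the loop variable. Bracket indexing, `df[column]`, uses the variable's value. A minimal sketch, reusing the question's data; the names `counts` and `attribute_lookup_failed` are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "person": ["andy", "rubin", "ciara", "jack"],
    "item": ["a", "b", "a", "c"],
    "group": ["c1", "c2", "c3", "c1"],
    "age": [23, 24, 19, 49],
})

# Bracket indexing uses the value held in the loop variable.
counts = {}
for column in df.columns:
    counts[column] = df[column].nunique()
print(counts)

# Attribute access ignores the variable and looks for a column
# literally called "column", which this frame does not have.
try:
    df.column
    attribute_lookup_failed = False
except AttributeError:
    attribute_lookup_failed = True
print(attribute_lookup_failed)
```

Collecting the counts in a dict instead of printing them keeps each number attached to its column name.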
You can get the number of unique items in each column with:
for column in df.columns:
    print(len(df[column].unique()))
Why not simply:
df.nunique()
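`DataFrame.nunique` returns a Series indexed by column name, so no loop or `apply` is needed. One thing to watch, sketched below with a small hypothetical frame `with_missing` that is not part of the question: `nunique` skips missing values by default, while `len(... .unique())` counts them, so the two approaches can disagree when a column contains NaN or None. This assumes a pandas version recent enough to have `DataFrame.nunique`.

```python
import pandas as pd

df = pd.DataFrame({
    "person": ["andy", "rubin", "ciara", "jack"],
    "item": ["a", "b", "a", "c"],
    "group": ["c1", "c2", "c3", "c1"],
    "age": [23, 24, 19, 49],
})

# One call, one count per column.
per_column = df.nunique()
print(per_column)

# Hypothetical column with a missing value, to show the difference.
with_missing = pd.DataFrame({"item": ["a", None, "a"]})

# Missing values are excluded by default.
default_count = with_missing["item"].nunique()
# dropna=False counts the missing value as its own entry.
including_missing = with_missing["item"].nunique(dropna=False)
# unique() keeps the missing value, so its length matches dropna=False.
via_unique = len(with_missing["item"].unique())

print(default_count, including_missing, via_unique)
```

If missing values should count as a distinct item, pass `dropna=False` to get the same numbers as the `len(df[column].unique())` loop.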