Apply a function to every column of a dataframe in pandas

I have this:

import pandas as pd

df = pd.DataFrame(dict(person=['andy', 'rubin', 'ciara', 'jack'],
                       item=['a', 'b', 'a', 'c'],
                       group=['c1', 'c2', 'c3', 'c1'],
                       age=[23, 24, 19, 49]))
df:

    age group item person
0   23  c1    a    andy
1   24  c2    b    rubin
2   19  c3    a    ciara
3   49  c1    c    jack

What I want to do is get the number of unique items in each column. I know I can do something like:

len(df.person.unique())

for every column.

Is there a way to do this in one go for all columns?

I tried to do:

for column in df.columns:
    print(len(df.column.unique()))

but I know this is not right.

How can I accomplish this?

You want pd.Series.nunique:

df.apply(pd.Series.nunique)

age       4
group     3
item      3
person    4
dtype: int64
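
For reference, here is a self-contained sketch of the same call, assuming the example frame from the question. With the default axis=0, apply hands each column to the function as a Series, so a lambda calling nunique() on each column should give the same result. The names counts and counts_lambda are only illustrative. Note that recent pandas versions keep the columns in insertion order, so the rows of the result may not be sorted alphabetically as in the output above.

```python
import pandas as pd

# Example frame from the question
df = pd.DataFrame({
    "person": ["andy", "rubin", "ciara", "jack"],
    "item": ["a", "b", "a", "c"],
    "group": ["c1", "c2", "c3", "c1"],
    "age": [23, 24, 19, 49],
})

# apply() with the default axis=0 passes each column (a Series) to the function
counts = df.apply(pd.Series.nunique)

# The same thing written with a lambda
counts_lambda = df.apply(lambda col: col.nunique())

print(counts)
```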

You can use:

for column in df:
    print(len(df[column].unique()))

4
3
3
4

Or:

for column in df:
    print(df[column].nunique())

4
3
3
4
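
If the counts are needed later rather than just printed, the same loop can be written as a dict comprehension. This is a minimal sketch assuming the example frame from the question; the name unique_counts is only illustrative.

```python
import pandas as pd

# Example frame from the question
df = pd.DataFrame({
    "person": ["andy", "rubin", "ciara", "jack"],
    "item": ["a", "b", "a", "c"],
    "group": ["c1", "c2", "c3", "c1"],
    "age": [23, 24, 19, 49],
})

# Iterating over a DataFrame yields its column labels
unique_counts = {column: df[column].nunique() for column in df}

print(unique_counts)
```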

You can get the number of unique items in each column as:

for column in df.columns:
    print(len(df[column].unique()))

Why not just this:

df.nunique()
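
One thing worth knowing when choosing between these approaches: nunique() skips missing values by default, while len(unique()) counts them. The sketch below uses a small made-up frame with one missing value (not the data from the question) to show the difference, along with the dropna parameter of DataFrame.nunique.

```python
import pandas as pd

# Hypothetical frame with one missing value in "item"
df = pd.DataFrame({
    "item": ["a", "b", "a", None],
    "age": [23, 24, 19, 49],
})

# Default: missing values are not counted
without_missing = df.nunique()

# dropna=False counts the missing value as its own distinct entry
with_missing = df.nunique(dropna=False)

# unique() keeps the missing value, so len() includes it
via_unique = len(df["item"].unique())

print(without_missing)
print(with_missing)
print(via_unique)
```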

