
Apply several functions to a grouped DataFrame in Python

I have a dataset in which some columns are used to group the DataFrame. There are two more columns: one has dtype object and the other is numerical. For each group, I want to find the number of unique values in each of these columns, as well as the most common value.

# Typo in code next line removed
df = pd.DataFrame({'A': ['foo', 'foo', 'foo', 'foo', 'bar', 'bar','bar','bar',], 'C_object':['str1', 'str2', 'str2', 'str2','str1', 'str1', 'str1', 'str2'], 'D_num': [10, 2, 2, 2, 10, 10, 10, 2]})
d = df.groupby('A')
g = d['C_object', 'D_num'].transform(unique)

This doesn't work. Expected output: [image not available]

Try this:

import pandas as pd

df = pd.DataFrame({'A': ['foo', 'foo', 'foo', 'foo', 'bar', 'bar','bar','bar',], 'C_object':['str1', 'str2', 'str2', 'str2','str1', 'str1', 'str1', 'str2'], 'D_num': [10, 2, 2, 2, 10, 10, 10, 2]})

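# For each group in 'A': count the distinct values and pick the most frequent one.
# value_counts() sorts by frequency (descending), so index[0] is the most common value.
df2 = pd.DataFrame({'C_object_len_unique': df.groupby('A')['C_object'].apply(lambda x: len(x.unique())),
                    'C_object_most_common': df.groupby('A')['C_object'].agg(lambda x: x.value_counts().index[0]),
                    'D_num_len_unique': df.groupby('A')['D_num'].apply(lambda x: len(x.unique())),
                    'D_num_most_common': df.groupby('A')['D_num'].agg(lambda x: x.value_counts().index[0]),
                    }).reset_index()
print(df2)  # print() is a function in Python 3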
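
The same summary can probably be written more compactly with named aggregation, which groups only once. This is a minimal sketch, assuming pandas 0.25 or later; the helper name `most_common` and the variable `summary` are made up for illustration and are not part of the answer above. Note that the built-in `'nunique'` ignores missing values, whereas `len(x.unique())` counts NaN as a value; the two agree on this sample data because it has no missing values.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'foo', 'foo', 'foo', 'bar', 'bar', 'bar', 'bar'],
    'C_object': ['str1', 'str2', 'str2', 'str2', 'str1', 'str1', 'str1', 'str2'],
    'D_num': [10, 2, 2, 2, 10, 10, 10, 2],
})


def most_common(s):
    # Hypothetical helper: value_counts() sorts by count (descending),
    # so the first index label is the most frequent value in the group.
    return s.value_counts().index[0]


# Named aggregation: output_column=(input_column, aggregation)
summary = (
    df.groupby('A')
      .agg(
          C_object_len_unique=('C_object', 'nunique'),
          C_object_most_common=('C_object', most_common),
          D_num_len_unique=('D_num', 'nunique'),
          D_num_most_common=('D_num', most_common),
      )
      .reset_index()
)
print(summary)
```

If two values are tied for most frequent within a group, which one `most_common` returns should not be relied on; `Series.mode()` returns all tied values if that matters.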


 