Pandas GroupBy Multiple Columns and apply function on one column

I'm working on this type of pandas dataframe:

Chapter    label_code  annotator  rounded_length
Chapter 1  1           1          159
Chapter 2  3           2          30
Chapter 2  4           2          150

I'm trying to apply Krippendorff's alpha on this data frame to calculate inter-annotator agreement for every chapter of the book and for every emotion separately. Here is the function to calculate inter-annotator agreement on this data frame.

def krippendorffs_emotion(chapters):
    return sf.calculate_krippendorffs_alpha_for_df(
        chapters,
        experiment_col='rounded_length',
        annotator_col='annotator',
        class_col='label_code',
    )

The columns:
'label_code' encodes eight different emotions as numbers.
'annotator' encodes the different annotators.
'rounded_length' identifies the parts of my text.
'Chapter' encodes the different chapters of the book.

To apply this function, I need to group the data frame by chapter and get a separate result for every label_code.

This is what I've tried.

grouped_df = emo_chapters_df.groupby(['Chapter','label_code']).apply(krippendorffs_emotion(emo_chapters_df))
grouped_df

When I run this code I receive:

TypeError: 'numpy.float64' object is not callable

Thank you in advance for your help.

If I understand correctly, you want a single return value for each (Chapter, label_code) group.

In that case, selecting a single column after the groupby restricts each group to a Series, so the function returns one value per group:

emo_chapters_df.groupby(['Chapter','label_code'])['Column you want to be returned from multiple groupby'].apply(your_funct)

If you instead want to return the DataFrame object and apply a condition to a single given column, that is a different question.
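As for the TypeError itself: in the attempt above, krippendorffs_emotion(emo_chapters_df) is evaluated first and returns a number, and apply then tries to call that number once per group. Passing the function object itself avoids this. Below is a minimal sketch under stated assumptions: the data values are made up, and agreement_stub is a hypothetical stand-in for krippendorffs_emotion (the sf module is not shown in the question), not a real Krippendorff's alpha.

```python
import pandas as pd

# Toy data shaped like the question's frame (values invented for illustration).
emo_chapters_df = pd.DataFrame({
    "Chapter": ["Chapter 1", "Chapter 1", "Chapter 2", "Chapter 2"],
    "label_code": [1, 1, 3, 3],
    "annotator": [1, 2, 1, 2],
    "rounded_length": [159, 159, 30, 150],
})

def agreement_stub(group):
    # Hypothetical stand-in for krippendorffs_emotion: the fraction of
    # rounded_length values in this group marked by more than one annotator.
    counts = group.groupby("rounded_length")["annotator"].nunique()
    return float((counts > 1).mean())

# Pass the function itself (no parentheses, no argument).
# apply calls it once per (Chapter, label_code) group with that group's rows.
grouped = emo_chapters_df.groupby(["Chapter", "label_code"]).apply(agreement_stub)

# By contrast, .apply(agreement_stub(emo_chapters_df)) would first compute a
# single float for the whole frame and then fail when apply tries to call it.
print(grouped)
```

The same pattern should carry over to the real function: .apply(krippendorffs_emotion) instead of .apply(krippendorffs_emotion(emo_chapters_df)). Depending on the pandas version, the grouping columns may be excluded from the rows passed to the function, which matters here because the real function reads label_code.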
