Python: In DataFrame, add value in a new column for row with highest value in another column and string identical in a third one

Question

I'm trying to find an efficient way to determine which rows of a DataFrame have the highest value in one column (value) among rows that share the same string in another column (String), and to record this in a new column (motif) for later use.

Here is an example of a DataFrame:

    String    N   value
0   EXAM     10     250
1   EXAMP    20     350
2   EXAMPLE  30     450
3   EXAMPLE  40     400
4   EXA      50     300
5   EX       60     100

Here is what I'm looking for:

    String    N   value  motif
0   EXAM     10     250    NaN
1   EXAMP    20     350    NaN
2   EXAMPLE  30     450      1
3   EXAMPLE  40     400    NaN
4   EXA      50     300    NaN
5   EX       60     100    NaN

I tried to work with a split-apply-combine method:

def group_motif(df):
    if df.groupby(['String']).size() > 1:
        # something like: for the row with the highest value in column ['value']:
        #     create a new column in df called ['motif'] and set value = 1 in that row
        pass

Then I was thinking of applying this function with groupby.apply and combining the resulting groups, but I can't get it right.

Is there an efficient way to achieve this other than using groupby?

Answer 1

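IIUC, you can groupby on 'String', filter out the groups that have only one row, and then call idxmax to get the row label with the max value and assign 1 to that row. Note that idxmax on the filtered column returns a single label, so this marks one row in total; it matches the expected output here because 'EXAMPLE' is the only string that occurs more than once: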

In [201]:
df.loc[df.groupby('String').filter(lambda x: len(x) > 1)['value'].idxmax(), 'motif'] = 1
df

Out[201]:
    String   N  value  motif
0     EXAM  10    250    NaN
1    EXAMP  20    350    NaN
2  EXAMPLE  30    450      1
3  EXAMPLE  40    400    NaN
4      EXA  50    300    NaN
5       EX  60    100    NaN
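
If several different strings can be repeated, one row per repeated string probably needs to be marked. The following is a minimal sketch of such a per-group variant, not the code from the answer above. The helper name mark_group_max and its parameters are hypothetical, and it assumes the DataFrame has unique row labels.

```python
import numpy as np
import pandas as pd

# Example data from the question
df = pd.DataFrame({
    'String': ['EXAM', 'EXAMP', 'EXAMPLE', 'EXAMPLE', 'EXA', 'EX'],
    'N': [10, 20, 30, 40, 50, 60],
    'value': [250, 350, 450, 400, 300, 100],
})


def mark_group_max(frame, key='String', col='value', out='motif'):
    """Hypothetical helper: flag the max-`col` row of every repeated `key`."""
    result = frame.copy()
    result[out] = np.nan
    # Keep only rows whose key occurs more than once
    repeated = result[result[key].duplicated(keep=False)]
    # idxmax on a grouped column yields one row label per group
    top_labels = repeated.groupby(key)[col].idxmax()
    result.loc[top_labels.values, out] = 1
    return result


marked = mark_group_max(df)
```

On the example data this marks only row 2, as in the output above. With two or more repeated strings, each of them gets its own marked row.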

solution1
1 ACCEPTED 2016-02-12 09:05:29
