
Add New Column that returns the min value for unique values in another column

I'm attempting to create a new column called 'jolly' that is populated with the minimum value of the HIR_EveningPrice column for each unique value in RH_RNo.

Here is my current attempt:

def evepricefav(races):
    for race in races:
        clv.loc[clv['RH_RNo'] == race]['HIR_EveningPrice']
    
clv['jolly'] = clv.apply(evepricefav, axis=1)

Here is a sample of the dataframe. You can see a failed attempt has populated the jolly column with 1.4.

        RH_RNo   HIR_BSP   HIR_EveningPrice  value    jolly
794565  189631    28.75              26.0 -0.269565    1.4
794566  189631    15.38              13.0 -0.414824    1.4
794567  189631    15.00               6.0 -0.533333    1.4
794568  189631     4.80               5.0  0.458333    1.4
794569  189631     9.85              13.0  0.522843    1.4
794570  189631     4.30               9.0  0.627907    1.4
794571  189631     5.45               6.0  0.467890    1.4
794572  189631    34.00              17.0 -0.500000    1.4
794573  189631    13.00              11.0 -0.153846    1.4
794574  189634    31.77               9.0 -0.527856    1.4
794575  189634    60.00              26.0 -0.433333    1.4
794576  189634    13.50              17.0  0.925926    1.4
794577  189634     9.20              11.0 -0.130435    1.4
794578  189634     9.80               8.0 -0.081633    1.4
794579  189634    10.00              17.0  0.700000    1.4
794580  189634    11.79              17.0  0.102629    1.4
794581  189634    29.60              21.0  0.148649    1.4
794582  189634     2.99               3.5  0.337793    1.4
794583  189634     8.48               6.0 -0.292453    1.4
794584  189637    18.24              11.0 -0.396930    1.4

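You can group by the RH_RNo column and then use .transform('min') on 'HIR_EveningPrice'. Unlike a plain .min(), which returns one row per group, .transform returns a result aligned to the original rows, so it can be assigned directly as a new column: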

df['jolly'] = df.groupby('RH_RNo')['HIR_EveningPrice'].transform('min')
print(df)

Prints:

        id  RH_RNo  HIR_BSP  HIR_EveningPrice     value  jolly
0   794565  189631    28.75              26.0 -0.269565    5.0
1   794566  189631    15.38              13.0 -0.414824    5.0
2   794567  189631    15.00               6.0 -0.533333    5.0
3   794568  189631     4.80               5.0  0.458333    5.0
4   794569  189631     9.85              13.0  0.522843    5.0
5   794570  189631     4.30               9.0  0.627907    5.0
6   794571  189631     5.45               6.0  0.467890    5.0
7   794572  189631    34.00              17.0 -0.500000    5.0
8   794573  189631    13.00              11.0 -0.153846    5.0
9   794574  189634    31.77               9.0 -0.527856    3.5
10  794575  189634    60.00              26.0 -0.433333    3.5
11  794576  189634    13.50              17.0  0.925926    3.5
12  794577  189634     9.20              11.0 -0.130435    3.5
13  794578  189634     9.80               8.0 -0.081633    3.5
14  794579  189634    10.00              17.0  0.700000    3.5
15  794580  189634    11.79              17.0  0.102629    3.5
16  794581  189634    29.60              21.0  0.148649    3.5
17  794582  189634     2.99               3.5  0.337793    3.5
18  794583  189634     8.48               6.0 -0.292453    3.5
19  794584  189637    18.24              11.0 -0.396930   11.0
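For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes only a handful of rows copied from the question's sample rather than the full frame, and the names race_min and jolly_via_map are made up for illustration. It also shows an equivalent two-step route (one minimum per race, then mapped back onto the rows), which should give the same column:

```python
import pandas as pd

# A few rows copied from the question's sample (not the full frame)
df = pd.DataFrame({
    "RH_RNo": [189631, 189631, 189631, 189634, 189634, 189637],
    "HIR_EveningPrice": [26.0, 13.0, 5.0, 9.0, 3.5, 11.0],
})

# transform('min') returns a Series aligned to df's index,
# so it can be assigned directly as a new column
df["jolly"] = df.groupby("RH_RNo")["HIR_EveningPrice"].transform("min")

# Equivalent two-step route (illustrative names): one minimum per race,
# then look that minimum up for every row by its RH_RNo
race_min = df.groupby("RH_RNo")["HIR_EveningPrice"].min()
jolly_via_map = df["RH_RNo"].map(race_min)

print(df)
```

The transform version is usually the shorter of the two; the map version can be handy if the per-race minimums are also needed on their own.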

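Use the following code:

import pandas as pd

# Small sample frame: 'a' is the grouping key, 'b' holds the values
df = pd.DataFrame({"a": [1, 1, 2, 1, 2], "b": [7, 2, 8, 1, 3]})

# Broadcast each group's maximum of 'b' back onto every row.
# Pass 'min' instead of 'max' to get the minimum the question asks for.
dfc = df.groupby('a')['b']
df = df.assign(jolly=dfc.transform('max'))
print(df)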

Of course, substitute your own DataFrame and column names :)

Output for the sample data:

   a  b  jolly
0  1  7      7
1  1  2      7
2  2  8      8
3  1  1      7
4  2  3      8
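Note that this answer takes the maximum per group, whereas the question asks for the minimum. A minimal sketch of the same idea with 'min', using the same toy data (df_min is just an illustrative name):

```python
import pandas as pd

# Same toy data as above: 'a' is the grouping key, 'b' holds the values
df = pd.DataFrame({"a": [1, 1, 2, 1, 2], "b": [7, 2, 8, 1, 3]})

# Broadcast each group's minimum of 'b' back onto every row
df_min = df.assign(jolly=df.groupby("a")["b"].transform("min"))
print(df_min)
```

Here group 1 has the values 7, 2 and 1, so its rows get 1, and group 2 has 8 and 3, so its rows get 3.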
