Pandas dataframe scale column based on another column

I've got a Dataframe that looks like this:

    cat   val
0     1    10
1     1     4
2     2     6
3     2     2
4     1     8
5     2    12

Here cat is the category and val is the value. I would like to create a column called scaled that is linearly scaled (min-max normalized) to the range 0-1 on a per-category basis. I know how to do the scaling, (val - min) / (max - min), at the column level, and I know how to perform operations on a per-category basis; I just don't know how to combine the two. The desired result is:

    cat   val  scaled
0     1    10       1  
1     1     4       0
2     2     6     0.4
3     2     2       0
4     1     8   0.667
5     2    12       1

Ideally I'd like to stick to using Pandas only.

Any help would be appreciated, thank you!

Your scaling is to subtract the min and divide by the range, so use groupby + transform to broadcast those properties back to every row for that group and do the math.

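import pandas as pd

df = pd.DataFrame({'cat': [1, 1, 2, 2, 1, 2], 'val': [10, 4, 6, 2, 8, 12]})
gp = df.groupby('cat')['val']

# Broadcast each group's min and max back to every row, then scale to 0-1.
# String names are used because recent pandas versions warn when builtins such as min are passed.
gp_min = gp.transform('min')
df['scaled'] = (df['val'] - gp_min) / (gp.transform('max') - gp_min)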

   cat  val    scaled
0    1   10  1.000000
1    1    4  0.000000
2    2    6  0.400000
3    2    2  0.000000
4    1    8  0.666667
5    2   12  1.000000
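
One edge case the sample data does not exercise: if a category has a single row, or all of its values are equal, the range is 0 and the division yields NaN. The sketch below is one possible way to handle that, under the assumption that such rows should be scaled to 0; the data and the names lo and rng are made up for illustration.

```python
import pandas as pd

# Hypothetical data: category 2 has one row, category 3 is constant
df = pd.DataFrame({'cat': [1, 1, 2, 3, 3], 'val': [10, 4, 6, 5, 5]})
gp = df.groupby('cat')['val']

lo = gp.transform('min')
rng = gp.transform('max') - lo

# Mask zero ranges as NaN so the division never sees 0/0,
# then assign 0.0 to those rows (an assumption, pick what suits your data)
df['scaled'] = ((df['val'] - lo) / rng.where(rng != 0)).fillna(0.0)
```

Whether 0, 0.5, or NaN is the right value for a constant group depends on how the scaled column is used downstream.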

For aggregations that reduce to a scalar, groupby + agg/apply returns a single row per group; groupby + transform, however, returns a Series indexed like the original, so it aligns with the original DataFrame.

gp.min()
#cat
#1    4
#2    2
#Name: val, dtype: int64

gp.transform('min')
#0    4
#1    4
#2    2
#3    2
#4    4
#5    2
#Name: val, dtype: int64
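
To make the alignment point concrete, here is a small sketch showing that transform gives the same values you would get by aggregating first and then looking up each row's category by hand. The names per_group, mapped, and broadcast are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'cat': [1, 1, 2, 2, 1, 2], 'val': [10, 4, 6, 2, 8, 12]})
gp = df.groupby('cat')['val']

per_group = gp.min()               # one row per category, indexed by cat
mapped = df['cat'].map(per_group)  # look up each row's category minimum
broadcast = gp.transform('min')    # same values, already on df's index

# Both routes end up with one value per original row
same = (mapped.to_numpy() == broadcast.to_numpy()).all()
```

transform simply saves the explicit map (or merge) step.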

You can use the following code to scale one column based on the groups in another:

import pandas as pd

df = pd.DataFrame({'Group': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3], 'Values': [1, 4, -2, 7, 3, 4, 1, -5, 12, 4, 10, 2, 6, 20, 15]})

# Normalize around the mean (z-score): subtract the group mean, then divide by the group std
df['mean_normal'] = df.groupby('Group')['Values'].transform(lambda x: (x - x.mean()) / x.std())
# Normalize between 0 and 1
df['min_max_normal'] = df.groupby('Group')['Values'].transform(lambda x: (x - x.min()) / (x.max() - x.min()))
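
As a quick sanity check (a sketch, not part of the answer above), the z-score column should have a per-group mean of 0 and a sample standard deviation of 1, and the min-max column should span exactly 0 to 1 within each group. The variable names here are illustrative, and the string-based transform calls are just an equivalent way of writing the same two formulas.

```python
import pandas as pd

df = pd.DataFrame({'Group': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3],
                   'Values': [1, 4, -2, 7, 3, 4, 1, -5, 12, 4, 10, 2, 6, 20, 15]})
values = df.groupby('Group')['Values']

# Same two normalizations, written with broadcast group statistics
z = (df['Values'] - values.transform('mean')) / values.transform('std')
lo = values.transform('min')
mm = (df['Values'] - lo) / (values.transform('max') - lo)

# Summarize each normalized column per group
z_stats = z.groupby(df['Group']).agg(['mean', 'std'])
mm_stats = mm.groupby(df['Group']).agg(['min', 'max'])
print(z_stats.round(6))
print(mm_stats)
```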
