I have a dataset example shown here:
import pandas as pd

df = pd.DataFrame({'product' : ['a', 'a', 'c', 'c', 'd', 'b', 'a', 'b', 'c'],
                   'unit' : ['ng/L', 'k/uL', 'x10(3)/mcL', 'x10(3)/mcL', 'k/uL', 'ng/L', 'ng/L', 'sss', 'sss'],
                   'value' : [0.2, 1.0, 67.0, 71.5, 23.2, 71.0, 0.44, 59.3, 12.7],
                   'market_penetration_rate' : [0.82, 0.64, 77.5, 12.5, 22.5, 88.0, 0.34, 98.2, 87.4]})
I want to select all rows where 'product' is 'a' and 'unit' is 'ng/L', then divide 'value' by 1000 and set 'unit' to 'ng/mL'.
I nearly have it working, but I don't know how to take the existing value and divide it by 1000 in the code below:
df.loc[(df['product'] == 'a') & (df['unit']== 'ng/L'), ['value', 'unit']] = ['value'/1000 ,'ng/mL']
In place of 'value'/1000 what do I put? If I just used a constant in the square brackets then it works, but I want to grab the value it already is and divide.
You are close, but the two columns need different operations (a division for 'value', a constant for 'unit'), so I think you need to update them in separate statements:
mask = df['product'].eq('a') & df['unit'].eq('ng/L')
# update value
df.loc[mask, 'value'] /= 1000
# update unit
df.loc[mask,'unit']='ng/mL'
Output:
  product        unit     value  market_penetration_rate
0       a       ng/mL   0.00020                     0.82
1       a        k/uL   1.00000                     0.64
2       c  x10(3)/mcL  67.00000                    77.50
3       c  x10(3)/mcL  71.50000                    12.50
4       d        k/uL  23.20000                    22.50
5       b        ng/L  71.00000                    88.00
6       a       ng/mL   0.00044                     0.34
7       b         sss  59.30000                    98.20
8       c         sss  12.70000                    87.40
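If the same conversion is needed for several product/unit pairs, the two-step update above can be wrapped in a small helper. The sketch below is only an illustration: the function name convert_unit and its parameters are hypothetical (not part of pandas), and it works on a copy so the original frame is left untouched.

```python
import pandas as pd

def convert_unit(df, product, old_unit, new_unit, divisor):
    """Hypothetical helper: rescale 'value' and relabel 'unit' for matching rows."""
    out = df.copy()  # work on a copy so the caller's frame is not modified
    mask = out['product'].eq(product) & out['unit'].eq(old_unit)
    # Numeric update: divide only the selected values
    out.loc[mask, 'value'] = out.loc[mask, 'value'] / divisor
    # Label update: assign the new unit string to the same rows
    out.loc[mask, 'unit'] = new_unit
    return out

df = pd.DataFrame({'product': ['a', 'a', 'c', 'c', 'd', 'b', 'a', 'b', 'c'],
                   'unit': ['ng/L', 'k/uL', 'x10(3)/mcL', 'x10(3)/mcL', 'k/uL', 'ng/L', 'ng/L', 'sss', 'sss'],
                   'value': [0.2, 1.0, 67.0, 71.5, 23.2, 71.0, 0.44, 59.3, 12.7],
                   'market_penetration_rate': [0.82, 0.64, 77.5, 12.5, 22.5, 88.0, 0.34, 98.2, 87.4]})

converted = convert_unit(df, 'a', 'ng/L', 'ng/mL', 1000)
print(converted)
```

Only rows 0 and 6 match both conditions here; row 5 has unit 'ng/L' but product 'b', so it is left alone.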
You can use df.assign and np.where to compute the required values:
df = df.assign(
    unit=np.where(df['product'].eq('a') & df['unit'].eq('ng/L'), 'ng/ml', df['unit']),
    value=np.where(df['product'].eq('a') & df['unit'].eq('ng/L'), df['value'] / 1000, df['value']),
)
  product        unit     value  market_penetration_rate
0       a       ng/ml   0.00020                     0.82
1       a        k/uL   1.00000                     0.64
2       c  x10(3)/mcL  67.00000                    77.50
3       c  x10(3)/mcL  71.50000                    12.50
4       d        k/uL  23.20000                    22.50
5       b        ng/L  71.00000                    88.00
6       a       ng/ml   0.00044                     0.34
7       b         sss  59.30000                    98.20
8       c         sss  12.70000                    87.40
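A slightly tidier variant of the np.where approach, as a rough sketch: the condition is built once and reused for both columns. The names mask and result are just illustrative, and the unit is spelled 'ng/ml' as in this answer. Both np.where calls are evaluated against the original df before assign runs, so the order of the keyword arguments does not matter here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'product': ['a', 'a', 'c', 'c', 'd', 'b', 'a', 'b', 'c'],
                   'unit': ['ng/L', 'k/uL', 'x10(3)/mcL', 'x10(3)/mcL', 'k/uL', 'ng/L', 'ng/L', 'sss', 'sss'],
                   'value': [0.2, 1.0, 67.0, 71.5, 23.2, 71.0, 0.44, 59.3, 12.7],
                   'market_penetration_rate': [0.82, 0.64, 77.5, 12.5, 22.5, 88.0, 0.34, 98.2, 87.4]})

# Build the row selector once: product 'a' measured in 'ng/L'
mask = df['product'].eq('a') & df['unit'].eq('ng/L')

# assign returns a new frame; where mask is False the original entries are kept
result = df.assign(
    unit=np.where(mask, 'ng/ml', df['unit']),
    value=np.where(mask, df['value'] / 1000, df['value']),
)
print(result)
```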