With a dataset such as this:
famid birth age ht
0 1 1 one 2.8
1 1 1 two 3.4
2 1 2 one 2.9
3 1 2 two 3.8
4 1 3 one 2.2
5 1 3 two 2.9
...where we've got values for a variable ht for different categories of, for example, age, I would like to adjust a subset of the data in df['ht'] where df['age'] == 'one' only. And I would like to do it without creating a new column.
I've tried:
df[df['age']=='one']['ht'] = df[df['age']=='one']['ht']*10**6
But to my mild surprise the numbers don't change, perhaps because the warning "A value is trying to be set on a copy of a slice from a DataFrame" is triggered in the same run. I've also tried df.mask() and df.where(), but to no avail. I'm clearly failing at something very basic here, and I'd really like to know how to do this properly. There are similar-sounding questions, such as "Performing calculations on subset of data frame subset in Python", but the solutions suggested there point towards df.groupby(), and I don't think that is necessarily the right approach here.
Thank you for any suggestions!
Here's a fully reproducible dataset:
import pandas as pd
df = pd.DataFrame({
'famid': [1, 1, 1, 2, 2, 2, 3, 3, 3],
'birth': [1, 2, 3, 1, 2, 3, 1, 2, 3],
'ht_one': [2.8, 2.9, 2.2, 2, 1.8, 1.9, 2.2, 2.3, 2.1],
'ht_two': [3.4, 3.8, 2.9, 3.2, 2.8, 2.4, 3.3, 3.4, 2.9]
})
df = pd.wide_to_long(df, stubnames='ht', i=['famid', 'birth'], j='age',
sep='_', suffix=r'\w+')
df.reset_index(inplace = True)
Let's try this:
df.loc[df['age'] == 'one', 'ht'] *= 10**6
Output:
famid birth age ht
0 1 1 one 2800000.0
1 1 1 two 3.4
2 1 2 one 2900000.0
3 1 2 two 3.8
4 1 3 one 2200000.0
5 1 3 two 2.9
6 2 1 one 2000000.0
7 2 1 two 3.2
8 2 2 one 1800000.0
9 2 2 two 2.8
10 2 3 one 1900000.0
11 2 3 two 2.4
12 3 1 one 2200000.0
13 3 1 two 3.3
14 3 2 one 2300000.0
15 3 2 two 3.4
16 3 3 one 2100000.0
17 3 3 two 2.9
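To illustrate why the original attempt leaves df untouched while the .loc version works, here is a minimal sketch. It assumes a small four-row frame built from the first rows of the question's data (not the full dataset), and the variable name is_one is only illustrative.

```python
import pandas as pd

# Small frame mirroring the question's 'age' and 'ht' columns
# (an assumption for illustration; only the first four rows are used).
df = pd.DataFrame({
    'age': ['one', 'two', 'one', 'two'],
    'ht': [2.8, 3.4, 2.9, 3.8],
})

# Boolean Series: True on the rows where age is 'one'.
is_one = df['age'] == 'one'

# Chained indexing, as in the question. df[is_one] produces a temporary
# object, so the assignment is applied to that temporary and df is not
# updated. Left commented out because it only raises the warning.
# df[is_one]['ht'] = df[is_one]['ht'] * 10**6

# A single .loc call selects rows and column in one indexing operation
# on df itself, so the multiplication is written back to df.
df.loc[is_one, 'ht'] *= 10**6

print(df)
```

The key difference is that df[...][...] is two separate indexing steps, whereas df.loc[rows, column] is one, which is what lets pandas write to the original frame.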
Here is a way:
df.assign(ht = df['ht'].mask(df['age'].isin(['one']),df['ht'].mul(10**6)))
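By using isin(), more values from the age column can be added to the list. Note that assign() returns a new DataFrame rather than modifying df, so the result has to be assigned back (df = df.assign(...)) to keep the change.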
Output:
famid birth age ht
0 1 1 one 2800000.0
1 1 1 two 3.4
2 1 2 one 2900000.0
3 1 2 two 3.8
4 1 3 one 2200000.0
5 1 3 two 2.9
6 2 1 one 2000000.0
7 2 1 two 3.2
8 2 2 one 1800000.0
9 2 2 two 2.8
10 2 3 one 1900000.0
11 2 3 two 2.4
12 3 1 one 2200000.0
13 3 1 two 3.3
14 3 2 one 2300000.0
15 3 2 two 3.4
16 3 3 one 2100000.0
17 3 3 two 2.9
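As a rough sketch of how the isin() variant extends to several categories, the example below uses a made-up frame with a hypothetical third category 'three' that does not appear in the question's data. It also shows the equivalent where() call, which is the logical inverse of mask().

```python
import pandas as pd

# Illustrative frame; the 'three' category is hypothetical and only
# serves to show isin() matching more than one value.
df = pd.DataFrame({
    'age': ['one', 'two', 'three', 'one'],
    'ht': [2.8, 3.4, 3.0, 2.9],
})

# True on rows whose age is any of the listed values.
selected = df['age'].isin(['one', 'three'])

# mask() replaces values where the condition is True.
scaled_with_mask = df['ht'].mask(selected, df['ht'] * 10**6)

# where() keeps values where the condition is True, so the condition
# is negated to obtain the same result.
scaled_with_where = df['ht'].where(~selected, df['ht'] * 10**6)

# assign() returns a new DataFrame; rebinding df keeps the change
# without adding a new column.
df = df.assign(ht=scaled_with_mask)

print(df)
```

Only the row with age 'two' keeps its original value here; the other three rows are scaled.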