
Recoding multiple integer variables into one in python

Each record represents a person. The code 250.000 indicates diabetes, and I would like to create a DXDiabetes column that is 1 if 250 appears in any of Code1, Code2, or Code3.

import pandas as pd
data_prep = pd.DataFrame({"Code1" : [250.000,276.000,401.000,414.000], 
                     "Code2" : [403.000,411.000,414.000,250.000],
                     "Code3" : [427.000,250.000,486.000,682.000]})

data_prep

However, I'm not keeping the "1" coding from Code1 as I move to Code3. DXDiabetes is only keeping the last recode.

data_prep['DXDiabetes']=data_prep['Code1'].apply(lambda x: 1 if round(x,0) == 250 else 0)
data_prep['DXDiabetes']=data_prep['Code2'].apply(lambda x: 1 if round(x,0) == 250 else None)
data_prep['DXDiabetes']=data_prep['Code3'].apply(lambda x: 1 if round(x,0) == 250 else None)


print(data_prep['DXDiabetes'].value_counts())

Is there a way to have DXDiabetes = 1 if any of Code1, Code2, or Code3 == 250?

Many thanks,

Sandra

You can use np.where, assigning 1 where the condition is True and 0 where it is False. The condition checks, row by row, whether any of the three columns equals 250.

import numpy as np

data_prep['DXDiabetes'] = np.where(
    data_prep[['Code1', 'Code2', 'Code3']].eq(250).any(axis=1), 1, 0)

>>> data_prep
   Code1  Code2  Code3  DXDiabetes
0  250.0  403.0  427.0           1
1  276.0  411.0  250.0           1
2  401.0  414.0  486.0           0
3  414.0  250.0  682.0           1

Note that you first check for equality:

>>> data_prep[['Code1', 'Code2', 'Code3']].eq(250)
   Code1  Code2  Code3
0   True  False  False
1  False  False   True
2  False  False  False
3  False   True  False

Then you check whether any value in each row above is True by specifying .any(axis=1).

>>> data_prep[['Code1', 'Code2', 'Code3']].eq(250).any(axis=1)
0     True
1     True
2    False
3     True
dtype: bool
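
The question compares round(x, 0) to 250 rather than testing exact equality. If that rounding matters for the real data (an assumption; the sample values are all exact), a minimal sketch of the same np.where approach with the rounding kept could look like this. The names code_cols and is_diabetes are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
data_prep = pd.DataFrame({
    "Code1": [250.000, 276.000, 401.000, 414.000],
    "Code2": [403.000, 411.000, 414.000, 250.000],
    "Code3": [427.000, 250.000, 486.000, 682.000],
})

# Hypothetical helper name: the columns that hold diagnosis codes.
code_cols = ["Code1", "Code2", "Code3"]

# Round first, mirroring round(x, 0) == 250 from the question,
# then test each row for at least one match.
is_diabetes = data_prep[code_cols].round(0).eq(250).any(axis=1)

# Convert the boolean mask to the 1/0 coding the question asks for.
data_prep["DXDiabetes"] = np.where(is_diabetes, 1, 0)
print(data_prep)
```

Note that rounding treats any value between roughly 249.5 and 250.5 as 250; whether that is the intended rule depends on how the codes are stored.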

The following should work:

data_prep['DXDiabetes']=data_prep.apply(lambda x: 1 if any(i==250 for i in x) else 0, axis=1)

>>> print(data_prep)
   Code1  Code2  Code3  DXDiabetes
0  250.0  403.0  427.0           1
1  276.0  411.0  250.0           1
2  401.0  414.0  486.0           0
3  414.0  250.0  682.0           1
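
One caveat with the row-wise apply is that it scans every column in the frame, so an unrelated column that happened to contain 250 would also set the flag. A hedged sketch that restricts the check to the three code columns is shown below, in both a vectorized form and a row-wise form. The names code_cols and DXDiabetes_apply are made up for this example.

```python
import pandas as pd

# Sample data copied from the question.
data_prep = pd.DataFrame({
    "Code1": [250.000, 276.000, 401.000, 414.000],
    "Code2": [403.000, 411.000, 414.000, 250.000],
    "Code3": [427.000, 250.000, 486.000, 682.000],
})

# Hypothetical helper name: only these columns are searched for 250.
code_cols = ["Code1", "Code2", "Code3"]

# Vectorized: build a boolean mask per row and cast True/False to 1/0.
data_prep["DXDiabetes"] = data_prep[code_cols].eq(250).any(axis=1).astype(int)

# Row-wise equivalent, limited to the code columns
# (stored under a hypothetical second column name for comparison).
data_prep["DXDiabetes_apply"] = data_prep[code_cols].apply(
    lambda row: int((row == 250).any()), axis=1
)
print(data_prep)
```

Both columns should agree on this sample; the vectorized form is usually the faster of the two on large frames.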
