I am working with ICD-9 codes for a data mining project using Python, and I am having trouble converting the specific codes into categories. For example, I am trying to replace everything between 001 and 139 with 0, everything between 140 and 239 with 1, and so on.
This is what I have tried:
df = df.replace({'diag_1' : {'(1-139)' : 0, '(140-239)' : 1}})
You can use pd.cut to achieve this:
In [175]:
import numpy as np
import pandas as pd

df = pd.DataFrame({'value': np.random.randint(0, 20, 10)})
df
Out[175]:
value
0 12
1 2
2 10
3 5
4 19
5 2
6 8
7 14
8 12
9 16
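Here we set the bin intervals to (0, 5], (5, 15] and (15, 20]. By default pd.cut includes the right edge of each bin and excludes the left one, so a value of 5 falls in the first bin and a value of 0 would become NaN: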
In [183]:
df['new_value'] = pd.cut(df['value'], bins=[0,5,15,20], labels=[0,1,2])
df
Out[183]:
   value new_value
0     12         1
1      2         0
2     10         1
3      5         0
4     19         2
5      2         0
6      8         1
7     14         1
8     12         1
9     16         2
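I think in your case the following should work, assuming diag_1 holds numeric codes. The number of labels must be one less than the number of bin edges, and right=False makes the bins [1, 140) and [140, 240), so that 1-139 maps to 0 and 140-239 maps to 1:
df['diag_1'] = pd.cut(df['diag_1'], bins=[1, 140, 240], right=False, labels=[0, 1])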
You can set the bins and labels dynamically using np.arange or similar.
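As a rough sketch of generating the labels from the bin edges, the example below extends the two bins above to the full set of numeric ICD-9 chapters. The edge list, the sample data, and the diag_cat column name are assumptions added for illustration and should be checked against your own code list. It also assumes the codes may be stored as strings, so non-numeric codes such as V or E codes are coerced to NaN rather than binned.

```python
import pandas as pd

# Lower bound of each numeric ICD-9 chapter, plus a final upper edge.
# Assumed for illustration; verify against the code list you are using.
edges = [1, 140, 240, 280, 290, 320, 390, 460, 520,
         580, 630, 680, 710, 740, 760, 780, 800, 1000]

# One label per bin: there is always one bin fewer than there are edges.
labels = list(range(len(edges) - 1))

# Hypothetical sample data; codes are often stored as strings.
df = pd.DataFrame({'diag_1': ['8', '139', '140', '250.01', '428', 'V27', '999']})

# Non-numeric codes (e.g. 'V27') become NaN instead of raising an error.
codes = pd.to_numeric(df['diag_1'], errors='coerce')

# right=False gives left-closed bins: [1, 140), [140, 240), ...
df['diag_cat'] = pd.cut(codes, bins=edges, right=False, labels=labels)
print(df)
```

Writing to a new column rather than overwriting diag_1 keeps the original codes available for checking the mapping.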
There is nothing wrong with an if-statement.
newvalue = 1 if oldvalues <= 139 else 2
Apply this function as a lambda expression with map.
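A minimal sketch of that approach, assuming diag_1 already holds numeric codes and only the first two ranges from the question are present. The sample data and the diag_cat column name are made up for illustration, and the labels 0 and 1 follow the question rather than the 1 and 2 used in the one-liner above.

```python
import pandas as pd

# Hypothetical sample data with numeric codes only.
df = pd.DataFrame({'diag_1': [8, 139, 140, 239]})

# Codes up to 139 map to category 0, everything else to category 1.
df['diag_cat'] = df['diag_1'].map(lambda code: 0 if code <= 139 else 1)
print(df)
```

This stays readable for two or three ranges; with many ranges the chained conditionals grow quickly, and pd.cut is usually easier to maintain.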