I have a pandas DataFrame that looks like this:
    MED1   MED2   MED3   MED4   MED5
0  60735  24355  33843  16475   9995
1  10126   5789  17165  90000  90000
2   5789  19675  30553  90000  90000
3  60735  17865  34495  90000  90000
4  19675   5810  90000  90000  90000
I want to create a new bool column "med" that is True when 60735 appears in any of the columns MED1...MED5 and False otherwise. I am trying this and am not sure how to make it work:
DF['med'] = (60735 in [DF['MED1'], DF['MED2']])
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
MED1..MED5 represent drugs being taken by a patient at a hospital visit. I have a list of about 20 drugs for which I need to know if the patient was taking them. Each drug is coded with a number but also has a name. A nice solution would look something like the code below, but how do I do this with pandas?
drugs = {'drug1':60735, 'drug2':5789}
for n in drugs.keys():
    DF[n] = drugs[n] in DF[['MED1', 'MED2', 'MED3', 'MED4', 'MED5']]
@Mai's answer will of course work, but it may be a bit more standard to write it with the | operator:
df['med'] = (df['MED1'] == 60735) | (df['MED2'] == 60735)
If you want to check for a value in all (or many) columns, you could also use isin as below. The isin checks whether each cell holds one of the values in the list, and the any(axis=1) returns True if any element in each row is True.
df['med'] = df.isin([60735]).any(axis=1)
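To make the isin approach concrete, here is a minimal sketch that rebuilds the sample frame from the question and restricts the check to the MED columns. The med_cols list is a name I made up for illustration, and I have written the keyword form axis=1 rather than the positional any(1), since recent pandas releases expect the keyword.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "MED1": [60735, 10126, 5789, 60735, 19675],
    "MED2": [24355, 5789, 19675, 17865, 5810],
    "MED3": [33843, 17165, 30553, 34495, 90000],
    "MED4": [16475, 90000, 90000, 90000, 90000],
    "MED5": [9995, 90000, 90000, 90000, 90000],
})

# med_cols is a hypothetical helper name, not from the original post.
med_cols = ["MED1", "MED2", "MED3", "MED4", "MED5"]

# isin gives a frame of booleans; any(axis=1) collapses each row to one bool.
df["med"] = df[med_cols].isin([60735]).any(axis=1)
print(df["med"].tolist())
```

Selecting the MED columns explicitly keeps the check from picking up other columns that might be added to the frame later.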
Edit: Based on your edited question, would this work?
for n in drugs:
    df[n] = df[['MED1','MED2','MED3','MED4','MED5']].isin([drugs[n]]).any(axis=1)
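For what it's worth, the loop can be tried end to end on the sample data. This is only a sketch: the drugs dict is the two-entry example from the question, and med_cols is a name I introduced so that the boolean columns created inside the loop are never searched themselves.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "MED1": [60735, 10126, 5789, 60735, 19675],
    "MED2": [24355, 5789, 19675, 17865, 5810],
    "MED3": [33843, 17165, 30553, 34495, 90000],
    "MED4": [16475, 90000, 90000, 90000, 90000],
    "MED5": [9995, 90000, 90000, 90000, 90000],
})

# Drug name -> code, as in the question (the real list would have about 20).
drugs = {"drug1": 60735, "drug2": 5789}

# Hypothetical helper: fix the searched columns once, before adding new ones.
med_cols = ["MED1", "MED2", "MED3", "MED4", "MED5"]

for name, code in drugs.items():
    # One boolean column per drug: True where any MED column holds its code.
    df[name] = df[med_cols].isin([code]).any(axis=1)

print(df[list(drugs)])
```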
I am still confused. But part of what you want may be this:
import numpy as np
DF['med'] = np.logical_or(DF['MED1'] == 60735, DF['MED2'] == 60735)
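If more than two columns need checking, one possible extension of the same idea is np.logical_or.reduce, which ORs together a whole list of comparisons. This is a sketch under my own assumptions: the med_cols list is a made-up name, and the frame is rebuilt from the sample in the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
DF = pd.DataFrame({
    "MED1": [60735, 10126, 5789, 60735, 19675],
    "MED2": [24355, 5789, 19675, 17865, 5810],
    "MED3": [33843, 17165, 30553, 34495, 90000],
    "MED4": [16475, 90000, 90000, 90000, 90000],
    "MED5": [9995, 90000, 90000, 90000, 90000],
})

# Hypothetical helper name for the columns to search.
med_cols = ["MED1", "MED2", "MED3", "MED4", "MED5"]

# Build one boolean Series per column, then OR them all together row by row.
DF["med"] = np.logical_or.reduce([DF[c] == 60735 for c in med_cols])
print(DF["med"].tolist())
```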
Here are a few %timeit comparisons of some methods to return bools from a dataframe column.
In [2]: %timeit df['med'] = [bool(x) if int(60735) in x else False for x in enumerate(df['MED1'])]
1000 loops, best of 3: 379 µs per loop
In [3]: %timeit df['med'] = (df['MED1'] == 60735)
1000 loops, best of 3: 649 µs per loop
In [4]: %timeit df['med'] = df['MED1'].isin([60735])
1000 loops, best of 3: 404 µs per loop
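Outside IPython, a comparison like this can be roughly reproduced with the standard timeit module. The sketch below is an assumption-laden stand-in, not the setup used for the numbers above: it uses only the five MED1 values from the question, and the function names are mine. Timings will vary with machine, pandas version, and frame size, so treat the output as indicative only.

```python
import timeit

import pandas as pd

# Only the MED1 column from the question's sample data.
df = pd.DataFrame({"MED1": [60735, 10126, 5789, 60735, 19675]})

def with_eq():
    # Element-wise comparison against a single code.
    return df["MED1"] == 60735

def with_isin():
    # Membership test against a one-element list.
    return df["MED1"].isin([60735])

# Both approaches should produce the same boolean Series.
same = with_eq().equals(with_isin())

# Total seconds for 1000 calls of each function.
t_eq = timeit.timeit(with_eq, number=1000)
t_isin = timeit.timeit(with_isin, number=1000)
print(f"==: {t_eq:.4f} s, isin: {t_isin:.4f} s, same result: {same}")
```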