I have a pandas DataFrame that looks like this:
Col1 | Col2 | INDX |
---|---|---|
10 | 20 | 0 |
30 | 40 | 1 |
50 | 60 | 1 |
70 | 80 | 0 |
For each row, I want to select the value from either Col1 or Col2 based on the value in INDX. So the output in the above case should be [10, 40, 60, 70].
I did this by looping through each row of the DataFrame, but it's quite slow. Is there a faster way to accomplish this?
Dummy test code:
for i in np.arange(0, df.shape[0]):
    print(df.iloc[i, df['INDX'][i]])
Try lookup:
cols = df.columns[:2]
df.lookup(df.index, cols[df.INDX])
Output:
array([10, 40, 60, 70])
Update: As Scott commented, lookup is deprecated (since pandas 1.2, and removed in pandas 2.0). We can resort to NumPy indexing instead:
df[cols].to_numpy()[np.arange(len(df)), df['INDX']]
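For reference, here is a minimal self-contained sketch of the NumPy indexing approach, using the sample data from the question. The variable names (values, row_idx, col_idx) are only illustrative, and it assumes INDX holds valid 0-based positions into the listed columns. The np.where line is an alternative that only applies when there are exactly two columns to choose from.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Col1": [10, 30, 50, 70],
    "Col2": [20, 40, 60, 80],
    "INDX": [0, 1, 1, 0],
})

cols = ["Col1", "Col2"]             # columns that INDX selects between
values = df[cols].to_numpy()        # 2-D array of shape (n_rows, 2)
row_idx = np.arange(len(df))        # 0, 1, ..., n_rows - 1
col_idx = df["INDX"].to_numpy()     # column position to pick in each row

# Integer-array indexing: picks values[i, col_idx[i]] for every row i.
result = values[row_idx, col_idx]
print(result.tolist())              # [10, 40, 60, 70]

# Alternative for the two-column case only: choose element-wise.
alt = np.where(col_idx == 0, df["Col1"].to_numpy(), df["Col2"].to_numpy())
print(alt.tolist())                 # [10, 40, 60, 70]
```

Both versions avoid the Python-level loop, so they should scale much better than calling iloc once per row.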