Say I have the following data frame:
ID  Brick  Vinyl  Stone
1   Yes    No     No
2   No     Yes    No
3   No     No     Yes
4   Yes    Yes    No
5   No     No     No
How would I create a new column based on the names of these columns so that I ended up with the following?
ID  Brick  Vinyl  Stone  Type
1   Yes    No     No     Brick
2   No     Yes    No     Vinyl
3   No     No     Yes    Stone
4   Yes    Yes    No     Multiple
5   No     No     No     Other
Note that ID 4 is 'Yes' in multiple columns and ID 5 is 'No' in all of them. The values recorded in 'Type' for those two rows don't have to be 'Multiple' or 'Other' specifically; if there is a default way of recording this information, that will work just as well. Thank you!
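You could flag the 'Yes' cells, count them per row, and pick the label with np.where (assuming numpy is imported as np):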
In [146]: s = df[['Brick', 'Vinyl', 'Stone']].eq('Yes')
In [147]: sm = s.sum(axis=1)
In [148]: df['Type'] = np.where(sm.eq(0), 'Other',
     ...:                       np.where(sm.gt(1), 'Multiple', s.idxmax(axis=1)))
In [149]: df
Out[149]:
   ID Brick Vinyl Stone      Type
0   1   Yes    No    No     Brick
1   2    No   Yes    No     Vinyl
2   3    No    No   Yes     Stone
3   4   Yes   Yes    No  Multiple
4   5    No    No    No     Other
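For anyone who wants to run this outside an interactive session, here is a minimal self-contained sketch of the same idea. It rebuilds the example frame from the question and swaps the nested np.where for pandas' Series.mask, which is a substitution on my part rather than what the answer above uses. The function name add_type_column and its none_label / multi_label parameters are hypothetical, chosen only for illustration. It treats any row with more than one 'Yes' as 'Multiple', so a row with three 'Yes' values is covered too.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Brick": ["Yes", "No", "No", "Yes", "No"],
    "Vinyl": ["No", "Yes", "No", "Yes", "No"],
    "Stone": ["No", "No", "Yes", "No", "No"],
})


def add_type_column(frame, cols, none_label="Other", multi_label="Multiple"):
    """Return a copy of `frame` with a 'Type' column derived from `cols`.

    Hypothetical helper: the name and the label parameters are assumptions.
    """
    is_yes = frame[cols].eq("Yes")      # boolean mask of the 'Yes' cells
    n_yes = is_yes.sum(axis=1)          # number of 'Yes' values per row

    # idxmax returns the first column that is True in each row. For rows
    # with no 'Yes' it still returns the first column, so those rows are
    # overwritten below.
    labels = is_yes.idxmax(axis=1)
    labels = labels.mask(n_yes.eq(0), none_label)   # no 'Yes' at all
    labels = labels.mask(n_yes.gt(1), multi_label)  # two or more 'Yes'
    return frame.assign(Type=labels)


result = add_type_column(df, ["Brick", "Vinyl", "Stone"])
print(result)
```

Because the helper uses assign, the original df is left untouched; assign the result back to df if you want the column added in place.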