How do I create one column based on the names of multiple other columns in Python?

Say I have the following data frame:

ID   Brick    Vinyl     Stone
1    Yes      No         No
2    No       Yes        No
3    No       No         Yes
4    Yes      Yes        No
5    No       No         No

How would I create a new column based on the names of these columns so that I ended up with the following?

ID   Brick    Vinyl     Stone    Type
1    Yes      No         No      Brick
2    No       Yes        No      Vinyl
3    No       No         Yes     Stone
4    Yes      Yes        No      Multiple
5    No       No         No      Other

Note that IDs 4 and 5 are either 'Yes' for multiple columns or are all 'No'. The response I have recorded in 'Type' for those two entries doesn't have to be 'Multiple' or 'Other' specifically - if there is a default way of recording the desired information that will work just as well. Thank you!

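You could count the 'Yes' values in each row and choose the label with nested np.where calls (this assumes numpy is imported as np and the data frame is named df):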

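In [146]: s = df[['Brick', 'Vinyl', 'Stone']].eq('Yes')  # True where the cell is 'Yes'

In [147]: sm = s.sum(axis=1)  # number of 'Yes' values per row

In [148]: df['Type'] = np.where(sm.eq(0), 'Other',
                                # gt(1) rather than eq(2), so rows with all three 'Yes' also count as 'Multiple'
                                np.where(sm.gt(1), 'Multiple', s.idxmax(axis=1)))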

In [149]: df
Out[149]:
   ID Brick Vinyl Stone      Type
0   1   Yes    No    No     Brick
1   2    No   Yes    No     Vinyl
2   3    No    No   Yes     Stone
3   4   Yes   Yes    No  Multiple
4   5    No    No    No     Other
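
For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the same idea. It is not the original answer's code: it swaps the nested np.where for np.select, and the names material_cols, is_yes, yes_count and the extra Materials column are made up for illustration. The Materials column is one possible "default" way to record the information, listing every matching column name instead of collapsing to 'Multiple' or 'Other'.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Brick": ["Yes", "No", "No", "Yes", "No"],
    "Vinyl": ["No", "Yes", "No", "Yes", "No"],
    "Stone": ["No", "No", "Yes", "No", "No"],
})

# Hypothetical helper names, chosen for this sketch only.
material_cols = ["Brick", "Vinyl", "Stone"]
is_yes = df[material_cols].eq("Yes")   # boolean mask of 'Yes' cells
yes_count = is_yes.sum(axis=1)         # number of 'Yes' values per row

# np.select picks the first matching condition per row; rows that match
# neither condition have exactly one 'Yes', so idxmax gives that column name.
df["Type"] = np.select(
    [yes_count.eq(0), yes_count.gt(1)],
    ["Other", "Multiple"],
    default=is_yes.idxmax(axis=1).to_numpy(),
)

# Alternative: keep every matching column name, joined by ", ".
# Rows with no 'Yes' end up with an empty string.
df["Materials"] = [
    ", ".join(col for col, flag in zip(material_cols, row) if flag)
    for row in is_yes.to_numpy()
]

print(df)
```

With the Materials column, ID 4 reads 'Brick, Vinyl' and ID 5 is an empty string, which keeps more detail than the 'Multiple' and 'Other' labels.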
