How do I create one column based on the names of multiple other columns in Python (pandas)?

Question

Say I have the following data frame:

ID   Brick    Vinyl    Stone
1    Yes      No       No
2    No       Yes      No
3    No       No       Yes
4    Yes      Yes      No
5    No       No       No

How would I create a new column based on the names of these columns so that I ended up with the following?

ID   Brick    Vinyl    Stone    Type
1    Yes      No       No       Brick
2    No       Yes      No       Vinyl
3    No       No       Yes      Stone
4    Yes      Yes      No       Multiple
5    No       No       No       Other

Note that IDs 4 and 5 are either 'Yes' for multiple columns or are all 'No'. The response I have recorded in 'Type' for those two entries doesn't have to be 'Multiple' or 'Other' specifically - if there is a default way of recording the desired information that will work just as well. Thank you!

Answer 1

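You could count the 'Yes' values in each row and pick the label with np.where (this assumes numpy is imported as np and the data frame is named df):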

In [146]: s = df[['Brick', 'Vinyl', 'Stone']].eq('Yes')

In [147]: sm = s.sum(1)

In [148]: df['Type'] = np.where(sm.eq(0), 'Other',
                                np.where(sm.gt(1), 'Multiple', s.idxmax(1)))

In [149]: df
Out[149]:
   ID Brick Vinyl Stone      Type
0   1   Yes    No    No     Brick
1   2    No   Yes    No     Vinyl
2   3    No    No   Yes     Stone
3   4   Yes   Yes    No  Multiple
4   5    No    No    No     Other
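
For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the same idea in pure pandas. The data frame is rebuilt from the question's sample, and `Series.mask` is swapped in for the nested `np.where`. The `Materials` column is a hypothetical extra, not something the question asked for by name: it simply lists every column that is 'Yes', which may be useful if you would rather keep the detail than collapse it to 'Multiple' or 'Other'.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Brick": ["Yes", "No", "No", "Yes", "No"],
    "Vinyl": ["No", "Yes", "No", "Yes", "No"],
    "Stone": ["No", "No", "Yes", "No", "No"],
})

material_cols = ["Brick", "Vinyl", "Stone"]

# Boolean frame: True wherever a cell is 'Yes'.
is_yes = df[material_cols].eq("Yes")

# Number of 'Yes' values in each row.
yes_count = is_yes.sum(axis=1)

# Start from the name of the first 'Yes' column, then overwrite the
# rows that have no 'Yes' at all or more than one 'Yes'.
df["Type"] = (
    is_yes.idxmax(axis=1)
    .mask(yes_count.eq(0), "Other")
    .mask(yes_count.gt(1), "Multiple")
)

# Hypothetical alternative column: every 'Yes' column name, comma-separated
# (an empty string when the row has no 'Yes').
df["Materials"] = is_yes.apply(
    lambda row: ", ".join(col for col, flag in row.items() if flag),
    axis=1,
)

print(df)
```

The `yes_count.gt(1)` test is used rather than an equality check against 2, so a row with 'Yes' in all three columns would also be labelled 'Multiple'.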

1 answer

solution1
3 ACCEPTED 2018-01-16 17:16:46