Python pandas: how to create, for each row, a list of the column names that meet a condition?

I need to apply a function to every row of a DataFrame. I wrote this function, which returns a list of the column names whose value is 1:

def find_column(x):  
    a=[]  
    for column in df.columns:  
        if (df.loc[x,column] == 1):  
            a = a + [column]
    return a

It works if I pass a single index label, for example:

print(find_column(1))

but:

df['new_col'] = df.apply(find_column,axis=1)

does not work. Any idea? Thanks!

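With axis=1, apply passes each row to the function as a Series whose index holds the column names. It does not pass the row's index label, so df.loc[x, column] no longer works. Instead, filter that index by the positions where the value equals 1 and convert the result to a list: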

import pandas as pd

df = pd.DataFrame({
    'A': list('abcdef'),
    'B': [4, 1, 4, 5, 5, 1],
    'C': [7, 1, 9, 4, 2, 3],
    'D': [1, 1, 5, 7, 1, 1],
    'E': [5, 1, 6, 9, 1, 4],
    'F': list('aaabbb'),
})

def find_column(x):
    return x.index[x == 1].tolist()

df['new'] = df.apply(find_column,axis=1)
print (df)
   A  B  C  D  E  F           new
0  a  4  7  1  5  a           [D]
1  b  1  1  1  1  a  [B, C, D, E]
2  c  4  9  5  6  a            []
3  d  5  4  7  9  b            []
4  e  5  2  1  1  b        [D, E]
5  f  1  3  1  4  b        [B, D]
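
For comparison, the loop from the question can be kept almost unchanged once it reads from the row that apply passes in, rather than calling df.loc with it. This is only a minimal sketch on the same example data; the name find_column_loop is made up here and does not appear in the question or the answer:

```python
import pandas as pd

df = pd.DataFrame({
    'A': list('abcdef'),
    'B': [4, 1, 4, 5, 5, 1],
    'C': [7, 1, 9, 4, 2, 3],
    'D': [1, 1, 5, 7, 1, 1],
    'E': [5, 1, 6, 9, 1, 4],
    'F': list('aaabbb'),
})

def find_column_loop(row):
    # With axis=1, `row` is a Series: its index holds the column names
    # and its values hold the cells of that row.
    matches = []
    for column, value in row.items():
        if value == 1:
            matches.append(column)
    return matches

# One list of matching column names per row.
result = df.apply(find_column_loop, axis=1)
```

The vectorized filter x.index[x == 1] does the same work without the explicit loop, which is why the answer prefers it.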

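Another option is to build a Boolean mask with DataFrame.eq, use DataFrame.dot to join the names of the matching columns into one comma-separated string per row, strip the trailing separator, and split the string with Series.str.split: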

df['new'] = df.eq(1).dot(df.columns + ',').str.rstrip(',').str.split(',')
print (df)

   A  B  C  D  E  F           new
0  a  4  7  1  5  a           [D]
1  b  1  1  1  1  a  [B, C, D, E]
2  c  4  9  5  6  a            []
3  d  5  4  7  9  b            []
4  e  5  2  1  1  b        [D, E]
5  f  1  3  1  4  b        [B, D]
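
One detail the printed output hides: for a row with no match, the joined string is empty, and splitting an empty string gives a list holding one empty string rather than an empty list (pandas prints both as []). The sketch below shows one possible way to normalize this on the same example data. The filtering step and the names raw and clean are additions for illustration, not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({
    'A': list('abcdef'),
    'B': [4, 1, 4, 5, 5, 1],
    'C': [7, 1, 9, 4, 2, 3],
    'D': [1, 1, 5, 7, 1, 1],
    'E': [5, 1, 6, 9, 1, 4],
    'F': list('aaabbb'),
})

# Same steps as the one-liner: mask, join matching names, strip, split.
joined = df.eq(1).dot(df.columns + ',').str.rstrip(',')
raw = joined.str.split(',')

# Drop empty strings so rows without a match get a truly empty list.
clean = raw.apply(lambda names: [name for name in names if name])
```

This matters if later code checks len() of each list or tests it for truthiness. The approach also assumes that no column name contains a comma.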
