Pandas map 0/1 data frame entries to column names

I have a pandas data frame whose entries are all either 0 or 1. I would like to reduce it to a single list of strings, one per row, formed by concatenating the names of the columns that hold a 1 in that row.

For a toy example, suppose my data frame is

   V1 V2 V3
   0  1  1
   1  1  0
   0  0  0

I would like a final result that looks like

"V2,V3"
"V1,V2"
""

I initially tried something along the lines of

my_df.apply(lambda x: colnames[x])

thinking it would behave like NumPy's boolean indexing, but it did not do what I wanted. How can I best accomplish this?

Convert the dtype of the df to bool, then call apply and use each row as a boolean mask on the columns. You need to pass axis=1 so that the mask is applied row-wise:

In [47]:
df.astype(bool).apply(lambda x: ','.join(df.columns[x]), axis=1)

Out[47]:
0    V2,V3
1    V1,V2
2         
dtype: object
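
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The data frame is rebuilt from the toy example in the question, and the name labels is only illustrative. Converting the row to a NumPy array before indexing is an extra precaution added here, not part of the original one-liner.

```python
import pandas as pd

# Toy data frame from the question.
df = pd.DataFrame({"V1": [0, 1, 0], "V2": [1, 1, 0], "V3": [1, 0, 0]})

# Cast to bool so each row can act as a mask over the column names,
# then join the selected names with commas, one string per row.
labels = df.astype(bool).apply(
    lambda row: ",".join(df.columns[row.to_numpy()]), axis=1
)

print(labels.tolist())  # ['V2,V3', 'V1,V2', '']
```

A row with no 1s selects no columns, so the join returns an empty string, which matches the desired output.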

Your code my_df.apply(lambda x: colnames[x]) won't work for two reasons. First, calling apply on a df without specifying axis calls the lambda on each column in turn, not on each row. Second, the 1/0 values are interpreted as integer positions rather than as boolean flags.
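
Both points can be checked directly. The sketch below reuses the toy data frame; the variable names are made up for illustration. It records what apply hands to the function under each axis setting, then indexes the column names with the first row as integers and as booleans.

```python
import pandas as pd

df = pd.DataFrame({"V1": [0, 1, 0], "V2": [1, 1, 0], "V3": [1, 0, 0]})

# Default axis=0: the function receives one column at a time.
seen_by_default = df.apply(lambda col: col.name).tolist()
# axis=1: the function receives one row at a time.
seen_row_wise = df.apply(lambda row: row.name, axis=1).tolist()

first_row = df.iloc[0].to_numpy()  # integer array [0, 1, 1]

# Integers are treated as positions: columns 0, 1 and 1 again.
as_positions = list(df.columns[first_row])
# Booleans are treated as a mask: keep only the flagged columns.
as_mask = list(df.columns[first_row.astype(bool)])

print(seen_by_default)  # ['V1', 'V2', 'V3']
print(seen_row_wise)    # [0, 1, 2]
print(as_positions)     # ['V1', 'V2', 'V2']
print(as_mask)          # ['V2', 'V3']
```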
