Pandas: map 0/1 data frame entries to column names
I have a data frame in pandas whose entries are all either 0 or 1. I would like to reduce it to a single list of strings, where each string is the result of concatenating the names of the columns that hold a 1 in that row.
For a toy example, suppose my data frame is
V1 V2 V3
0 1 1
1 1 0
0 0 0
I would like a final result that looks like
"V2,V3"
"V1,V2"
""
I initially tried something along the lines of
my_df.apply(lambda x: colnames[x])
thinking it would behave similarly to how numpy handles boolean indices, but it did not do what I wanted. How should I best accomplish this?
Convert the dtype of the df to bool, then call apply and use each row as a boolean mask on the columns. You need to pass axis=1 to apply so that the mask is applied row-wise:
In [47]:
df.astype(bool).apply(lambda x: ','.join(df.columns[x]), axis=1)
Out[47]:
0 V2,V3
1 V1,V2
2
dtype: object
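For anyone who wants to run this outside an interactive session, here is a minimal self-contained sketch of the same approach. The frame is rebuilt from the toy example in the question; the names mask and result are just illustrative, and the .to_numpy() call is an extra precaution (an assumption on my part, not part of the one-liner above) so that the columns are indexed with a plain boolean array.

```python
import pandas as pd

# Toy data frame from the question: every entry is 0 or 1.
df = pd.DataFrame({"V1": [0, 1, 0], "V2": [1, 1, 0], "V3": [1, 0, 0]})

# Cast to bool so each row can act as a boolean mask over the columns.
mask = df.astype(bool)

# axis=1 hands each row to the lambda; the row's True entries select column names.
result = mask.apply(lambda row: ",".join(df.columns[row.to_numpy()]), axis=1)

print(result.tolist())
```

A row with no 1s selects no columns, so the join yields an empty string for it.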
Your code my_df.apply(lambda x: colnames[x]) won't work for two reasons. Firstly, calling apply on a df without specifying the axis calls the lambda on each column in turn, not on each row. Secondly, the 1/0 values are interpreted as index positions rather than as boolean flags.
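Both points can be checked on the toy data. The sketch below is only an illustration: colnames is assumed to be a numpy array of the column names (the question does not say how it was built), and the other variable names are made up for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"V1": [0, 1, 0], "V2": [1, 1, 0], "V3": [1, 0, 0]})
colnames = np.array(df.columns)  # assumed definition of colnames

# 1) With the default axis, the function receives each column, so .name is a column label.
per_column = df.apply(lambda s: s.name).tolist()
# With axis=1 the function receives each row, so .name is a row label.
per_row = df.apply(lambda s: s.name, axis=1).tolist()

# 2) Integer 0/1 values are treated as positions, not as flags.
first_row = df.iloc[0].to_numpy()               # array([0, 1, 1])
as_positions = colnames[first_row].tolist()     # picks positions 0, 1, 1
as_mask = colnames[first_row.astype(bool)].tolist()  # keeps entries where the flag is True

print(per_column, per_row)
print(as_positions, as_mask)
```

With integer indexing the first row picks V1 once and V2 twice, which is why the cast to bool is needed before masking.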