Pandas: result column by aggregating entire row
I have a pandas dataframe containing tuples of booleans (real value, predicted value) and want to create new columns containing the number of true/false positives/negatives.
I know I could loop through the indices and set the column value for each index after looping through the entire row, but I believe that's a pandas anti-pattern. Is there a cleaner and more efficient way to do this?
This seems to work fine:
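def count_false_positives(row):
    # Count the cells in this row whose tuple is (True, False).
    count = 0
    for el in row.index:
        if row[el][0] and not row[el][1]:
            count += 1
    return count

# Use item assignment: attribute assignment (df.false_positives = ...) does not create a column.
df['false_positives'] = df.apply(count_false_positives, axis=1)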
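The same row-wise idea can be extended to produce all four counts in one pass. The following is only a sketch under stated assumptions: the sample dataframe, the `LABELS` mapping, and the `count_outcomes` helper are hypothetical names introduced for illustration. The label mapping follows the snippet above in treating (True, False) as a false positive; with tuples ordered (real value, predicted value), that pair would conventionally be called a false negative, so swap the labels if your convention differs.

```python
from collections import Counter

import pandas as pd

# Hypothetical sample data: every cell is a tuple of two booleans.
df = pd.DataFrame({
    "a": [(True, True), (True, False), (False, False)],
    "b": [(True, False), (True, False), (False, True)],
    "c": [(False, True), (True, True), (True, False)],
})

# Assumed label mapping (follows the post's reading of (True, False)).
LABELS = {
    (True, True): "true_positives",
    (True, False): "false_positives",
    (False, True): "false_negatives",
    (False, False): "true_negatives",
}

def count_outcomes(row):
    # Tuples are hashable, so a Counter can tally them directly.
    tally = Counter(row.tolist())
    return pd.Series({name: tally.get(pair, 0) for pair, name in LABELS.items()})

# apply(axis=1) with a Series-returning function yields one column per label.
counts = df.apply(count_outcomes, axis=1)
result = df.join(counts)
```

Computing `counts` before joining keeps the new columns out of the rows being counted, so the code can be re-run without the count columns being treated as data.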
Another alternative would be to check the whole dataframe for (True, False) values and sum the number of matches along the columns axis (sum(axis=1)):
df['false_positives'] = df.apply(lambda x: x==(True,False)).sum(axis=1)
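A possible NumPy-based variant of this idea unpacks the tuples into a boolean array first, after which each count is a plain boolean reduction. This is a sketch, not a definitive implementation: it assumes every cell is a 2-tuple of booleans with no missing values, the sample data and variable names are made up for illustration, and the column labels follow the post's reading of (True, False) as a false positive.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: every cell is a tuple of two booleans.
df = pd.DataFrame({
    "a": [(True, True), (True, False), (False, False)],
    "b": [(True, False), (True, False), (False, True)],
    "c": [(False, True), (True, True), (True, False)],
})

# Unpack the tuples into a boolean array of shape (rows, columns, 2).
pairs = np.array(df.to_numpy().tolist(), dtype=bool)
first = pairs[..., 0]   # first tuple element of every cell
second = pairs[..., 1]  # second tuple element of every cell

# Each count is the row-wise sum of a boolean mask.
counts = pd.DataFrame(
    {
        "true_positives": (first & second).sum(axis=1),
        "false_positives": (first & ~second).sum(axis=1),
        "false_negatives": (~first & second).sum(axis=1),
        "true_negatives": (~first & ~second).sum(axis=1),
    },
    index=df.index,
)
```

Because the masks are computed on the whole array at once, no Python-level function is called per row or per cell.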