How to find accuracy between two columns in R

I have two columns: one for the predicted value and another for the true value.
I want to calculate the accuracy between these columns, counting a row as correct when both columns have a missing value.
For example, given

Pred True
1     2
2     2
NA    NA
3     2

The accuracy would be 50%.
Also, how should I do the same thing with character values?

You can do:

pred <- c(1, 2, NA, 3)
true <- c(2, 2, NA, 2)
(sum(pred == true, na.rm = TRUE) + sum(is.na(pred) & is.na(true))) / length(pred)

That is, add the number of positions where pred and true are equal, sum(pred == true, na.rm = TRUE), to the number of positions where both are NA, sum(is.na(pred) & is.na(true)), then divide by the vector length. For the example data this gives 0.5.
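The same counting can be wrapped in a small function so it is easy to reuse for character vectors as well. This is only a sketch: the name accuracy_with_na and the argument name truth are made up for illustration, and it assumes both inputs are plain vectors of equal length (factors would need as.character() first).

```r
# Hypothetical helper: fraction of positions where pred and truth agree,
# treating a pair of NAs as agreement.
accuracy_with_na <- function(pred, truth) {
  stopifnot(length(pred) == length(truth))
  both_na <- is.na(pred) & is.na(truth)
  # pred == truth is NA when either side is NA; the is.na() guards turn
  # those positions into FALSE instead.
  equal <- !is.na(pred) & !is.na(truth) & pred == truth
  mean(equal | both_na)
}

# Numeric example from the question
acc_num <- accuracy_with_na(c(1, 2, NA, 3), c(2, 2, NA, 2))

# Same pattern with character values
acc_chr <- accuracy_with_na(c("a", "b", NA, "c"), c("b", "b", NA, "b"))
```

Both calls give 0.5 here. A position where only one side is NA counts as a miss.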

You could do something like this:

sum(data$Pred == data$True, na.rm = TRUE) / nrow(data) * 100

to get the percentage of rows where the two columns match. It works for integers and strings. The problem is that comparing anything with NA yields NA rather than TRUE, so if both columns are NA in a given row and you consider that an accurate prediction, you need to count those rows separately. For instance, take the intersection of the indices where each column is NA (a union would also count rows where only one column is NA) and add its length to the sum:

s <- sum(data$Pred == data$True, na.rm = TRUE)
na <- length(intersect(which(is.na(data$Pred)), which(is.na(data$True))))
(s + na) / nrow(data) * 100
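A quick check on a small data frame with character columns. The data is made up for illustration, with column names Pred and True as in the question. The fifth row has NA in only one column, so it should count as a miss, not a match.

```r
# Illustrative data; row 3 is NA in both columns, row 5 in Pred only.
data <- data.frame(
  Pred = c("a", "b", NA, "c", NA),
  True = c("b", "b", NA, "b", "b"),
  stringsAsFactors = FALSE
)

# Rows where both values are present and equal
matches <- sum(data$Pred == data$True, na.rm = TRUE)

# Rows where both values are missing
both_na <- sum(is.na(data$Pred) & is.na(data$True))

accuracy_pct <- (matches + both_na) / nrow(data) * 100
```

Here one row matches on value and one row is NA in both columns, so accuracy_pct is 40.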
