How to choose the most voted category from multiple columns in R
I have a classification problem I need to solve using R, but to be honest I have no clue how to do it.
I have a table (see below) where different samples are classified by three ML models (one per column), and I need to choose the "most voted" category for each case and write it to a new column.
Current table
Desired output
I have been reading about categorical variables in R, but nothing seems to fit my specific needs.
Any help would be highly appreciated.
Thanks in advance.
JL
This is not how you ask a question. Please see the relevant thread, and in future provide the data in the form shown below (run dput() on it, then copy and paste the result from the console). At any rate, here is a base R solution:
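# Calculate the modal value of each row: mode => character vector
# ("mode" is excluded as well, because the dput() output below already contains it)
df1$mode <- apply(
  df1[, !colnames(df1) %in% c("samples", "mode"), drop = FALSE],
  1,
  function(x) {
    # Count the votes in this row, sort by count, keep the top category name
    head(
      names(
        sort(
          table(x),
          decreasing = TRUE
        )
      ),
      1
    )
  }
)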
Data:
df1 <- structure(list(samples = c("S1", "D4", "S2", "D1", "D2", "S3",
"D3", "S4"), RFpred = c("Carrier", "Absent", "Helper", "Helper",
"Carrier", "Absent", "Resistant", "Carrier"), SVMpred = c("Absent",
"Absent", "Helper", "Helper", "Carrier", "Helper", "Helper",
"Resistant"), KNNpred = c("Carrier", "Absent", "Carrier", "Helper",
"Carrier", "Absent", "Helper", "Resistant"), mode = c("Carrier",
"Absent", "Helper", "Helper", "Carrier", "Absent", "Helper",
"Resistant")), row.names = c(NA, -8L), class = "data.frame")
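One case the base R snippet above does not flag is a tie: with three models and four categories, all three predictions can differ. As far as I can tell, sort(table(x), decreasing = TRUE) then keeps the alphabetical order that table() produces, so the alphabetically first category wins silently. Below is a minimal sketch of one way to mark such rows instead. The helper vote_or_na and the toy data frame are made up for illustration and are not part of the original answer.

```r
# Hypothetical helper (not from the original answer): returns the most
# frequent value in x, or NA when the highest count is shared.
vote_or_na <- function(x) {
  counts <- table(x)
  winners <- names(counts)[counts == max(counts)]
  if (length(winners) == 1) winners else NA_character_
}

# Small made-up example; the third row has three different votes.
toy <- data.frame(
  RFpred  = c("Carrier", "Absent", "Helper"),
  SVMpred = c("Absent",  "Absent", "Carrier"),
  KNNpred = c("Carrier", "Absent", "Resistant"),
  stringsAsFactors = FALSE
)

# Apply the helper row by row over the prediction columns only.
toy$vote <- apply(toy[, c("RFpred", "SVMpred", "KNNpred")], 1, vote_or_na)
print(toy)
```

Rows left as NA can then be resolved however suits the problem, for example by trusting the model with the best validation score.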
Tidyverse approach:
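library(dplyr)
library(tibble)

# Most frequent non-missing value of a vector
mode_char <- function(x) {
  ux <- unique(na.omit(x))
  ux[which.max(tabulate(match(x, ux)))]
}

# df1 is the data frame from the "Data:" block above
df1 %>%
  as_tibble() %>%
  # Keep only the id and prediction columns (drops the precomputed mode column)
  select(samples, RFpred:KNNpred) %>%
  rowwise() %>%
  mutate(
    Vote = mode_char(c_across(RFpred:KNNpred))
  )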
#> # A tibble: 8 × 5
#> # Rowwise:
#>   samples RFpred    SVMpred   KNNpred   Vote
#>   <chr>   <chr>     <chr>     <chr>     <chr>
#> 1 S1      Carrier   Absent    Carrier   Carrier
#> 2 D4      Absent    Absent    Absent    Absent
#> 3 S2      Helper    Helper    Carrier   Helper
#> 4 D1      Helper    Helper    Helper    Helper
#> 5 D2      Carrier   Carrier   Carrier   Carrier
#> 6 S3      Absent    Helper    Absent    Absent
#> 7 D3      Resistant Helper    Helper    Helper
#> 8 S4      Carrier   Resistant Resistant Resistant
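For anyone wondering how mode_char behaves on awkward rows, here is a small base R check (no packages needed). The input vectors are made up for illustration. Missing predictions are ignored, and when all votes differ the first value in column order wins, because which.max() returns the first maximum.

```r
# Same function as in the answer above, repeated so this snippet runs on its own.
mode_char <- function(x) {
  ux <- unique(na.omit(x))
  ux[which.max(tabulate(match(x, ux)))]
}

# A missing prediction is skipped; the remaining votes decide.
with_na <- mode_char(c("Helper", NA, "Helper"))
print(with_na)    # "Helper"

# Three different votes: the first one in column order is returned.
all_differ <- mode_char(c("Carrier", "Absent", "Resistant"))
print(all_differ) # "Carrier"
```

So in a three-way split the Vote column would simply repeat RFpred, which may or may not be what you want.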