How to choose the most voted category from multiple columns in R

I have a classification problem I need to solve using R, but to be honest I have no idea how to do it.

I have a table (see below) where different samples are classified by three ML models (one per column), and I need to choose the "most voted" category for each case and write it to a new column.

Current table

[image not available]

Desired Output

[image not available]

I have been reading about categorical variables in R, but nothing seems to fit my specific needs.

Any help would be highly appreciated.

Thanks in advance.

JL

This is not how to ask a question. Please see the relevant thread, and in future provide the data in the form shown below (run dput() on the data frame and paste the result from the console). At any rate, here is a base R solution:

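# Majority vote per row: the most frequent label across the prediction columns
# (result is a character vector, one label per row)
df1$mode <- apply(
  df1[, c("RFpred", "SVMpred", "KNNpred")],  # prediction columns only
  1,                                         # apply over rows
  function(x) {
    # Count the labels, sort by frequency, keep the name of the top one
    head(names(sort(table(x), decreasing = TRUE)), 1)
  }
)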
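One thing to watch: with three models and four possible labels, a row can get three different votes, and the code above will still return a label for it. If you would rather flag such rows, here is a minimal sketch. The helper name `majority_or_na` and the toy data are my own, not from the question.

```r
# Hypothetical helper: return the label only if it is the unique most frequent one
majority_or_na <- function(x) {
  counts <- table(x)                    # votes per label
  top <- counts[counts == max(counts)]  # every label tied for first place
  if (length(top) == 1) names(top) else NA_character_
}

# Toy data, made up for illustration
toy <- data.frame(
  RFpred  = c("Carrier", "Absent", "Helper"),
  SVMpred = c("Absent",  "Absent", "Carrier"),
  KNNpred = c("Carrier", "Absent", "Resistant"),
  stringsAsFactors = FALSE
)

toy$vote <- apply(toy[, c("RFpred", "SVMpred", "KNNpred")], 1, majority_or_na)
toy$vote
# The third row has three different labels, so its vote is NA
```

Rows that come back as NA can then be reviewed by hand or resolved with whatever rule suits your problem, for example trusting the model you consider most reliable.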

Data:

df1 <- structure(list(samples = c("S1", "D4", "S2", "D1", "D2", "S3", 
"D3", "S4"), RFpred = c("Carrier", "Absent", "Helper", "Helper", 
"Carrier", "Absent", "Resistant", "Carrier"), SVMpred = c("Absent", 
"Absent", "Helper", "Helper", "Carrier", "Helper", "Helper", 
"Resistant"), KNNpred = c("Carrier", "Absent", "Carrier", "Helper", 
"Carrier", "Absent", "Helper", "Resistant"), mode = c("Carrier", 
"Absent", "Helper", "Helper", "Carrier", "Absent", "Helper", 
"Resistant")), row.names = c(NA, -8L), class = "data.frame")

Tidyverse Approach:

library(dplyr)
library(tibble)

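# Most frequent value of a character vector; ties go to the value seen first
mode_char <- function(x) {
    ux <- unique(na.omit(x))                  # distinct non-missing labels
    ux[which.max(tabulate(match(x, ux)))]     # label with the highest count
}

df1 %>%
    select(-mode) %>%    # drop the base R result stored in the data above
    as_tibble() %>%
    rowwise() %>%
    mutate(
        Vote = mode_char(c_across(RFpred:KNNpred))
    )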

#> # A tibble: 8 × 5
#> # Rowwise: 
#>   samples RFpred    SVMpred   KNNpred   Vote     
#>   <chr>   <chr>     <chr>     <chr>     <chr>    
#> 1 S1      Carrier   Absent    Carrier   Carrier  
#> 2 D4      Absent    Absent    Absent    Absent   
#> 3 S2      Helper    Helper    Carrier   Helper   
#> 4 D1      Helper    Helper    Helper    Helper   
#> 5 D2      Carrier   Carrier   Carrier   Carrier  
#> 6 S3      Absent    Helper    Absent    Absent   
#> 7 D3      Resistant Helper    Helper    Helper   
#> 8 S4      Carrier   Resistant Resistant Resistant
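As a quick sanity check, the sketch below applies `mode_char` row by row in plain base R and compares it with the `table()`-based answer on the data posted above. The two agree here because every row has a majority, but they break ties differently: `mode_char` returns the first label it sees, while sorting a `table()` keeps the alphabetically first label. The variable names (`preds`, `vote_table`, `vote_first`, `tied`) are only for illustration.

```r
# Prediction columns from the data posted above
df1 <- data.frame(
  samples = c("S1", "D4", "S2", "D1", "D2", "S3", "D3", "S4"),
  RFpred  = c("Carrier", "Absent", "Helper", "Helper",
              "Carrier", "Absent", "Resistant", "Carrier"),
  SVMpred = c("Absent", "Absent", "Helper", "Helper",
              "Carrier", "Helper", "Helper", "Resistant"),
  KNNpred = c("Carrier", "Absent", "Carrier", "Helper",
              "Carrier", "Absent", "Helper", "Resistant"),
  stringsAsFactors = FALSE
)
preds <- df1[, c("RFpred", "SVMpred", "KNNpred")]

mode_char <- function(x) {
  ux <- unique(na.omit(x))
  ux[which.max(tabulate(match(x, ux)))]
}

# Vote with each approach
vote_table <- apply(preds, 1, function(x) names(sort(table(x), decreasing = TRUE))[1])
vote_first <- apply(preds, 1, mode_char)
identical(unname(vote_table), unname(vote_first))  # TRUE: no ties in this data

# A three-way tie shows the difference
tied <- c("Resistant", "Absent", "Helper")
tie_table <- names(sort(table(tied), decreasing = TRUE))[1]  # "Absent"
tie_first <- mode_char(tied)                                 # "Resistant"
```

Which behaviour you want depends on whether the column order carries meaning for you, for instance if the first column is the model you trust most.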
