
New column containing string that appears the most in the row

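I am trying to create one column holding the string that appears most often in each row, and a second column holding the number of times that string appears.

To make the question easier to follow, this is what I am trying to achieve: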

My actual DF:

[image not available]

What I would like to get: the most popular category and its count

[image not available]

Example data:

f <- data.frame(ID = 1:4,
                V1 = c("A", "B", "C", "D"),
                V2 = c("A", "B", "D", "B"),
                V3 = c("A", "C", "D", "B"))

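Here is another way, using base R:

# Tabulate each row once. lapply() always returns a list, whereas
# apply(f[, -1], 1, table) collapses to a matrix when every row has
# the same number of distinct values.
tabs <- lapply(seq_len(nrow(f)), function(i) table(unlist(f[i, -1])))
count <- sapply(tabs, max)
count
# [1] 3 2 2 2
# which.max() returns the first maximum, so ties go to the value that sorts first.
category <- sapply(tabs, function(tab) names(tab)[which.max(tab)])
category
# [1] "A" "B" "D" "B"
f2 <- data.frame(f, category, count)
f2
#   ID V1 V2 V3 category count
# 1  1  A  A  A        A     3
# 2  2  B  B  C        B     2
# 3  3  C  D  D        D     2
# 4  4  D  B  B        B     2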
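
Another option, using data.table:

library(data.table)

df <- data.frame(ID = 1:4,
                 V1 = c("A", "B", "C", "D"),
                 V2 = c("A", "B", "D", "B"),
                 V3 = c("A", "C", "D", "B"))
setDT(df)  # convert to a data.table by reference

# Reshape to long format: one row per (ID, variable, value)
other <- melt(df, id.vars = "ID", measure.vars = c("V1", "V2", "V3"))
# Count how often each value occurs within each ID
other <- other[, .N, by = .(ID, value)]
setnames(other, c("ID", "category", "count"))
# Keep the most frequent value per ID; on ties, the value seen first (V1, then V2, ...) wins
other <- other[, .SD[which.max(count)], by = .(ID)]

# Attach the category and count columns to the original table
res <- merge(df, other, by = "ID")
res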
  • We can use dplyr: rowwise() makes mutate() work one row at a time, so table() is applied to V1:V3 within each row.
library(dplyr)

df |>
  rowwise() |>
  mutate(category = names(which.max(table(c_across(V1:V3)))),
         count = max(table(c_across(V1:V3))))
  • Output
# A tibble: 4 × 6
# Rowwise: 
     ID V1    V2    V3    category count
  <int> <chr> <chr> <chr> <chr>    <int>
1     1 A     A     A     A            3
2     2 B     B     C     B            2
3     3 C     D     D     D            2
4     4 D     B     B     B            2

