
How to group factor levels?

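I have a factor column of football position abbreviations, with about 17 unique values and 220 observations. I would like to collapse those 17 unique values into just three factor levels.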

levels(nfldraft$Pos) <- list(Linemen = c("C","OG","OT","TE","DT","DE"),
                             Small_Backs =  c("CB","WR","FS"), 
                             Big_Backs = c("FB","ILB","OLB","P","QB",
                                           "RB","SS","WR"))

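That is what I tried. Printing nfldraft$Pos to the console shows 3 factor levels, but the only values that appear are "Linemen" and "Small_Backs"; all other values are NA. Where am I going wrong?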

I made an example character vector with all of the abbreviations:

my_example <- c("C","OG","OT","TE","DT","DE","CB","WR","FS", 
                "FB","ILB","OLB","P","QB","RB","SS","WR")
class(my_example)

[1] "character"

Then I replaced the abbreviations with the desired level names (you could also use gsub or any of a number of other methods here):

my_example[my_example %in% c("C","OG","OT","TE","DT","DE")] <- "Linemen"
my_example[my_example %in% c("CB","WR","FS")]               <- "Small Backs"
my_example[my_example %in% c("FB","ILB","OLB","P",
                             "QB","RB","SS","WR")]          <- "Big Backs"

Then I turned it into a factor:

my_example <- as.factor(my_example)
head(my_example)
 [1] Linemen Linemen Linemen Linemen Linemen Linemen
 Levels: Big Backs Linemen Small Backs
tail(my_example)
 [1] Big Backs   Big Backs   Big Backs   Big Backs   Big Backs   Small Backs
 Levels: Big Backs Linemen Small Backs
class(my_example)

[1] "factor"
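As one of the "other methods" mentioned above, the replacement can also be done with a named lookup vector. This is only a minimal sketch: the names groups, lookup, positions and grouped are illustrative, and "WR" is listed under a single group here (an assumption, chosen to avoid the ambiguity of listing it twice).

```r
# Illustrative grouping; "WR" is deliberately listed only once (assumption).
groups <- list(
  "Linemen"     = c("C", "OG", "OT", "TE", "DT", "DE"),
  "Small Backs" = c("CB", "WR", "FS"),
  "Big Backs"   = c("FB", "ILB", "OLB", "P", "QB", "RB", "SS")
)

# Named lookup vector: names are abbreviations, values are group names.
lookup <- setNames(rep(names(groups), lengths(groups)),
                   unlist(groups, use.names = FALSE))

positions <- c("C", "OG", "OT", "TE", "DT", "DE", "CB", "WR", "FS",
               "FB", "ILB", "OLB", "P", "QB", "RB", "SS", "WR")

# Index the lookup by abbreviation, then build the factor in one step.
grouped <- factor(unname(lookup[positions]), levels = names(groups))
table(grouped)
```

Any abbreviation missing from the lookup would come out as NA, which makes gaps in the mapping easy to spot.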

This is a good example of why a fully reproducible example is needed. In fact, the OP's code looks like it should work. Taking the example input from @Hack-R:

my_example <- c("C","OG","OT","TE","DT","DE","CB","WR","FS", 
                "FB","ILB","OLB","P","QB","RB","SS","WR")

The OP's original code works as-is:

nfldraft = list(Pos = factor(my_example))
levels(nfldraft$Pos) <- list(
  Linemen = c("C","OG","OT","TE","DT","DE"), 
  Small_Backs =  c("CB","WR","FS"), 
  Big_Backs = c("FB","ILB","OLB","P","QB","RB","SS","WR")
)
table(nfldraft$Pos)
#     Linemen Small_Backs   Big_Backs 
#           6           2           9 

This is fully consistent with the documentation on how to use levels<-:

levels(x) <- value

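value: A valid value for levels(x). ... For the factor method, a vector of character strings with length at least the number of levels of x, or a named list specifying how to rename the levels.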
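The two forms described in that passage can be compared on a small factor. The sketch below uses a toy fruit/vegetable factor rather than the football data; the variable names z1 and z2 are illustrative.

```r
# Toy factor with three levels, four observations each.
z1 <- gl(3, 2, 12, labels = c("apple", "salad", "orange"))
z2 <- z1

# Form 1: character vector, one new name per existing level.
# Repeating a name merges the corresponding levels.
levels(z1) <- c("fruit", "veg", "fruit")

# Form 2: named list, new level name = old levels to fold into it.
levels(z2) <- list(fruit = c("apple", "orange"), veg = "salad")

table(z1)
table(z2)
```

Both assignments should yield the same grouping; the named-list form is usually easier to read when many levels are collapsed.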

So it seems there is some other problem with the OP's input.
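Since the actual data is not shown, one way to look for such a problem is to compare the factor's levels against the mapping. The following is a rough diagnostic sketch only: the input pos is hypothetical (a stray leading space and an unlisted code "K" are invented for illustration), and stray whitespace is just one possible cause.

```r
# Grouping taken from the question; "WR" is listed in two groups.
groups <- list(
  Linemen     = c("C", "OG", "OT", "TE", "DT", "DE"),
  Small_Backs = c("CB", "WR", "FS"),
  Big_Backs   = c("FB", "ILB", "OLB", "P", "QB", "RB", "SS", "WR")
)
mapped <- unlist(groups, use.names = FALSE)

# Hypothetical messy input: a leading space and a code ("K") in no group.
pos <- factor(c(" C", "OG", "WR", "K"))

# Levels present in the data but absent from the mapping.
unmatched <- setdiff(levels(pos), mapped)

# Abbreviations assigned to more than one group.
dupes <- unique(mapped[duplicated(mapped)])

# Trimming whitespace from the levels removes one common source of mismatch.
levels(pos) <- trimws(levels(pos))
still_unmatched <- setdiff(levels(pos), mapped)
```

Here dupes flags "WR", which appears in both Small_Backs and Big_Backs in the question's list; judging from the table output above, both "WR" observations were counted under Big_Backs.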

You can also use the mapvalues() function, which comes from the plyr package (not dplyr), together with dplyr's mutate().

In your example, it would be:

Linemen_levels = c("C","OG","OT","TE","DT","DE")
Small_Backs_levels = c("CB","WR","FS")
Big_Backs_levels = c("FB","ILB","OLB","P","QB","RB","SS","WR")

# mapvalues() is from plyr; %>% and mutate() are from dplyr
nfldraft <- nfldraft %>% mutate(Pos = plyr::mapvalues(Pos,
                 from = c(Linemen_levels, Small_Backs_levels, Big_Backs_levels),
                 to = c(rep('Linemen', length(Linemen_levels)),
                        rep('Small_Backs', length(Small_Backs_levels)),
                        rep('Big_Backs', length(Big_Backs_levels)))))
