
How to collapse levels in a categorical variable in R

I have several categorical variables, each with more than 5 levels, and I want a function that can collapse them into two levels.

column1<- c("bad","good","nice","fair","great","bad","bad","good","nice",
            "fair","great","bad")
column2<- c("john","ben","cook","seth","brian","deph","omar","mary",
            "frank","boss","kate","sall")

df<- data.frame(column1,column2)

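So for the data frame above, I want a function that keeps every "bad" in column1 as "bad" and converts all other levels to "others". I don't know how to go about it. Thanks.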

Use ifelse or case_when:

library(dplyr)
df <- df %>% 
   mutate(column1 = case_when(column1 != "bad" ~ "others", TRUE ~ column1))

Also, since there is only one change to make, we can do

df$column1[df$column1 != "bad"] <- "others"
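Since the question asks for a function, the replacement above can be wrapped in a small helper. This is only a sketch: the name collapse_to_two and its arguments keep and other are made up for illustration and are not from the original answers. It assumes the result should be a two-level factor and that NA values should stay NA.

```r
# Hypothetical helper (not from the original answers): keep one level and
# relabel everything else. Accepts character vectors or factors.
collapse_to_two <- function(x, keep, other = "others") {
  x <- as.character(x)               # factors become plain strings
  x[!is.na(x) & x != keep] <- other  # NA values are left untouched
  factor(x, levels = c(keep, other))
}

# Small demo data frame (a shortened copy of the question's data)
demo <- data.frame(
  column1 = c("bad", "good", "nice", "fair", "great", "bad"),
  column2 = c("john", "ben", "cook", "seth", "brian", "deph")
)

# Apply the helper to every column named in cols (illustrative selection)
cols <- c("column1")
demo[cols] <- lapply(demo[cols], collapse_to_two, keep = "bad")
```

Adding more column names to cols applies the same collapse to each of them, provided they share the level to keep.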

A simple way to do this in base R is to use indexing:

c('others', 'bad')[(df$column1 == 'bad') + 1]
#> [1] "bad"    "others" "others" "others" "others" "bad"    "bad"   
#> [8] "others" "others" "others" "others" "bad"  
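The indexing trick works because a logical vector turns into 0 and 1 when used in arithmetic. A step-by-step sketch on a shortened vector follows; the intermediate names is_bad, idx and labels are illustrative only.

```r
column1 <- c("bad", "good", "nice", "bad")

is_bad <- column1 == "bad"    # logical: TRUE FALSE FALSE TRUE
idx    <- is_bad + 1          # numeric: 2 1 1 2 (TRUE counts as 1, FALSE as 0)
labels <- c("others", "bad")  # position 1 = "others", position 2 = "bad"

result <- labels[idx]
result
#> [1] "bad"    "others" "others" "bad"
```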
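
Another option is to convert column1 to a factor and rename its levels:

df <- data.frame(factor = as.factor(column1), column2)
# Levels sort alphabetically (bad, fair, good, great, nice),
# so the first stays "bad" and the remaining four become "other"
levels(df$factor) <- c("bad", rep("other", 4))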
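The positional assignment above depends on "bad" being the first level and on there being exactly four other levels. A variant that does not rely on level order or count, offered as a sketch on a shortened vector (the name f is illustrative), selects the levels to rename by value:

```r
column1 <- c("bad", "good", "nice", "fair", "great", "bad")
f <- factor(column1)

# Rename every level except "bad"; R merges the duplicated level names
levels(f)[levels(f) != "bad"] <- "other"

levels(f)
#> [1] "bad"   "other"
```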

Here is a dplyr solution with grouping:

library(dplyr)
df %>% 
  group_by(group = cumsum(column1=="bad")) %>% 
  mutate(column1 = ifelse(row_number()==1, "bad", "others")) %>% 
  ungroup() %>% 
  select(-group)

  column1 column2
   <chr>   <chr>  
 1 bad     john   
 2 others  ben    
 3 others  cook   
 4 others  seth   
 5 others  brian  
 6 bad     deph   
 7 bad     omar   
 8 others  mary   
 9 others  frank  
10 others  boss   
11 others  kate   
12 bad     sall   
