
Recode data values in a dataframe into combined values in R

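I am trying to compare marital status, and my variable takes the values "married", "not married", "engaged", "single", and "not married". How can I make this data read only "married" and "not married"? ("engaged" counts as married; "single" and "not married" count as not married.)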

Sample dataset:

# Assign the sample data to df so the later code can refer to it
df <- data.frame(mstatus = sample(x = c("married",
                                        "not married",
                                        "engaged",
                                        "single",
                                        "not married"),
                                  size = 15, replace = TRUE))

This is what I have so far:

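library(dplyr)  # provides %>% and mutate()

# Lower-case the status values so that case differences do not matter
df2 <- df %>%
  mutate(mstatus = tolower(mstatus))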

You can use the mutate() function from dplyr (part of the tidyverse) together with case_when():

df <- df %>% dplyr::mutate(mstatus = case_when(
    mstatus == "married" | mstatus == "engaged"  ~ "married",
    mstatus == "not married" | mstatus == "single" ~ "not married"
))
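One thing to keep in mind with the case_when() call above: a value that matches none of the conditions becomes NA. The following is a minimal sketch, not part of the original answer, that writes the same rules with %in% and makes the fallback explicit. The column name mstatus_new and the misspelled value "nota married" are borrowed from the sample data elsewhere in this thread; the object name recoded is only illustrative.

```r
library(dplyr)

df <- data.frame(
  mstatus = c("married", "not married", "engaged", "single", "nota married"),
  stringsAsFactors = FALSE
)

recoded <- df %>%
  mutate(mstatus_new = case_when(
    mstatus %in% c("married", "engaged")    ~ "married",
    mstatus %in% c("not married", "single") ~ "not married",
    TRUE ~ NA_character_  # explicit fallback: unexpected spellings show up as NA
  ))

# Count the rows that matched none of the rules
sum(is.na(recoded$mstatus_new))
```

Writing into a new column keeps the original values available, so any rows that come back as NA can be inspected and the rules extended if needed.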

I think the simplest base R approach is to use ifelse():

df2$mstatus_new <- ifelse(df2$mstatus=="engaged"|df2$mstatus=="married", "married", "not married")

Data:

df2 <- data.frame(
  mstatus = c("married", "not married", "engaged", "single", "nota married"))
df2
       mstatus
1      married
2  not married
3      engaged
4       single
5 nota married

Result:

df2
       mstatus mstatus_new
1      married     married
2  not married not married
3      engaged     married
4       single not married
5 nota married not married
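Note that the ifelse() call sends every value other than "engaged" or "married" to "not married", which is why the misspelled "nota married" ends up there. If you would rather have unexpected values flagged, one possible base R alternative is a named lookup vector. This is only a sketch under that assumption; the name lookup is hypothetical and not from the original answer.

```r
# Named character vector: names are the raw values, elements are the groups
lookup <- c(
  "married"     = "married",
  "engaged"     = "married",
  "not married" = "not married",
  "single"      = "not married"
)

df2 <- data.frame(
  mstatus = c("married", "not married", "engaged", "single", "nota married"),
  stringsAsFactors = FALSE
)

# Index the lookup by the raw values; unname() drops the carried-over names
df2$mstatus_new <- unname(lookup[df2$mstatus])

# Raw values that are missing from the lookup come back as NA
df2$mstatus[is.na(df2$mstatus_new)]
```

With this variant the last row is NA instead of "not married", so the typo can be corrected or added to the lookup deliberately.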

If we need to recode mstatus, one option is the forcats package:

library(dplyr)
library(forcats)
df2 %>%
  mutate(mstatus = fct_recode(mstatus,
                              married = "engaged",
                              `not married` = "single"))
#      mstatus
#1     married
#2 not married
#3     married
#4 not married
#5 not married

Alternatively, if there are many values to change, use fct_collapse(), which accepts a vector of values for each new level:

df2 %>%
  mutate(mstatus = fct_collapse(mstatus,
                                married = c("engaged"),
                                `not married` = c("single")))
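To show where fct_collapse() helps, here is a small sketch that lists every old level under its new group, so the whole mapping is visible in one place. The vector below is illustrative sample data rather than the asker's real data, and the name collapsed is hypothetical.

```r
library(forcats)

mstatus <- factor(c("married", "not married", "engaged", "single", "not married"))

# Each argument is a new level; its value is the vector of old levels to merge
collapsed <- fct_collapse(
  mstatus,
  married       = c("married", "engaged"),
  `not married` = c("not married", "single")
)

# Inspect the remaining levels and how many rows fall into each
levels(collapsed)
table(collapsed)
```

Listing a level under its own name (for example "married" under married) is optional, but it documents the full grouping and makes later additions easier.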

Data:

df2 <- structure(list(mstatus = structure(c(2L, 3L, 1L, 4L, 3L), .Label = c("engaged", 
"married", "not married", "single"), class = "factor")),
class = "data.frame", row.names = c(NA, 
-5L))

