Recode data values in a dataframe into combined values in R
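I'm trying to compare marital status. My variable has the values "married", "not married", "engaged", "single", and "not married". How can I make the data read only "married" and "not married"? (Engaged counts as married; single and not married count as not married.)
Sample dataset: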
df <- data.frame(mstatus = sample(x = c("married",
                                        "not married",
                                        "engaged",
                                        "single",
                                        "not married"),
                                  size = 15, replace = TRUE))
This is what I have so far:
library(dplyr)

df2 <- df %>%
  mutate(mstatus = tolower(mstatus))
You can use the mutate() function from dplyr (part of the tidyverse) together with case_when():
library(dplyr)

df <- df %>%
  mutate(mstatus = case_when(
    mstatus %in% c("married", "engaged") ~ "married",
    mstatus %in% c("not married", "single") ~ "not married"
  ))
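One caveat with the case_when() call above: a value that matches neither condition becomes NA. The following is a minimal sketch, assuming the raw column may contain mixed case or stray whitespace (an assumption, not something the question states). It normalises the text first and adds an explicit fallback. The input values and the "unknown" label are made up for illustration.

```r
library(dplyr)

# Hypothetical input: mixed case, leading whitespace, and one unexpected value
df <- data.frame(
  mstatus = c("Married", " engaged", "single", "Not married", "widowed"),
  stringsAsFactors = FALSE
)

df <- df %>%
  mutate(
    # Normalise case and strip surrounding whitespace before matching
    mstatus = trimws(tolower(mstatus)),
    mstatus = case_when(
      mstatus %in% c("married", "engaged")    ~ "married",
      mstatus %in% c("not married", "single") ~ "not married",
      TRUE                                    ~ "unknown"  # fallback; placeholder label
    )
  )

df
```

Whether unexpected values should become "unknown", NA, or raise an error depends on the analysis; the fallback only makes that choice visible.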
I think the simplest base R approach is an ifelse statement:
df2$mstatus_new <- ifelse(df2$mstatus=="engaged"|df2$mstatus=="married", "married", "not married")
Data:
df2 <- data.frame(
mstatus = c("married", "not married", "engaged", "single", "nota married"))
df2
       mstatus
1      married
2  not married
3      engaged
4       single
5 nota married
Result:
df2
       mstatus mstatus_new
1      married     married
2  not married not married
3      engaged     married
4       single not married
5 nota married not married
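Note that ifelse() sends every value other than "engaged" or "married" to "not married", which is why the misspelt "nota married" in row 5 is recoded without complaint. If such values should be flagged instead, one possible base R sketch uses a named lookup vector; the name lookup is just an illustrative choice, and the column is assumed to be character rather than factor.

```r
# Same five values as the example data, including the misspelt "nota married"
df2 <- data.frame(
  mstatus = c("married", "not married", "engaged", "single", "nota married"),
  stringsAsFactors = FALSE
)

# Named vector: names are the original values, elements are the recoded values
lookup <- c(
  "married"     = "married",
  "engaged"     = "married",
  "not married" = "not married",
  "single"      = "not married"
)

# Indexing by name yields NA for any value that is missing from the lookup
df2$mstatus_new <- unname(lookup[df2$mstatus])

# Rows that could not be recoded and need attention
df2[is.na(df2$mstatus_new), ]
```

If mstatus is stored as a factor, wrap it in as.character() before indexing; otherwise the lookup is indexed by the factor's integer codes rather than by its labels.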
If we need to recode mstatus, one option is forcats:
library(dplyr)
library(forcats)
df2 %>%
mutate(mstatus = fct_recode(mstatus, married = "engaged",
`not married` = "single"))
# mstatus
#1 married
#2 not married
#3 married
#4 not married
#5 not married
Or, if there are many values to change, use fct_collapse, which can take a vector of values:
df2 %>%
mutate(mstatus = fct_collapse(mstatus, married = c('engaged'),
`not married` = c("single")))
Data:
df2 <- structure(list(mstatus = structure(c(2L, 3L, 1L, 4L, 3L), .Label = c("engaged",
    "married", "not married", "single"), class = "factor")),
    class = "data.frame", row.names = c(NA,
    -5L))
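For comparison, a similar collapse can be done without forcats by assigning a named list to levels(). This is a sketch of a base R alternative, not part of the forcats answer; the factor is rebuilt here so the example runs on its own.

```r
# Factor with the same values and levels as the data above
mstatus <- factor(c("married", "not married", "engaged", "single", "not married"))

# Each list element names a new level and lists the old levels merged into it
levels(mstatus) <- list(
  married       = c("married", "engaged"),
  "not married" = c("not married", "single")
)

# Counts per combined level
table(mstatus)
```

Unlike fct_collapse, which keeps unmentioned levels as they are, this form turns any old level that is not listed into NA, so every level to be kept has to appear in the list.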