Nested ifelse statement with multiple columns
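I have a dataset that includes (among other variables) 5 columns indicating the country the data come from, coded as numbers. I want to create a new variable that gives the country in plain text (e.g. Spain instead of 312).
Here is a sample of the data with only 5 rows and 2 columns, for reproducibility: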
c <- structure(list(CountryAP = structure(c(109, NA, 124, NA, NA), label = "Country of the Child Helpline (Asia Pacific region)", labels = c(Afghanistan = 109, `New Zealand` = 124), class = "haven_labelled"),
CountryEr = structure(c(NA, 313, NA, 287, 278), label = "Country of the Child Helpline (Europe region)", labels = c( Azerbaijan = 278, Finland = 287, Sweden = 313), class = "haven_labelled")), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -5L))
I want to compute a new variable (called Country) that holds the country name for every row, derived from the numeric codes in CountryAP and CountryEr.
I tried this:
c <- c %>% mutate(Country = ifelse(CountryAP == 109, 'Afghanistan', ifelse(CountryAP == 124, 'New Zealand', ifelse(CountryEr == 313, 'Sweden', ifelse(CountryEr == 287, 'Finland', ifelse(CountryEr == 278, 'Azerbaijan','N/A'))))))
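However, although it correctly fills in the rows that have a value in the first variable (CountryAP), it ignores the information in the second variable (CountryEr) and gives me only this: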
CountryAP CountryEr Country
1 109 NA Afghanistan
2 NA 313 NA
3 124 NA New Zealand
4 NA 287 NA
5 NA 278 NA
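When I run only the CountryEr part, it works fine.
Any ideas on how to make the ifelse statement look at the other variable as well?
Any help would be much appreciated!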
Thanks, case_when did indeed solve my problem:
c <- c %>% mutate(Country = case_when(CountryAP == 109 ~ 'Afghanistan',
CountryAP == 124 ~ 'New Zealand',
CountryEr == 313 ~ 'Sweden',
CountryEr == 287 ~ 'Finland',
CountryEr == 278 ~ 'Azerbaijan'))
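A likely explanation for the difference, sketched here with plain numeric vectors instead of the haven_labelled columns (the names `ap`, `er`, `nested` and `cw` are made up for this illustration): comparing `NA` with a number gives `NA`, and `ifelse()` returns `NA` wherever its test is `NA`, so the inner `ifelse()` is never consulted for those rows. `case_when()` treats an `NA` condition as "no match" and moves on to the next condition.

```r
library(dplyr)

# Plain numeric stand-ins for the two columns (assumption: labels stripped)
ap <- c(109, NA, 124, NA, NA)
er <- c(NA, 313, NA, 287, 278)

# The outer test is NA wherever ap is NA ...
ap == 109

# ... and ifelse() propagates that NA instead of falling through to `no`
nested <- ifelse(ap == 109, "Afghanistan",
                 ifelse(er == 313, "Sweden", "N/A"))
nested

# case_when() skips conditions that evaluate to NA and tries the next one
cw <- case_when(
  ap == 109 ~ "Afghanistan",
  er == 313 ~ "Sweden"
)
cw
```

In this reduced example the second row is `NA` under `ifelse()` but `"Sweden"` under `case_when()`, which matches the behaviour seen in the question.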
I can think of two ways to do this. In both cases, you first need to combine the country codes into a single column:
c <- c %>%
mutate(CountryCode = ifelse(is.na(CountryAP), CountryEr, CountryAP))
CountryAP CountryEr CountryCode
<dbl> <dbl> <dbl>
1 109 NA 109
2 NA 313 313
3 124 NA 124
4 NA 287 287
5 NA 278 278
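As a possible alternative for this step, `dplyr::coalesce()` picks the first non-missing value across its arguments. The sketch below uses a small tibble of plain doubles called `codes` (a hypothetical name, with the haven labels dropped as a simplifying assumption), so it may need adjusting for labelled columns.

```r
library(dplyr)

# Hypothetical stand-in for the real data, without haven labels
codes <- tibble(
  CountryAP = c(109, NA, 124, NA, NA),
  CountryEr = c(NA, 313, NA, 287, 278)
)

# coalesce() takes, row by row, the first value that is not NA
codes <- codes %>%
  mutate(CountryCode = coalesce(CountryAP, CountryEr))

codes
```

With five region columns, as in the full dataset described in the question, all five could be passed to the same `coalesce()` call rather than nesting several `ifelse()` calls.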
dplyr::case_when
This function lets us specify multiple conditions without a confusing nested structure:
c <- c %>%
mutate(CountryName = case_when(
CountryCode == 109 ~ 'Afghanistan',
CountryCode == 124 ~ 'New Zealand',
CountryCode == 313 ~ 'Sweden',
CountryCode == 287 ~ 'Finland',
CountryCode == 278 ~ 'Azerbaijan'
))
CountryAP CountryEr CountryCode CountryName
<dbl> <dbl> <dbl> <chr>
1 109 NA 109 Afghanistan
2 NA 313 313 Sweden
3 124 NA 124 New Zealand
4 NA 287 287 Finland
5 NA 278 278 Azerbaijan
Alternatively, you can store the country codes and country names in a separate table and then join it to your main data:
df.countries <- data.frame(
CountryCode = c(109, 124, 313, 287, 278),
CountryName = c('Afghanistan', 'New Zealand', 'Sweden', 'Finland', 'Azerbaijan')
)
CountryCode CountryName
1 109 Afghanistan
2 124 New Zealand
3 313 Sweden
4 287 Finland
5 278 Azerbaijan
c <- c %>%
left_join(df.countries, by = 'CountryCode')
CountryAP CountryEr CountryCode CountryName
<dbl> <dbl> <dbl> <chr>
1 109 NA 109 Afghanistan
2 NA 313 313 Sweden
3 124 NA 124 New Zealand
4 NA 287 287 Finland
5 NA 278 278 Azerbaijan
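Since the columns in the question already carry a `labels` attribute mapping names to codes, the lookup might also be built from that attribute instead of being typed by hand. This is only a base R sketch under the assumption that every code has a label. The vectors below copy the example data without the `haven_labelled` class, and `lookup`, `code` and `CountryName` are names chosen for illustration.

```r
# Example columns with their labels attribute (class omitted on purpose)
CountryAP <- structure(c(109, NA, 124, NA, NA),
                       labels = c(Afghanistan = 109, `New Zealand` = 124))
CountryEr <- structure(c(NA, 313, NA, 287, 278),
                       labels = c(Azerbaijan = 278, Finland = 287, Sweden = 313))

# Named vector: names are country names, values are the numeric codes
lookup <- c(attr(CountryAP, "labels"), attr(CountryEr, "labels"))

# Combine the two code columns, dropping attributes first
code <- ifelse(is.na(CountryAP), as.numeric(CountryEr), as.numeric(CountryAP))

# match() finds each code's position in the lookup; names() gives the country
CountryName <- names(lookup)[match(code, lookup)]
CountryName
```

This keeps the code-to-name mapping in one place, which matters more once all five region columns and their full label sets are involved.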