
Nested ifelse statement with multiple columns

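I have a dataset that includes (among other variables) 5 columns indicating the country the data come from, coded as numbers. I would like to create a new variable that gives the country as plain text (e.g. Spain instead of 312).

Here is a sample of the data with only 5 rows and 2 columns, for reproducibility: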

c <- structure(list(CountryAP = structure(c(109, NA, 124, NA, NA), label = "Country of the Child Helpline (Asia Pacific region)", labels = c(Afghanistan = 109,  `New Zealand` = 124), class = "haven_labelled"), 
           CountryEr = structure(c(NA, 313, NA, 287, 278), label = "Country of the Child Helpline (Europe region)", labels = c( Azerbaijan = 278, Finland = 287, Sweden = 313), class = "haven_labelled")), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -5L))

I want to compute a new variable (called Country) that holds the country name for every numeric code in the variables CountryAP and CountryEr.

I tried this:

c <- c %>%
  mutate(Country = ifelse(CountryAP == 109, 'Afghanistan',
                   ifelse(CountryAP == 124, 'New Zealand',
                   ifelse(CountryEr == 313, 'Sweden',
                   ifelse(CountryEr == 287, 'Finland',
                   ifelse(CountryEr == 278, 'Azerbaijan', 'N/A'))))))

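However, although it correctly fills in the rows that have a value in the first variable (CountryAP), it ignores the information in the second variable (CountryEr) and gives me only the following: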

   CountryAP    CountryEr     Country
1  109          NA            Afghanistan
2  NA           313           NA
3  124          NA            New Zealand
4  NA           287           NA
5  NA           278           NA

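When I run only the CountryEr part, it works fine.

Any idea how to make the ifelse statement look at the other variable as well?

Any help would be greatly appreciated!

Thanks, case_when did indeed solve my problem: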

c <- c %>%   mutate(Country = case_when(CountryAP == 109 ~ 'Afghanistan',
                         CountryAP == 124 ~  'New Zealand',
                         CountryEr == 313 ~ 'Sweden',
                         CountryEr == 287  ~ 'Finland',
                         CountryEr == 278 ~ 'Azerbaijan'))
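
A minimal sketch of why the nested ifelse returned NA while case_when did not, using plain numeric vectors in place of the labelled columns (the names ap and er are made up for illustration). ifelse returns NA wherever its test is NA, so for a row where CountryAP is missing the nested branches that look at CountryEr are never reached; case_when treats an NA condition as not matched and moves on to the next one.

```r
library(dplyr)

# Hypothetical stand-ins for CountryAP and CountryEr
ap <- c(109, NA, 124)
er <- c(NA, 313, NA)

# ifelse() propagates NA from the test, so element 2 never reaches the er check
nested <- ifelse(ap == 109, "Afghanistan",
          ifelse(ap == 124, "New Zealand",
          ifelse(er == 313, "Sweden", "N/A")))

# case_when() skips conditions that evaluate to NA and tries the next one
cw <- case_when(
  ap == 109 ~ "Afghanistan",
  ap == 124 ~ "New Zealand",
  er == 313 ~ "Sweden"
)

print(nested)
print(cw)
```

In the nested version the second element stays NA, which matches the output shown in the question.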

I can think of two ways to do this. For both, you first need to combine the country codes into a single column:

c <- c %>% 
  mutate(CountryCode = ifelse(is.na(CountryAP), CountryEr, CountryAP))

  CountryAP CountryEr CountryCode
      <dbl>     <dbl>       <dbl>
1       109        NA         109
2        NA       313         313
3       124        NA         124
4        NA       287         287
5        NA       278         278
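
As a possible alternative for this step, dplyr::coalesce picks the first non-missing value across its arguments. The sketch below assumes plain numeric columns rather than the labelled ones from the question, and the data frame name codes is made up for illustration.

```r
library(dplyr)

# Hypothetical stand-in for the two code columns (plain numerics, not labelled)
codes <- data.frame(
  CountryAP = c(109, NA, 124, NA, NA),
  CountryEr = c(NA, 313, NA, 287, 278)
)

# coalesce() takes the first non-missing value across its arguments, row by row
codes <- codes %>%
  mutate(CountryCode = coalesce(CountryAP, CountryEr))

print(codes)
```

Since coalesce accepts any number of vectors, the remaining region columns of the full dataset could be passed to the same call instead of nesting further ifelse calls.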

Using dplyr::case_when

This function lets us specify multiple conditions without a confusing nested structure:

c <- c %>% 
  mutate(CountryName = case_when(
    CountryCode == 109 ~ 'Afghanistan',
    CountryCode == 124 ~ 'New Zealand',
    CountryCode == 313 ~ 'Sweden',
    CountryCode == 287 ~ 'Finland',
    CountryCode == 278 ~ 'Azerbaijan'
  ))

  CountryAP CountryEr CountryCode CountryName
      <dbl>     <dbl>       <dbl> <chr>      
1       109        NA         109 Afghanistan
2        NA       313         313 Sweden     
3       124        NA         124 New Zealand
4        NA       287         287 Finland    
5        NA       278         278 Azerbaijan 
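
Codes that match none of the conditions come out as NA. If a placeholder such as the 'N/A' from the original attempt is preferred, a catch-all condition can go last. This is a small sketch with a made-up code (999) and a made-up data frame name (codes):

```r
library(dplyr)

# 999 is a made-up code that has no matching condition
codes <- data.frame(CountryCode = c(109, 313, 999, NA))

codes <- codes %>%
  mutate(CountryName = case_when(
    CountryCode == 109 ~ "Afghanistan",
    CountryCode == 313 ~ "Sweden",
    TRUE ~ "N/A"  # catch-all: used when no earlier condition is TRUE
  ))

print(codes)
```

Conditions are checked in order, so the catch-all has to come after all the specific ones.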

Merging a lookup table

Alternatively, you can store the country codes and country names in a separate table and then merge it into your main data:

df.countries <- data.frame(
  CountryCode = c(109, 124, 313, 287, 278),
  CountryName = c('Afghanistan', 'New Zealand', 'Sweden', 'Finland', 'Azerbaijan')
)

  CountryCode CountryName
1         109 Afghanistan
2         124 New Zealand
3         313      Sweden
4         287     Finland
5         278  Azerbaijan

c <- c %>% 
  left_join(df.countries, by = 'CountryCode')

  CountryAP CountryEr CountryCode CountryName
      <dbl>     <dbl>       <dbl> <chr>      
1       109        NA         109 Afghanistan
2        NA       313         313 Sweden     
3       124        NA         124 New Zealand
4        NA       287         287 Finland    
5        NA       278         278 Azerbaijan 
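
With a lookup table, a code that has no entry ends up with NA as its name after left_join. One way to spot such codes is dplyr::anti_join, which returns the rows that have no match. The sketch below uses made-up names (main, lookup) and a made-up code (999):

```r
library(dplyr)

# 999 is a made-up code with no entry in the lookup table
main <- data.frame(CountryCode = c(109, 313, 999))
lookup <- data.frame(
  CountryCode = c(109, 313),
  CountryName = c("Afghanistan", "Sweden")
)

# left_join() keeps every row of main; unmatched codes get NA as CountryName
joined <- left_join(main, lookup, by = "CountryCode")

# anti_join() returns only the rows of main that have no match in lookup
unmatched <- anti_join(main, lookup, by = "CountryCode")

print(joined)
print(unmatched)
```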

