Is there an R function for nested ifelse functions?

I am trying to combine many variables into a dummy variable indicating whether somebody belongs to one occupation category, using ifelse. I was wondering if there is a function that would simplify this code and make it easier to repeat going forward. For example, my code is currently:

occupation_blue_collar <- ifelse(occupation == "Blue Collar", T,
                          ifelse(occupation == "Blue Collar and Ex-Military", T,
                          ifelse(occupation == "Blue Collar and Non-military Government", T,
                          ifelse(occupation == "Blue Collar and School Student", T,
                          ifelse(occupation == "Blue Collar and University Student", T,
                          ifelse(occupation == "Blue Collar and White Collar", T,
                                 F))))))

I have to do this over many variables and many categories, so I was hoping there was a way to simplify. Thanks!

You could simplify your ifelse statement by using stringr::str_detect in your test expression:

ifelse(str_detect(occupation, "Blue Collar"), TRUE, FALSE)

If you have many variables, then dplyr::case_when would be better:

case_when(str_detect(occupation, "Blue Collar") ~ TRUE,
          str_detect(occupation, "White Collar") ~ TRUE,
          TRUE ~ FALSE)
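As a side note, a pattern test already returns a logical vector, so the ifelse(..., TRUE, FALSE) wrapper is optional. Here is a minimal base R sketch of the same idea using grepl instead of str_detect; the sample vector is made up for illustration and only reuses labels from the question.

```r
# Hypothetical sample data, reusing category labels from the question
occupation <- c("Blue Collar", "Blue Collar and Ex-Military", "White Collar", NA)

# grepl() returns TRUE/FALSE directly; fixed = TRUE matches the literal
# substring rather than treating the pattern as a regular expression
occupation_blue_collar <- grepl("Blue Collar", occupation, fixed = TRUE)

# Note that grepl() reports FALSE (not NA) for missing values
occupation_blue_collar
```

If you need missing occupations to stay missing in the dummy variable, you would have to handle the NA entries separately.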

See case_when, which should meet your needs:

library(dplyr)
mtcars %>% 
  mutate(cg = case_when(carb <= 3 ~ "low",
                        carb > 3  ~ "high"))

occupation=="Blue Collar" | occupation =="Blue Collar and Ex-Military" |...

Where | is the "or" operator.

Since you seem to have a lot of repeated words (e.g. "Blue Collar"), you should look into regex to see if you can automate some of this repetition.
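To illustrate the regex suggestion, here is a small base R sketch. The sample values are invented for the example; the patterns show an anchored match (values that start with "Blue Collar") and an alternation (values that mention either category).

```r
# Hypothetical sample data
occupation <- c("Blue Collar", "Blue Collar and School Student",
                "White Collar", "Ex-Military")

# "^" anchors the pattern to the start of the string
starts_blue <- grepl("^Blue Collar", occupation)

# "|" inside a regex means "either pattern"
blue_or_white <- grepl("Blue Collar|White Collar", occupation)

data.frame(occupation, starts_blue, blue_or_white)
```

Whether an anchored or unanchored pattern is appropriate depends on how the category labels in your data are actually written.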

There are a few different ways to address this, but I think the easiest is to use OR logic in your ifelse statement:

ifelse(occupation == "Blue Collar" | occupation == "Blue Collar and Ex-Military" | occupation == "Blue Collar and Non-military Government" | occupation == "Blue Collar and School Student" | occupation == "Blue Collar and University Student", "T", "F")

However, if you have to do this many times across occupations, there is an even better way. I would create a csv with a header of occupations_blue_collar and fill that column with all the values you want to match. Then read in the csv and use ifelse(occupation %in% df$occupations_blue_collar, "T", "F"). Rinse and repeat for your other occupations!
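The lookup idea can be sketched without a csv by keeping the accepted labels in a named list, one entry per dummy variable. The list name, the second category, and the sample data below are assumptions made up for illustration; in practice the label vectors could come from the csv columns described above.

```r
# Hypothetical sample data
occupation <- c("Blue Collar", "Blue Collar and Ex-Military",
                "White Collar", "School Student")

# Hypothetical lookup: accepted labels for each dummy variable
lookup <- list(
  occupation_blue_collar  = c("Blue Collar", "Blue Collar and Ex-Military"),
  occupation_white_collar = c("White Collar")
)

# Build one logical column per lookup entry with %in%
dummies <- as.data.frame(lapply(lookup, function(labels) occupation %in% labels))
dummies
```

Adding another category then only means adding another entry to the list, rather than writing another chain of comparisons.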

Edit: as @markus has pointed out, if all of the values you want in occupations_blue_collar contain the words "Blue Collar", then ifelse(grepl("Blue Collar", occupation), 'T', 'F') would be the most efficient way to process this. To filter out "Blue Collar and White Collar", you can use ifelse(grepl('Blue Collar', occupation) & occupation != 'Blue Collar and White Collar', 'T', 'F') or ifelse(grepl('Blue Collar', occupation) & !grepl('White Collar', occupation), 'T', 'F').
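One thing to watch with grepl is that it is case-sensitive by default, so the pattern has to match the capitalization in the data unless ignore.case = TRUE is set. A minimal sketch of the "blue collar but not white collar" filter, on invented sample data:

```r
# Hypothetical sample data
occupation <- c("Blue Collar", "Blue Collar and White Collar", "White Collar")

# Contains "blue collar" but not "white collar"; ignore.case = TRUE lets the
# lowercase patterns match the capitalized labels
blue_only <- grepl("blue collar", occupation, ignore.case = TRUE) &
  !grepl("white collar", occupation, ignore.case = TRUE)

blue_only
```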

Edit 2: changing || to | as @wusel suggested.
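For context on why | is the right operator here: | compares vectors element by element, while || is meant for single TRUE/FALSE values (in R 4.3.0 and later it raises an error for longer inputs, and earlier versions only looked at the first element). A short sketch with invented data:

```r
# Hypothetical sample data
occupation <- c("Blue Collar", "White Collar", "Blue Collar and Ex-Military")

# Element-wise OR: one result per element of occupation
is_blue <- occupation == "Blue Collar" |
  occupation == "Blue Collar and Ex-Military"

is_blue
```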
