
Is there a way to add new columns in R based on conditions

Currently using R in Azure. I'm trying to create a new column within my data frame whose values depend on an existing column ("Sum of Pillar").

WithSumIDAPillars <- maml.mapInputPort(1)

WithSumIDAPillars["newcolumn"] <- NA

WithSumIDAPillars$newcolumn <- if (WithSumIDAPillars$Sum of Pillar <5 ="Low";WithSumIDAPillars$Sum of Pillar <=6<=10 ="Medium";WithSumIDAPillars$Sum of Pillar <=11<=16 ="High"

I need to create a new column that meets the following requirements: if the "Sum of Pillar" value is 0-5, the new value is Low; if it is 6-11, Medium; and if it is 11-16, High.


Have you used the dplyr package? Would something like this work?

library("dplyr")

WithSumIDAPillars$newcolumn <- 
  case_when(
    WithSumIDAPillars$`Sum of Pillar` <= 5 ~ "Low",
    WithSumIDAPillars$`Sum of Pillar` <= 11 ~ "Medium",
    WithSumIDAPillars$`Sum of Pillar` <= 16 ~ "High",
    TRUE ~ NA_character_
  )

The case_when() function goes through the cases sequentially until one of the expressions on the left side of the ~ evaluates to TRUE, so the order of the conditions matters, and the last statement (TRUE ~ NA_character_) acts as a default value.
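To make the top-to-bottom evaluation concrete, here is a minimal sketch on a made-up vector (`scores` is a hypothetical stand-in, not data from the question). It compares the intended ordering with a reversed one, where the widest condition comes first and captures every smaller value.

```r
library(dplyr)

# Hypothetical example scores; not data from the question.
scores <- c(3, 5, 6, 11, 12, 16, 20, NA)

# Conditions are checked top to bottom; the first TRUE one decides the label.
labels <- case_when(
  scores <= 5  ~ "Low",
  scores <= 11 ~ "Medium",
  scores <= 16 ~ "High",
  TRUE ~ NA_character_   # default for anything unmatched (here 20 and NA)
)

# With the widest condition first, every value up to 16 is labelled "High".
wrong_order <- case_when(
  scores <= 16 ~ "High",
  scores <= 11 ~ "Medium",
  scores <= 5  ~ "Low",
  TRUE ~ NA_character_
)

data.frame(scores, labels, wrong_order)
```

Printing the data frame side by side should show that only the first ordering separates the three groups.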

Depending on your application, it may be easier to name your column sum_of_pillar, using underscores instead of spaces. That would make it easier to use the pipe ( %>% ) and the mutate() function to write things a little more concisely:

WithSumIDAPillars <- 
  WithSumIDAPillars %>%
  mutate(
    newcolumn = case_when(
      sum_of_pillar <=  5 ~ "Low",
      sum_of_pillar <= 11 ~ "Medium",
      sum_of_pillar <= 16 ~ "High",
      TRUE ~ NA_character_
    )
  )
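The snippet above assumes the column has already been renamed to sum_of_pillar. One possible way to do that is dplyr's rename(), sketched here on a small made-up data frame that stands in for the real WithSumIDAPillars (the values are hypothetical).

```r
library(dplyr)

# Hypothetical stand-in for the data frame from the question.
WithSumIDAPillars <- data.frame(
  `Sum of Pillar` = c(2, 8, 14),
  check.names = FALSE   # keep the space-containing column name as-is
)

# rename() takes new_name = old_name; backticks are needed for the old
# name because it contains spaces.
WithSumIDAPillars <-
  WithSumIDAPillars %>%
  rename(sum_of_pillar = `Sum of Pillar`)

names(WithSumIDAPillars)
```

After this step the column can be referred to without backticks inside mutate().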

To learn more about dplyr, you can visit the website, https://dplyr.tidyverse.org/, or read the (free) book R for Data Science: https://r4ds.had.co.nz/

Hope this helps!

An alternative, perhaps less elegant, method to case_when is using nested if_else statements. Maybe the one advantage is that, because each condition spells out both its lower and upper bound, you don't have to pay as much attention to the order of the statements as you do with case_when.

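library(tidyverse)

# Each if_else() tests one closed range; the column name must be repeated on
# both sides of `&`. Values outside every range fall through to NA.
WithSumIDAPillars %>%
    mutate(new_col = if_else(`Sum of Pillar` >= 0 & `Sum of Pillar` <= 5, "Low",
                             if_else(`Sum of Pillar` >= 6 & `Sum of Pillar` <= 11, "Medium",
                                     if_else(`Sum of Pillar` >= 12 & `Sum of Pillar` <= 18, "High",
                                             NA_character_))))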

NB: there's an overlap between your upper Medium and lower High thresholds (both include 11), so I raised the lower boundary for High to 12.
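Since the overlap comes down to where the interval boundaries sit, another option (not used in either answer, so treat it as a swapped-in technique) is base R's cut(), which makes the boundaries explicit. This is a minimal sketch on a made-up vector; the break points 0, 5, 11 and 18 are assumptions taken from the ranges in the nested if_else version.

```r
# Hypothetical example values; not data from the question.
sum_of_pillar <- c(0, 5, 6, 11, 12, 16, 18, 25)

newcolumn <- cut(
  sum_of_pillar,
  breaks = c(0, 5, 11, 18),        # intervals [0,5], (5,11], (11,18]
  labels = c("Low", "Medium", "High"),
  include.lowest = TRUE,           # close the first interval so 0 counts as "Low"
  right = TRUE                     # each interval includes its upper bound
)

# cut() returns a factor; convert it if a character column is wanted.
as.character(newcolumn)
```

Note one difference from the nested if_else version: a non-integer value such as 5.5 falls into (5,11] here and becomes "Medium", whereas the if_else ranges would leave it as NA. Values outside the breaks, like 25, come back as NA.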

