Is there a way to add new columns to R based on conditions to
Currently using R in Azure. I'm trying to create a new column within my dataframe whose values depend on an existing column ("Sum of Pillar").
WithSumIDAPillars <- maml.mapInputPort(1)
WithSumIDAPillars["newcolumn"] <- NA
WithSumIDAPillars$newcolumn <- if (WithSumIDAPillars$Sum of Pillar <5 ="Low";WithSumIDAPillars$Sum of Pillar <=6<=10 ="Medium";WithSumIDAPillars$Sum of Pillar <=11<=16 ="High"
I need to create a new column that meets the following requirements: if the "Sum of Pillar" value is 0-5 it should be Low, 6-11 Medium, and 11-16 High.
Have you used the dplyr package? Would something like this work?
library("dplyr")

# Conditions are checked top to bottom; the first one that is TRUE wins.
WithSumIDAPillars$newcolumn <-
  case_when(
    WithSumIDAPillars$`Sum of Pillar` <= 5  ~ "Low",
    WithSumIDAPillars$`Sum of Pillar` <= 11 ~ "Medium",
    WithSumIDAPillars$`Sum of Pillar` <= 16 ~ "High",
    TRUE                                    ~ NA_character_
  )
The case_when() function goes through each case sequentially until one of the expressions on the left side of the ~ evaluates to TRUE, so the last statement is used as a default value.
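To see why the order matters, here is a minimal sketch on a made-up vector (scores is a hypothetical stand-in for your column, not your real data). The same conditions are applied twice, once in ascending order and once reversed:

```r
library(dplyr)

# Hypothetical values chosen to hit every branch, plus one out-of-range
# value and one missing value.
scores <- c(0, 5, 6, 11, 12, 16, 20, NA)

# Ascending thresholds: each value gets the first label whose test is TRUE.
ascending <- case_when(
  scores <= 5  ~ "Low",
  scores <= 11 ~ "Medium",
  scores <= 16 ~ "High",
  TRUE         ~ NA_character_
)

# Reversed thresholds: everything up to 16 matches the first test,
# so "Medium" and "Low" are never reached.
reversed <- case_when(
  scores <= 16 ~ "High",
  scores <= 11 ~ "Medium",
  scores <= 5  ~ "Low",
  TRUE         ~ NA_character_
)

print(ascending)
print(reversed)
```

With the ascending order you get Low, Low, Medium, Medium, High, High, NA, NA. With the reversed order the first six values all come out as High. In both cases 20 and the missing value fall through to the default.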
Depending on your application, it may make things easier to name your column sum_of_pillar, using underscores. That would make it easier to use the pipe (%>%) and the mutate() function to write things a little more concisely:
WithSumIDAPillars <-
  WithSumIDAPillars %>%
  mutate(
    newcolumn = case_when(
      sum_of_pillar <= 5  ~ "Low",
      sum_of_pillar <= 11 ~ "Medium",
      sum_of_pillar <= 16 ~ "High",
      TRUE                ~ NA_character_
    )
  )
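If you go the renaming route, one way to do it is dplyr's rename(). This is only a sketch: the three-row data frame below is a hypothetical stand-in for what maml.mapInputPort(1) returns, and I'm assuming the column really is called `Sum of Pillar`.

```r
library(dplyr)

# Hypothetical stand-in for the real input; check.names = FALSE keeps the
# spaces in the column name instead of converting them to dots.
WithSumIDAPillars <- data.frame(
  `Sum of Pillar` = c(3, 8, 14),
  check.names = FALSE
)

# rename() takes new_name = old_name; backticks are needed because the old
# name contains spaces.
WithSumIDAPillars <- WithSumIDAPillars %>%
  rename(sum_of_pillar = `Sum of Pillar`)

print(names(WithSumIDAPillars))
```

After that, the column can be referred to as plain sum_of_pillar inside mutate() without backticks.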
To learn more about dplyr, you can visit the website: https://dplyr.tidyverse.org/ or the (free) R for Data Science book: https://r4ds.had.co.nz/
Hope this helps!
An alternative, perhaps less elegant method to case_when is using nested if_else statements. Maybe the one advantage is that you don't have to pay as much attention to the order of the statements as you do with case_when.
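library(tidyverse)

# Each range is spelled out in full, so the column has to be repeated on
# both sides of the &. NA_character_ keeps the result a character column.
WithSumIDAPillars %>%
  mutate(new_col = if_else(`Sum of Pillar` >= 0 & `Sum of Pillar` <= 5, "Low",
                   if_else(`Sum of Pillar` >= 6 & `Sum of Pillar` <= 11, "Medium",
                   if_else(`Sum of Pillar` >= 12 & `Sum of Pillar` <= 18, "High",
                           NA_character_))))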
NB - there's an overlap between your upper Medium and lower High thresholds so I upped the lower boundary for High to 12.
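To illustrate the point about order, here is a small sketch on a made-up vector (scores and the two helper functions are hypothetical names, not part of your data). Because every range has both a lower and an upper bound, shuffling the branches should not change the result:

```r
library(dplyr)

# Hypothetical values covering each range, one out-of-range value and one NA.
scores <- c(0, 5, 6, 11, 12, 18, 19, NA)

# Branches in the same order as in the answer above.
label_ranges <- function(x) {
  if_else(x >= 0 & x <= 5, "Low",
    if_else(x >= 6 & x <= 11, "Medium",
      if_else(x >= 12 & x <= 18, "High", NA_character_)))
}

# Same ranges, deliberately shuffled.
label_ranges_reordered <- function(x) {
  if_else(x >= 12 & x <= 18, "High",
    if_else(x >= 0 & x <= 5, "Low",
      if_else(x >= 6 & x <= 11, "Medium", NA_character_)))
}

same_result <- identical(label_ranges(scores), label_ranges_reordered(scores))
print(same_result)
```

One thing to watch: with integer-style bounds like these, a non-integer value such as 5.5 falls between the Low and Medium ranges and ends up as NA, which the sequential case_when version would not do.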