ifelse in a mutate function in R
I am trying to add a conditional column using the mutate function in R, but I keep getting an error. The code is straight from the teacher's lecture, yet the error occurs. The LineItem column is of class factor; I am not sure whether that makes a difference. Please advise on what I am missing.
Thank you, Avi
df <- read.csv('ities_short.csv')
colSums(is.na(df))
sl <- str_length(df$LineItem)
avg <- mean(str_length(df$LineItem))
df <- df %>% mutate(LineItem_LongName = ifelse(sl > avg), 1, 0)
Error in ifelse(sl > avg): argument "yes" is missing, with no default
You have placed the ')' in the wrong place, so ifelse received only the condition, while 1 and 0 were passed to mutate instead. The general syntax for ifelse is: ifelse(cond, value if true, value if false)
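library(dplyr)    # provides mutate() and %>%
library(stringr)  # provides str_length()

df <- read.csv('ities_short.csv')
colSums(is.na(df))                 # count missing values per column
sl <- str_length(df$LineItem)      # length of each item name
avg <- mean(sl)                    # average name length
df <- df %>% mutate(LineItem_LongName = ifelse(sl > avg, 1, 0))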
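To see the effect of the parenthesis in isolation, here is a minimal base R sketch. The `items` vector is made up for illustration and merely stands in for df$LineItem; it is not taken from ities_short.csv.

```r
# Hypothetical item names standing in for df$LineItem
items <- c("Fries", "Cheeseburger", "Soda")

len <- nchar(items)   # number of characters in each name
avg <- mean(len)      # average name length

# Misplaced parenthesis: ifelse() would receive only the test,
# so the `yes` argument is missing and R raises the error above
# ifelse(len > avg), 1, 0

# All three arguments inside a single ifelse() call
flag <- ifelse(len > avg, 1, 0)
print(flag)
```

Only "Cheeseburger" is longer than the average, so it is the only element flagged with 1.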
@Nirbhay Singh's answer is correct. However, if you compare two vectors, it's generally better to use dplyr::if_else because it is stricter: it checks that both outcomes have the same type and lets you handle NA values explicitly:
df <- df %>% mutate(LineItem_LongName = if_else(sl > avg, 1, 0))
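A small sketch of that difference, assuming dplyr is installed. The vector `x` and the fill value -1 are made up for illustration.

```r
library(dplyr)

# Hypothetical lengths, one of them missing
x <- c(1, NA, 5)

# Base ifelse() silently coerces mixed outcomes to a common type,
# so this call returns a character vector
mixed <- ifelse(x > 2, "long", 0)

# if_else() requires `true` and `false` to have compatible types,
# so the equivalent call raises an error instead of coercing:
# if_else(x > 2, "long", 0)

# `missing` sets the value used wherever the condition is NA
flag <- if_else(x > 2, 1, 0, missing = -1)
print(flag)
```

Whether an NA should become a separate value or stay NA depends on the analysis; leaving `missing` unset keeps it as NA.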
Don't create separate objects and then use them in the data frame; keep them in the data frame itself. You can remove the columns you don't need later. Moreover, you can do this without ifelse.
library(dplyr)
library(stringr)
df %>%
  mutate(temp = str_length(LineItem),
         LineItem_LongName = as.integer(temp > mean(temp)))
Or in base R:
# as.character() guards against LineItem being a factor, which nchar() rejects
df$temp <- nchar(as.character(df$LineItem))
df <- transform(df, LineItem_LongName = +(temp > mean(temp)))
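Both variants work without ifelse because a comparison already yields a logical vector, and converting a logical to a number maps TRUE to 1 and FALSE to 0. A minimal sketch with made-up lengths:

```r
# Hypothetical name lengths
temp <- c(3, 10, 5)

is_long <- temp > mean(temp)   # logical vector from the comparison

a <- as.integer(is_long)       # explicit conversion
b <- +is_long                  # unary plus also converts logical to integer

print(a)
print(identical(a, b))
```

The unary plus is shorter, while as.integer states the intent more clearly to a reader.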