R: Update Column Based on Text Condition from Another Column
I would like to create a new column in my data frame using a conditional statement along the lines of "if Column_y contains Column_x then 1, else 0".
For example:
Event  Name   Winner     Loser       New Column
1      James  James,Bob  John,Steve  1
1      Bob    James,Bob  John,Steve  1
1      John   James,Bob  John,Steve  0
1      Steve  James,Bob  John,Steve  0
I want New Column to be "if Winner contains Name then 1, else 0".
Keep in mind this is for 100,000 rows and probably 700 unique names. When I try things like
df$NewColumn<-ifelse(grepl(df$Name,df$Winner)==TRUE,1,0)
or variations, I get the "pattern has a length > 1" error.
I think you just want to compare the Name column against the Winner column:
df$NewColumn <- ifelse(df$Name == df$Winner, 1, 0)
Note that because df$Name == df$Winner is already a boolean (logical) expression, you might also be able to simplify to:
df$NewColumn <- df$Name == df$Winner
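As a minimal sketch of what this comparison returns, the snippet below rebuilds the Name and Winner columns of the sample rows from the question (the other columns are left out as a simplifying assumption) and converts the logical result to 1/0 with as.integer. On these rows the exact comparison is FALSE everywhere, because Winner holds a comma-separated list rather than a single name, so this approach only fits data where Winner contains exactly one name.

```r
# Name and Winner columns rebuilt from the sample rows in the question
df <- data.frame(
  Name   = c("James", "Bob", "John", "Steve"),
  Winner = rep("James,Bob", 4),
  stringsAsFactors = FALSE
)

# `==` compares whole strings element by element and returns a logical vector
exact <- df$Name == df$Winner

# as.integer() maps TRUE/FALSE to 1/0
df$NewColumn <- as.integer(exact)
print(df)
```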
Exact string matching works only when Winner holds a single name identical to Name. In your example Winner holds a comma-separated list such as James,Bob, and I am assuming the same is true for the rest of your data.
Implementing the contains condition would look something like this:
library(dplyr)
library(purrr)
# For each row, test whether Name occurs inside Winner (grepl treats Name as a regex)
df <- df %>%
  dplyr::mutate(NewColumn = purrr::map2_dbl(.x = Winner, .y = Name, ~ ifelse(grepl(.y, .x), 1, 0)))
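If you would rather avoid the extra packages, roughly the same row-wise check can be sketched in base R with mapply. The sample data frame is rebuilt from the question, and fixed = TRUE is my own addition so that names are matched literally rather than as regular expressions; treat this as an illustration, not a drop-in replacement.

```r
# Name and Winner columns rebuilt from the sample rows in the question
df <- data.frame(
  Name   = c("James", "Bob", "John", "Steve"),
  Winner = rep("James,Bob", 4),
  stringsAsFactors = FALSE
)

# grepl() uses a single pattern per call, so pair each Name with its own Winner:
# mapply calls grepl(pattern = Name[i], x = Winner[i], fixed = TRUE) for every row
contains <- mapply(grepl, df$Name, df$Winner, MoreArgs = list(fixed = TRUE))

# as.integer() maps TRUE/FALSE to 1/0 and drops the names mapply attaches
df$NewColumn <- as.integer(contains)
print(df)
```

On 100,000 rows this still makes one grepl call per row, so it is unlikely to be faster than a vectorised function.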
Adding an alternate solution with stringr:
library(stringr)
# str_detect is vectorised over both the string and the pattern
df <- df %>%
  dplyr::mutate(NewColumn = ifelse(str_detect(Winner, Name), 1, 0))
Let me know if this works.
PS: str_detect is faster.
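One caveat with the substring-based checks above: a name that is part of a longer name (Rob inside Robert) also counts as contained. If that matters for your data, one possible alternative is to split Winner on commas and test for whole-name membership. The sketch below uses base R only; the fifth row (Rob / Robert,Bob) is hypothetical and added just to show the difference.

```r
# First four rows come from the question; the fifth row is a hypothetical edge case
df <- data.frame(
  Name   = c("James", "Bob", "John", "Steve", "Rob"),
  Winner = c(rep("James,Bob", 4), "Robert,Bob"),
  stringsAsFactors = FALSE
)

# Substring check: "Rob" is found inside "Robert", so the last row is flagged
substring_hit <- as.integer(
  mapply(grepl, df$Name, df$Winner, MoreArgs = list(fixed = TRUE))
)

# Whole-name check: split each Winner string on commas,
# then test whether Name equals one of the resulting pieces
winner_list <- strsplit(df$Winner, ",", fixed = TRUE)
df$NewColumn <- as.integer(
  mapply(function(name, winners) name %in% trimws(winners), df$Name, winner_list)
)
print(df)
```

The two results differ only for the hypothetical row, which is the case worth checking before picking either approach.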