R Column Check if Contains Value from Another Column
Is there a way in R to check whether a value in one column contains a value from another column? In the example below, I am trying to check whether the values in col2 are contained in the values in col1 (independently within each row), but I get a warning message: "argument 'pattern' has length > 1 and only the first element will be used". The Flag column should show "Yes" for the first and last rows and "No" for the 2nd and 3rd rows. Any thoughts on how to resolve this would be greatly appreciated.
col1 <- c("R.S.U.L.C","S.I.W","P.U.E","A.E.N")
col2 <- c("R","U","I","N")
df2 <- data.frame(col1,col2)
df2$Flag <- ifelse(grepl(df2$col2,df2$col1),"Yes","No")
This can be done with a combination of sapply and grepl: loop along the indices of df2$col2 and grepl each value against the corresponding string in df2$col1. Collapsing this into a one-liner is straightforward.
i <- sapply(seq_along(df2$col2), function(i) grepl(df2$col2[i], df2$col1[i]))
df2$Flag <- c("No", "Yes")[i + 1L]
df2
#       col1 col2 Flag
#1 R.S.U.L.C    R  Yes
#2     S.I.W    U   No
#3     P.U.E    I   No
#4     A.E.N    N  Yes
df2$flag <- mapply(grepl, df2$col2, df2$col1)
grepl()'s pattern argument only uses the first element, which is why the original call warns. mapply() avoids this by calling grepl() once per pair of elements. See ?grepl:
"If a character vector of length 2 or more is supplied, the first element is used with a warning."
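As a minimal sketch of how the mapply approach above could be extended to produce the "Yes"/"No" labels the question asks for, the snippet below rebuilds the example data and converts the logical result with ifelse. The variable name hit is only illustrative, and USE.NAMES = FALSE is an optional choice here to keep the result unnamed.

```r
# Rebuild the example data from the question.
df2 <- data.frame(
  col1 = c("R.S.U.L.C", "S.I.W", "P.U.E", "A.E.N"),
  col2 = c("R", "U", "I", "N"),
  stringsAsFactors = FALSE
)

# mapply() calls grepl(pattern, x) once per row, pairing col2[i] with col1[i].
# USE.NAMES = FALSE drops the names mapply() would otherwise copy from col2.
hit <- mapply(grepl, df2$col2, df2$col1, USE.NAMES = FALSE)

# Convert the logical vector into the requested labels.
df2$Flag <- ifelse(hit, "Yes", "No")
df2$Flag
# [1] "Yes" "No"  "No"  "Yes"
```

Because each call receives a single pattern, the "only the first element will be used" warning should no longer appear.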
We can use str_detect, which is vectorized over both pattern and string.
library(dplyr)
library(stringr)
df2 <- df2 %>%
  mutate(Flag = c('No', 'Yes')[1 + str_detect(col1, as.character(col2))])
df2
#       col1 col2 Flag
#1 R.S.U.L.C    R  Yes
#2     S.I.W    U   No
#3     P.U.E    I   No
#4     A.E.N    N  Yes
A tidy implementation using str_detect with ifelse. Note that wrapping the pattern in fixed() ensures literal matching. Otherwise, str_detect defaults to regex, which can cause unexpected behaviour if the pattern column contains characters that are interpreted as regular-expression metacharacters.
library(tidyverse)
df2 <- df2 %>%
  mutate(Flag = ifelse(str_detect(col1, fixed(as.character(col2))), "Yes", "No"))
df2
#       col1 col2 Flag
#1 R.S.U.L.C    R  Yes
#2     S.I.W    U   No
#3     P.U.E    I   No
#4     A.E.N    N  Yes
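To illustrate why fixed() matters, here is a small sketch with made-up data (the vectors strings and patterns are hypothetical, not from the question) in which the pattern is ".", a regex metacharacter that matches any character.

```r
library(stringr)

# Hypothetical data: the pattern "." is a regex metacharacter.
strings  <- c("A.B", "AXB")
patterns <- c(".", ".")

# Interpreted as a regex, "." matches any character, so both rows match.
regex_hit <- str_detect(strings, patterns)
# TRUE TRUE

# Wrapped in fixed(), "." only matches a literal dot.
literal_hit <- str_detect(strings, fixed(patterns))
# TRUE FALSE

data.frame(strings, patterns, regex_hit, literal_hit)
```

With the question's data the two variants agree, since "R", "U", "I" and "N" contain no metacharacters; the difference only shows up when the pattern column can hold characters such as ".", "(" or "*".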