Collapse strings in a vector three times for an OR statement in R
I have a vector with multiple strings:
strings <- c("CD4","CD8A")
and I'd like to output an OR statement to be passed to grep, like so:
"CD4-|-CD4-|-CD4$|CD8A-|-CD8A-|-CD8A$"
and so on for each element in the vector.
Basically, I'm trying to find an exact word in a string made of three dash-separated names (I don't want grep("CD4", ...) to return strings with CD40). This is how I thought of doing it, but I'm open to other suggestions.
Part of my data.frame looks like this:
Genes <- as.data.frame(c("CD4-MyD88-IL27RA", "IL2RG-CD4-GHR","MyD88-CD8B-EPOR", "CD8A-IL3RA-CSF3R", "ICOS-CD40-LMP1"))
colnames(Genes) <- "Genes"
Here is a one-liner:
Genes$Genes[grep(paste0("\\b",strings,"\\b",collapse="|"),Genes$Genes)]
[1] "CD4-MyD88-IL27RA" "IL2RG-CD4-GHR" "CD8A-IL3RA-CSF3R"
It uses word-boundary markers (\\b) to make sure that it matches complete substrings (as the - does not count as part of a word).
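For comparison, the explicit pattern written out in the question can be assembled with paste0, which recycles the fixed pieces across every element of the vector. This is only a sketch on the sample data from the question; the names genes, explicit and boundary are introduced here for illustration.

```r
strings <- c("CD4", "CD8A")
genes <- c("CD4-MyD88-IL27RA", "IL2RG-CD4-GHR", "MyD88-CD8B-EPOR",
           "CD8A-IL3RA-CSF3R", "ICOS-CD40-LMP1")

# Repeat each string three times: at the start, in the middle, at the end
explicit <- paste0(strings, "-|-", strings, "-|-", strings, "$", collapse = "|")
explicit
# [1] "CD4-|-CD4-|-CD4$|CD8A-|-CD8A-|-CD8A$"

# The word-boundary version from the one-liner above
boundary <- paste0("\\b", strings, "\\b", collapse = "|")

# On this sample both patterns select the same rows
genes[grepl(explicit, genes)]
identical(grepl(explicit, genes), grepl(boundary, genes))
```

Note that the first alternative (CD4-) is not anchored, so it would also match a name that merely ends in CD4; the word-boundary pattern does not have that problem.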
I don't know if I understood the question correctly. If I did, the following command will return what you want:
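library(magrittr)  # provides the %>% pipe

# Split each entry on "-" and keep every piece that starts with "CD"
stringr::str_split(Genes$Genes, pattern = "-") %>%
  purrr::map(
    function(data) {
      data[stringr::str_which(data, pattern = "^CD")]
    }
  ) %>%
  unlist()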
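A prefix filter such as ^CD keeps every piece that starts with "CD", so on the sample data it also returns CD8B and CD40. If the goal is the exact matching asked for in the question, one possible variation of the split idea is to compare each piece against strings with %in%. A minimal base R sketch, assuming the sample data from the question (genes, parts and keep are illustrative names):

```r
strings <- c("CD4", "CD8A")
genes <- c("CD4-MyD88-IL27RA", "IL2RG-CD4-GHR", "MyD88-CD8B-EPOR",
           "CD8A-IL3RA-CSF3R", "ICOS-CD40-LMP1")

# Split every entry on the literal dash
parts <- strsplit(genes, "-", fixed = TRUE)

# Keep an entry if any of its pieces is exactly one of the wanted strings
keep <- vapply(parts, function(p) any(p %in% strings), logical(1))

genes[keep]
# [1] "CD4-MyD88-IL27RA" "IL2RG-CD4-GHR"    "CD8A-IL3RA-CSF3R"
```

Because %in% compares whole pieces, no regular-expression escaping is needed even if a gene name contains characters that are special in a pattern.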