
Collapse strings in a vector three times for an OR statement in R

I have a vector with multiple strings:

strings <- c("CD4","CD8A")

and I'd like to output an OR statement to be passed to grep, like so:


and so on for each element in the vector.

Basically, I'm trying to find an exact word in a string of three dash-separated names (I don't want grep("CD4", ...) to return strings containing CD40). This is how I thought of doing it, but I'm open to other suggestions.

Part of my data.frame looks like this:

Genes <- as.data.frame(c("CD4-MyD88-IL27RA", "IL2RG-CD4-GHR","MyD88-CD8B-EPOR", "CD8A-IL3RA-CSF3R", "ICOS-CD40-LMP1"))
colnames(Genes) <- "Genes"
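
The pattern described in the question could be built roughly as follows. This is only a sketch: it assumes that "three times" means one variant for a name at the start of the string, one in the middle, and one at the end. The names `variants` and `pattern3` are illustrative and not from the original post.

```r
strings <- c("CD4", "CD8A")
genes <- c("CD4-MyD88-IL27RA", "IL2RG-CD4-GHR", "MyD88-CD8B-EPOR",
           "CD8A-IL3RA-CSF3R", "ICOS-CD40-LMP1")

# Assumption: three anchored variants per name (start, middle, end of string)
variants <- c(paste0("^", strings, "-"),
              paste0("-", strings, "-"),
              paste0("-", strings, "$"))

# Collapse all variants into a single OR pattern for grep
pattern3 <- paste(variants, collapse = "|")
hits3 <- grep(pattern3, genes, value = TRUE)
```

Because every variant requires a dash or a string anchor on both sides of the name, CD40 and CD8B are not matched by CD4 or CD8A.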

Here is a one-liner:


[1] "CD4-MyD88-IL27RA" "IL2RG-CD4-GHR"    "CD8A-IL3RA-CSF3R"

It uses the word-boundary marker \\b to make sure that it matches complete words only (the - does not count as part of a word).
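
A minimal sketch of the word-boundary approach, in case it helps. The exact one-liner is not reproduced here, so the code below is an assumption about how such a pattern can be assembled with `paste0` and `grep`; the names `genes`, `pattern`, and `matches` are illustrative.

```r
strings <- c("CD4", "CD8A")
genes <- c("CD4-MyD88-IL27RA", "IL2RG-CD4-GHR", "MyD88-CD8B-EPOR",
           "CD8A-IL3RA-CSF3R", "ICOS-CD40-LMP1")

# Wrap each name in \\b ... \\b and join the pieces with "|"
# (assumed construction, not necessarily the answer's exact code)
pattern <- paste0("\\b", strings, "\\b", collapse = "|")

# value = TRUE returns the matching strings instead of their indices
matches <- grep(pattern, genes, value = TRUE)
```

With the sample data this should return the same three strings as the output quoted above, leaving out the CD8B and CD40 entries.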

I'm not sure I understood the question. If I did, the following command will return what you want:

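library(magrittr)  # provides %>%

# Split each entry on "-" and keep every piece that starts with "CD"
stringr::str_split(Genes$Genes, pattern = '-') %>%
  lapply(function(data) {
    data[stringr::str_which(data, pattern = '^CD')]
  }) %>%
  unlist()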
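
The split-based idea can also be used for exact matching without any regular expression. The sketch below uses base R only and assumes the goal is to keep the entries that contain one of the target names as a whole dash-separated piece; `tokens`, `keep`, and `exact` are illustrative names.

```r
strings <- c("CD4", "CD8A")
genes <- c("CD4-MyD88-IL27RA", "IL2RG-CD4-GHR", "MyD88-CD8B-EPOR",
           "CD8A-IL3RA-CSF3R", "ICOS-CD40-LMP1")

# Split each entry into its dash-separated names
tokens <- strsplit(genes, "-", fixed = TRUE)

# Keep an entry if any of its names is exactly one of the targets
keep <- vapply(tokens, function(tok) any(tok %in% strings), logical(1))
exact <- genes[keep]
```

Comparing whole pieces with `%in%` means CD40 can never be mistaken for CD4, at the cost of an explicit split step.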

