
Extract specific words from a string

In R, I need to extract every single word that appears directly after the word "join" in the string below.

db<- c("select *
        FROM a
        left join bd on bd.id=a.id
        left join ca on ca.id=a.id
        left join dc on dc.id=a.id
        where a.names != NULL")

My result should be "bd" "ca" "dc".

What is the best way to do this?

Use stringr::str_match_all() to match your string against a regular expression. Regular expressions have a feature called "capture groups": you can group parts of the pattern with ( and ), and the text matched by each group then shows up separately in the results:

library(stringr)
res <- str_match_all(db, "join ([a-z]+)")
res
[[1]]
     [,1]      [,2]
[1,] "join bd" "bd"
[2,] "join ca" "ca"
[3,] "join dc" "dc"

The result is a list with one element (you can tell from the [[1]] in the output, or by calling str(res)). If you had provided a vector of strings instead of a single string, the list would have one element per input string. Each list element is a character matrix. Each row of the matrix is one place in the input string where the regular expression matched. The first column is always the whole match, and each further column holds one of the capture groups (...) of the regular expression. Therefore, the second column holds what you're looking for.
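To get the plain character vector the question asks for, the second column can be pulled out of the first list element. The following is a minimal sketch that reuses the db string from the question; the names joined, queries and per_query are only illustrative and not part of the original answer.

```r
library(stringr)

db <- c("select *
        FROM a
        left join bd on bd.id=a.id
        left join ca on ca.id=a.id
        left join dc on dc.id=a.id
        where a.names != NULL")

res <- str_match_all(db, "join ([a-z]+)")

# First (and only) list element, second column = text of the capture group
joined <- res[[1]][, 2]
joined

# With several input strings there is one matrix per string,
# so take the second column of each matrix in turn
queries <- c("a join x on 1", "b join y join z")
per_query <- lapply(str_match_all(queries, "join ([a-z]+)"),
                    function(m) m[, 2])
per_query
```

Note that the pattern [a-z]+ only matches lowercase letters, so table names containing digits, underscores or uppercase letters would need a wider character class.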

As mentioned in the comments, see R for Data Science > Strings and stringr documentation > Regular expressions to learn more about the topic.
