
Is there an R function to extract all numbers followed by a specific pattern?

I am working in R. I want to extract all digits between the last blank space and a string pattern ("-APPLE") from each element of a vector. The numbers can be of variable length.

test_string = c("ABC 2-APPLE", "123 25-APPLE", "DEF GHI 567-APPLE", "ORANGE")

The expected result is a vector such as c(2, 25, 567, NA).

See "Regex group capture in R with multiple capture-groups" for an example of using str_match() from the stringr package.

In your case:

> test_string = c("ABC 2-APPLE", "123 25-APPLE", "DEF GHI 567-APPLE")
> 
> library(stringr)
> x <- str_match(test_string, " ([0-9]+)-APPLE$")[,2]
> as.numeric(x)
[1]   2  25 567
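
The session above leaves out the non-matching "ORANGE" element from the question. As a rough sketch, the same pattern can be tried on the full input using only base R (regexec() and regmatches() instead of stringr, which is a substitution on my part and not what the answer used); the variable names m and result are just illustrative:

```r
test_string <- c("ABC 2-APPLE", "123 25-APPLE", "DEF GHI 567-APPLE", "ORANGE")

# Capture the digits that sit between a space and the trailing "-APPLE".
m <- regmatches(test_string, regexec(" ([0-9]+)-APPLE$", test_string))

# Each element of m is c(full match, captured group), or character(0)
# when the pattern did not match; map the latter to NA.
result <- vapply(
  m,
  function(x) if (length(x) == 2) as.numeric(x[2]) else NA_real_,
  numeric(1)
)
result
```

Elements without a match, such as "ORANGE", come back as NA, which lines up with the expected c(2, 25, 567, NA).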

You can also use the "rebus" package, which makes it easy to build the regex patterns you need from readable pieces.

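library(rebus)
library(stringr)  # provides str_extract()

## adjust the lo and hi arguments of dgt() based on your text
rx <- lookbehind(SPACE) %R% dgt(1,5) %R% lookahead("-APPLE")

## str_extract() returns character strings, so convert to numeric
as.numeric(str_extract(test_string, rx))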
