
Extracting text between special characters in R

I have multiple strings as shown below:

filename="numbers [www.imagesplitter.net]-0-0.jpeg"
filename1="numbers [www.imagesplitter.net]-0-1.jpeg"
filename2="numbers [www.imagesplitter.net]-19-9.jpeg"

I want the text that appears between the second "-" and the last period. For the three strings above, that is 0, 1 and 9 respectively.

How do I do this? I am not sure how to detect the second "-" and the last period.

Try

sub('^[^-]*-[^-]*-(\\d+)\\..*$', '\\1', files)
#[1] "0" "1" "9"

or

gsub('^[^-]*-[^-]*-|\\..*$', '', files)
#[1] "0" "1" "9"

data

files <- c(filename, filename1, filename2)
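
To make the first pattern easier to follow, here is a self-contained sketch that annotates each piece of the regex. The strings are copied from the question; the variable names `pattern` and `result` are only illustrative.

```r
# Input strings copied from the question.
files <- c("numbers [www.imagesplitter.net]-0-0.jpeg",
           "numbers [www.imagesplitter.net]-0-1.jpeg",
           "numbers [www.imagesplitter.net]-19-9.jpeg")

# ^[^-]*-   everything up to and including the first "-"
# [^-]*-    everything up to and including the second "-"
# (\\d+)    the digits to keep (capture group 1)
# \\..*$    a literal period and the rest of the string
pattern <- "^[^-]*-[^-]*-(\\d+)\\..*$"

# Replace the whole match with just the captured digits.
result <- sub(pattern, "\\1", files)
print(result)
```

Keep in mind that sub() returns a string unchanged when the pattern does not match, so inputs with a different shape pass through as-is rather than producing an error.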

Try this:

files=c(filename, filename1, filename2)

sub(".*-(.+)\\.jpeg", "\\1", files)

You could also use the regmatches function.

> x <- c("numbers [www.imagesplitter.net]-0-0.jpeg","numbers [www.imagesplitter.net]-0-1.jpeg", "numbers [www.imagesplitter.net]-19-9.jpeg")
> unlist(regmatches(x, gregexpr("^(?:[^-]*-){2}\\K.*(?=\\.)", x, perl=TRUE)))
[1] "0" "1" "9"
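
Since each string contains only one match, regexpr (first match only) may be enough in place of gregexpr. A minimal sketch, assuming the same input vector as above; the names `pattern` and `matches` are illustrative:

```r
x <- c("numbers [www.imagesplitter.net]-0-0.jpeg",
       "numbers [www.imagesplitter.net]-0-1.jpeg",
       "numbers [www.imagesplitter.net]-19-9.jpeg")

# (?:[^-]*-){2}  skip past the first two hyphens
# \\K            discard everything matched so far from the result
# .*(?=\\.)      keep text up to (not including) the last period
pattern <- "^(?:[^-]*-){2}\\K.*(?=\\.)"

# regexpr() finds the first match per string, so regmatches()
# returns a plain character vector and unlist() is not needed.
matches <- regmatches(x, regexpr(pattern, x, perl = TRUE))
print(matches)
```

One caveat with this form: regmatches() drops elements that have no match, so the result can be shorter than the input.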

You could also use the same regex with the str_extract_all function from stringr.

> library(stringr)
> unlist(str_extract_all(x, perl("^(?:[^-]*-){2}\\K.*(?=\\.)")))
[1] "0" "1" "9"

OR

> unlist(str_extract_all(x, perl("(?<=-)[^-.]*(?=\\.)")))
[1] "0" "1" "9"

OR

> unlist(str_extract_all(x, perl(".*-\\K\\d+")))
[1] "0" "1" "9"

I would simply use strsplit to split the strings on "-" and "." and pick the fifth piece:

sapply(strsplit(files, '[-.]'), '[', 5)
# [1] "0" "1" "9"
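
The fixed index 5 depends on how many periods and hyphens appear earlier in the name. As a hedged alternative sketch (not part of the answer above), one could strip the extension first and then take the last hyphen-separated piece. This assumes the wanted text always follows the final "-", which holds for the strings in the question; `stem`, `parts` and `last_piece` are illustrative names.

```r
files <- c("numbers [www.imagesplitter.net]-0-0.jpeg",
           "numbers [www.imagesplitter.net]-0-1.jpeg",
           "numbers [www.imagesplitter.net]-19-9.jpeg")

# Drop the file extension (everything from the last period on).
stem <- tools::file_path_sans_ext(files)

# Split on literal hyphens and keep the final piece of each name.
parts <- strsplit(stem, "-", fixed = TRUE)
last_piece <- vapply(parts, function(p) p[length(p)], character(1))
print(last_piece)
```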

You can try

sub("^[^-]+-[^-]+-(.*)\\.[^\\.]*$", "\\1", c(filename, filename1, filename2))
#[1] "0" "1" "9"
