
Extract the consecutive words from the middle of a string in R

Ok, so I'm new to regular expressions and my brain is about to fry. What I would like to do is extract words two and three from a string. For example:

sentence <- "Certified 2017 Mazda CX-5 AWD Touring"
TheFunction(sentence)

should return "2017 Mazda"

My initial attempt is using something like:

sub("\\s\\S+\\s\\S+\\s", "\\1", sentence)

but it is failing miserably. My idea is to find the first match of the pattern "space-word-space-word-space".

You can use strsplit and then paste the second and third words together:

paste(strsplit(sentence, split = '\\s')[[1]][2:3], collapse = " ")
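For completeness, here is a minimal sketch that wraps the strsplit approach in a function and shows one way the regex attempt could be made to work. The sub() call in the question uses "\\1" without any capture group in the pattern, and it only replaces the matched part, so the rest of the sentence stays in the result. The version below matches the whole string and captures just words two and three. The name TheFunction comes from the question; TheFunctionRegex and the from/to arguments are hypothetical names added for illustration.

```r
# Sketch only: TheFunction is the name used in the question;
# the from/to arguments are assumptions added for illustration.
TheFunction <- function(sentence, from = 2, to = 3) {
  # Split on runs of whitespace so repeated spaces do not produce empty "words".
  words <- strsplit(trimws(sentence), "\\s+")[[1]]
  paste(words[from:to], collapse = " ")
}

# Regex alternative (hypothetical helper name): match the entire string,
# capture words two and three in group 1, and replace everything with that group.
TheFunctionRegex <- function(sentence) {
  sub("^\\S+\\s+(\\S+\\s+\\S+).*$", "\\1", sentence)
}

sentence <- "Certified 2017 Mazda CX-5 AWD Touring"
TheFunction(sentence)       # "2017 Mazda"
TheFunctionRegex(sentence)  # "2017 Mazda"
```

Note that if the string has fewer than three words, the strsplit version yields NA entries and the regex version returns the input unchanged, so you may want to check the word count first.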
