Extract consecutive words from the middle of a string in R

Question

OK, so I'm new to regex and my brain is about to fry. What I would like to do is extract the second and third words from a string. For example:

sentence <- "Certified 2017 Mazda CX-5 AWD Touring"
TheFunction(sentence)

should return "2017 Mazda"

My initial attempt is using something like:

sub("\\s\\S+\\s\\S+\\s", "\\1", sentence)

but it is failing miserably. My idea is to find the first pattern that matches "space-word-space-word-space".

Answer 1

You can use strsplit and then paste the second and third words together:

paste(strsplit(sentence, split = '\\s')[[1]][2:3], collapse = " ")
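If you would rather stay close to the regex in the question, note that "\\1" only works when the pattern contains a capture group in parentheses, and that sub() replaces just the matched part, so the pattern has to cover the whole string. Here is a minimal sketch of both approaches in base R. The helper name extract_words and its from/to arguments are made up for illustration, and it assumes words are separated by whitespace and that the string has at least `to` words:

```r
sentence <- "Certified 2017 Mazda CX-5 AWD Touring"

# Hypothetical helper (the name and arguments are illustrative only):
# split on runs of whitespace, then paste the requested words together.
extract_words <- function(x, from = 2, to = 3) {
  words <- strsplit(trimws(x), "\\s+")[[1]]
  paste(words[from:to], collapse = " ")
}

by_split <- extract_words(sentence)

# Regex-only variant: skip the first word, capture the next two words
# in group 1, and match the rest of the string so sub() drops it.
by_regex <- sub("^\\S+\\s+(\\S+\\s+\\S+).*$", "\\1", sentence)

print(by_split)
print(by_regex)
```

Both should give "2017 Mazda" for the example sentence. Splitting on "\\s+" instead of "\\s" avoids empty "words" when there are repeated spaces. If the string has fewer than three words, the regex will not match and sub() returns the input unchanged, so you may want to check for that case.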