
R: Replacing multiple regex matches with part of each match

I am trying to shorten some regex matches in strings. Here is an example:

vYears = c('Democrat 2000-2004',
                 'Democrat 2004-2008',
                 'Democrat 2008-2012',
                 'Republican 2000-2004',
                 'Republican 2004-2008',
                 'Republican 2008-2012',
                 'Tossup')

I can match the expression that I want, and get the matches, like so

grepYears = gregexpr('20[0-9]{2}', vYears)
regmatches(vYears, grepYears)

However, I am trying to shorten the strings to

vYearsShort = c('Democrat 00-04',
           'Democrat 04-08',
           'Democrat 08-12',
           'Republican 00-04',
           'Republican 04-08',
           'Republican 08-12',
           'Tossup')

How can I achieve this?

You could use gsub. Wrap the part you want to keep in a capture group and refer to it with a backreference in the replacement:

> vYears = c('Democrat 2000-2004',
+                  'Democrat 2004-2008',
+                  'Democrat 2008-2012',
+                  'Republican 2000-2004',
+                  'Republican 2004-2008',
+                  'Republican 2008-2012',
+                  'Tossup')
> vYearsShort = gsub("20([0-9]{2})", "\\1", vYears)
> vYearsShort
[1] "Democrat 00-04"   "Democrat 04-08"   "Democrat 08-12"   "Republican 00-04"
[5] "Republican 04-08" "Republican 08-12" "Tossup"          
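
If you would rather stay with the gregexpr/regmatches approach from the question, the replacement form of regmatches can write shortened matches back into the strings. This is only a sketch: the names m and vYearsShort2 are made up for illustration, and it assumes every match is a four-digit year, so characters 3 to 4 are the part to keep.

```r
vYears <- c('Democrat 2000-2004',
            'Democrat 2004-2008',
            'Democrat 2008-2012',
            'Republican 2000-2004',
            'Republican 2004-2008',
            'Republican 2008-2012',
            'Tossup')

# Locate every four-digit year starting with "20" (same pattern as in the question)
m <- gregexpr('20[0-9]{2}', vYears)

# Work on a copy so the original vector stays untouched
vYearsShort2 <- vYears

# Replace each match with its last two characters;
# elements without a match (e.g. 'Tossup') are left as they are
regmatches(vYearsShort2, m) <- lapply(regmatches(vYearsShort2, m),
                                      function(y) substr(y, 3, 4))

print(vYearsShort2)
```

For this particular task gsub with a capture group is shorter, but the regmatches form can be handy when the replacement has to be computed from each match.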

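Alternatively, you can match the whole string with an anchored regex (written here as it appears inside an R string literal, so the backslashes are doubled):

^(\\w+\\s)20(\\d{2}-)20(\\d{2})$

and replace it with:

\\1\\2\\3

for each string in your vector. The $1$2$3 form is the equivalent in regex engines that use $ for backreferences; it does not work in R's sub/gsub. Do not double the backslashes a second time in R either: \\\\1 in the replacement inserts a literal backslash followed by 1. Strings that do not match the pattern, such as 'Tossup', are left unchanged.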
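
A minimal sketch of this anchored pattern applied with sub, using the vYears vector from the question. Because the pattern is anchored with ^ and $, it can match at most once per string, so sub is enough here; choosing sub over gsub is my assumption, not something the answer specifies.

```r
vYears <- c('Democrat 2000-2004',
            'Democrat 2004-2008',
            'Democrat 2008-2012',
            'Republican 2000-2004',
            'Republican 2004-2008',
            'Republican 2008-2012',
            'Tossup')

# Group 1: the party word and the following space
# Group 2: last two digits of the first year plus the hyphen
# Group 3: last two digits of the second year
# The two literal "20" prefixes sit outside the groups, so they are dropped
vYearsShort <- sub("^(\\w+\\s)20(\\d{2}-)20(\\d{2})$", "\\1\\2\\3", vYears)

print(vYearsShort)
```

Compared with the unanchored gsub pattern, this version only rewrites strings that have exactly the "word 20xx-20xx" shape and leaves everything else alone.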

