
Improve pattern matching in R using grep

Please help me write a regex pattern in R.

string1 <- "ccjar_neutral v_neutral vaux_neutral nnp_neutral prn_neutral v_neutral inj_neutral"
pattern="\\bv+\\_+[a-z]+\\s+[a-z]+\\_+[a-z]{1,10}\\b"
grep(pattern,string1)

The pattern above matches a v_ token followed by any next word, not only when the next word is "vaux". Please help me write a pattern that matches v_neutral only when it is followed by vaux_neutral. Also, please explain the purpose of {} in a pattern.

You can use a lookahead, (?=...). In R, lookaheads are only supported when you pass perl = TRUE.

v_neutral(?=\\s+vaux_neutral)

(?=\\s+vaux_neutral) : looks ahead to check that one or more whitespace characters followed by vaux_neutral come next, without consuming them

v_neutral : matches the literal v_neutral, but only if the lookahead condition is satisfied
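
Here is a minimal sketch of how this could be run against the string from the question. The \\b word boundaries and the variable names (has_match, starts, matched, len_ok) are my own additions for illustration, not part of the original pattern. The last lines also touch on the {} part of the question: {m,n} is a quantifier meaning "between m and n repetitions of the preceding element", so [a-z]{1,10} matches 1 to 10 lowercase letters.

```r
string1 <- "ccjar_neutral v_neutral vaux_neutral nnp_neutral prn_neutral v_neutral inj_neutral"

# Lookahead is a PCRE feature, so perl = TRUE is required.
# \\b (added here as an assumption) keeps the match on whole tokens.
pattern <- "\\bv_neutral(?=\\s+vaux_neutral\\b)"

# TRUE if at least one v_neutral is directly followed by vaux_neutral
has_match <- grepl(pattern, string1, perl = TRUE)

# Start position and text of every match; the lookahead part is not
# included in the matched text
positions <- gregexpr(pattern, string1, perl = TRUE)
starts <- as.integer(positions[[1]])
matched <- regmatches(string1, positions)[[1]]

print(has_match)
print(starts)
print(matched)

# {m,n} quantifier: [a-z]{1,10} accepts 1 to 10 lowercase letters
len_ok <- grepl("^[a-z]{1,10}$", c("neutral", "averyverylongword"))
print(len_ok)
```

Only the first v_neutral should be reported, since the second one is followed by inj_neutral. Note that grep() returns the index of the matching vector element rather than the matched text, which is why regmatches() is used here.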

