
Regex - Match Last Occurrence

I have a text file full of names, and I want to match them all with a regex.

Each name is followed by the text fsa fwb fcc, e.g.:

">Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc

I want to use the following expression to match the names:

""">.+?""fsa fwb fcc"

In other words, match all text from "> up to fsa fwb fcc ; I can then strip the excess from the match myself.

However, because "> occurs throughout the file, the match starts much earlier than I want. I have always wondered how to match from the LAST occurrence of something (in this case "> ) up to the specified end.
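The behaviour described above can be reproduced with a short Python sketch. Python's re module, the variable names, and the leading ">sometext prefix are assumptions made for illustration, and the pattern is a simplified form of the one in the question.

```python
import re

# Hypothetical line: '">sometext' stands in for an earlier '">' in the file.
text = r'">sometext">Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc'

# The lazy quantifier only keeps the END of the match short. The engine
# still scans left to right and accepts the earliest possible START,
# so the match begins at the first '">', not the last one.
lazy = re.search(r'">.+?fsa fwb fcc', text)
print(lazy.start())   # index where the match begins
print(lazy.group(0))  # the whole line, including 'sometext'
```

A lazy quantifier controls how far a match extends, not where it begins, which is why it cannot by itself select the last "> .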

You can try this:-

.+((fsa|fwb|fcc).+)$

.+ matches one or more characters in front.

((fsa|fwb|fcc) matches and captures one of the keywords.

.+) matches the characters that follow; the outer group captures the keyword together with them.

$ matches the end of the line.
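To check what this expression captures, here is a minimal sketch that runs it on the sample line from the question. Python's re module and the variable names are assumptions; the thread does not say which engine this answer was written for.

```python
import re

# Sample line from the question, kept verbatim as a raw string.
line = r'">Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc'

m = re.search(r'.+((fsa|fwb|fcc).+)$', line)

# Group 1 is the outer group (keyword plus trailing text),
# group 2 is the keyword alternative that matched.
outer, keyword = m.group(1), m.group(2)
print(repr(outer), repr(keyword))
```

Because the leading .+ is greedy and the inner .+ needs at least one character, the groups land on the last keyword that still has text after it, so on this input they hold the tail of the line rather than the name.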

EDIT:- As suggested by m.buettner, RegexOptions.RightToLeft should work for your case.
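RegexOptions.RightToLeft is a .NET option, and Python's standard re module has no equivalent. As a rough stand-in, and not the technique named in the edit above, the sketch below uses a greedy prefix so that the engine backtracks to the last "> on each line. The sample line and names are hypothetical.

```python
import re

# Hypothetical line with an earlier '">' before the one we care about.
line = r'">sometext">Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc'

# The greedy '.*' first swallows the whole line, then backtracks until
# the rest of the pattern fits, which happens at the LAST '">' that is
# still followed by 'fsa fwb fcc'.
last = re.search(r'.*">(.*?)fsa fwb fcc', line)
print(last.group(1))
```

This only works line by line, since . does not match a newline by default, and group 1 still contains the trailing markup after the name.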

Description

It looks like your ending string is literally fsa fwb fcc , and the substring you're interested in starts directly after the last "> before that ending string.

This expression will:

  • find the substring between the last "> and the next fsa fwb fcc

">((?:(?!">).)*)fsa\sfwb\sfcc


Sample Text

">sometext">A Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc
">sometext">B Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc
">sometext">C Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc

Matches Found:

[0][0] = ">A Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc
[0][1] = A Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"

[1][0] = ">B Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc
[1][1] = B Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"

[2][0] = ">C Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc
[2][1] = C Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"
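The matches listed above can be reproduced with a short Python sketch. Python's re module and the variable names are assumptions, and the pattern is written with single-backslash \s as a raw regex.

```python
import re

# The three sample lines, kept verbatim as raw strings.
sample = "\n".join([
    r'">sometext">A Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc',
    r'">sometext">B Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc',
    r'">sometext">C Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc',
])

# (?:(?!">).)* consumes one character at a time, but only while the
# next two characters are not '">'. A match that starts at an earlier
# '">' therefore cannot cross a later one and fails, leaving only the
# match that starts at the last '">' before the end string.
pattern = re.compile(r'">((?:(?!">).)*)fsa\sfwb\sfcc')

captures = [m.group(1) for m in pattern.finditer(sample)]
for c in captures:
    print(c)
```

The negative lookahead is re-checked before every character, which makes this slower than a plain .* on long inputs but keeps the match from spanning more than one "> .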

Or

If you want to go further and capture only from the last "> up to the first \ that follows it (i.e. the actual name and not the markup text), then have a look at this expression:

">((?:(?!">).)*?)\\(?:(?!">).)*fsa\sfwb\sfcc


Sample Text

">sometext">A Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc
">sometext">B Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc
">sometext">C Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc

Matches Found

[0][0] = ">A Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc
[0][1] = A Dave Smith

[1][0] = ">B Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc
[1][1] = B Dave Smith

[2][0] = ">C Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc
[2][1] = C Dave Smith
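The name-only variant can be sketched in Python as well. As before, the re module and the variable names are assumptions; the pattern uses \\ for a literal backslash and \s for whitespace.

```python
import re

# The three sample lines, kept verbatim as raw strings.
sample = "\n".join([
    r'">sometext">A Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc',
    r'">sometext">B Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc',
    r'">sometext">C Dave Smith\u0012\/a>\u0012\/div>\u0012div class=\"fsa fwb fcc',
])

# Group 1 is lazy, so it stops at the first literal backslash after the
# last '">'. The second tempered run then skips the markup up to the
# end string without ever crossing another '">'.
name_pattern = re.compile(r'">((?:(?!">).)*?)\\(?:(?!">).)*fsa\sfwb\sfcc')

names = [m.group(1) for m in name_pattern.finditer(sample)]
print(names)
```

This relies on every name being followed directly by a backslash, as in the sample text; a name that itself contains a backslash would be cut short.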


 