
What regex to use in C# to start matching from a word BEHIND (matching backwards...) until a match?

Say I have this HTML:

<a href="http://google.com">this is a search engine</a>

How can I look for "engine" and match everything before it until "this" is reached?

I know I can use this.*?engine, but that matches from left to right, i.e. "ahead". Here I want to read backwards, if that is possible at all.

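You could reverse all the strings and perform a normal search:

using System.Linq;
using System.Text.RegularExpressions;

string text = @"<a href=""http://google.com""> this is a search engine </a>";
string engine = "engine";
string strThis = "this";

// Reverse the input and both words, match left to right, then reverse the match back.
// ".+?" is lazy, so the match stops at the first "this" reached when reading backwards.
string result = new string(
  Regex.Match(
    new string(text.Reverse().ToArray()),
    new string(engine.Reverse().ToArray()) + ".+?" + new string(strThis.Reverse().ToArray()))
 .Value
 .Reverse()
 .ToArray());
// result is "this is a search engine"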

Also, to make the code clearer, you could define an extension method on string that reverses the string and returns a string instead of an IEnumerable<char>.
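A minimal sketch of such an extension method, assuming the names ReverseString and MatchBackwards (both hypothetical, not from any library). It reverses UTF-16 code units, so it is only safe for text without surrogate pairs or combining marks. The two words are passed through Regex.Escape after reversing, in case they contain regex metacharacters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StringExtensions
{
    // Hypothetical helper: returns the characters of s in reverse order as a string.
    // Reverses UTF-16 code units, so surrogate pairs and combining marks are not preserved.
    public static string ReverseString(this string s)
    {
        char[] chars = s.ToCharArray();
        Array.Reverse(chars);
        return new string(chars);
    }
}

public static class ReverseSearchDemo
{
    // Hypothetical helper: starts at the last occurrence of `end` and reads
    // backwards until the nearest preceding `start`; returns "" when nothing matches.
    public static string MatchBackwards(string text, string start, string end)
    {
        string pattern = Regex.Escape(end.ReverseString())
                       + ".+?"
                       + Regex.Escape(start.ReverseString());
        return Regex.Match(text.ReverseString(), pattern).Value.ReverseString();
    }

    public static void Main()
    {
        string text = @"<a href=""http://google.com""> this is a search engine </a>";
        Console.WriteLine(MatchBackwards(text, "this", "engine")); // this is a search engine
    }
}
```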

First, always parse HTML with a dedicated tool; see "What is the best way to parse html in C#?" for possible options.

Once the HTML is parsed you can get plain text to run your regex against.
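As a rough illustration of getting the plain text first, the sketch below uses XElement from System.Xml.Linq. This is an assumption for the sake of a dependency-free example: it only works when the fragment happens to be well-formed XML, and XElement.Parse throws on typical real-world HTML, which is why a dedicated HTML parser is still the better choice. ExtractText is a hypothetical helper name.

```csharp
using System;
using System.Xml.Linq;

public static class PlainTextDemo
{
    // Hypothetical helper: returns the concatenated text content of a
    // well-formed XML fragment. Throws XmlException on malformed markup.
    public static string ExtractText(string markup)
    {
        return XElement.Parse(markup).Value;
    }

    public static void Main()
    {
        string html = "<a href=\"http://google.com\">this is a search engine</a>";
        Console.WriteLine(ExtractText(html)); // this is a search engine
    }
}
```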

You can still use your this.*?engine regex, but enable the RegexOptions.RightToLeft option, possibly combined with RegexOptions.Singleline so that . matches any character, including newlines, between the two words:

// Regex.Match never returns null, so .Value can be read directly (it is empty when nothing matches).
var result = Regex.Match(text, @"this.*?engine", RegexOptions.Singleline | RegexOptions.RightToLeft).Value;


As per the documentation, the Regex.RightToLeft property, which reflects this option:

Gets a value that indicates whether the regular expression searches from right to left.

C# demo:

var text = "blah blah this is a this search engine blah";
// With RightToLeft, the engine finds "engine" first and reads back to the nearest "this".
var result = Regex.Match(text, @"this.*?engine",
        RegexOptions.Singleline | RegexOptions.RightToLeft).Value;
Console.WriteLine(result); // => this search engine
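To make the effect of the option easier to see, here is a small sketch that runs the same pattern on the same input in both directions. Find and DirectionDemo are hypothetical names used only for this comparison.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DirectionDemo
{
    // Hypothetical helper: runs this.*?engine with or without RightToLeft.
    public static string Find(string text, bool rightToLeft)
    {
        RegexOptions options = RegexOptions.Singleline;
        if (rightToLeft)
        {
            options |= RegexOptions.RightToLeft;
        }
        return Regex.Match(text, @"this.*?engine", options).Value;
    }

    public static void Main()
    {
        string text = "blah blah this is a this search engine blah";
        // Left to right: starts at the leftmost "this".
        Console.WriteLine(Find(text, false)); // this is a this search engine
        // Right to left: starts at "engine" and stops at the nearest "this".
        Console.WriteLine(Find(text, true));  // this search engine
    }
}
```

The left-to-right search anchors on the first "this" it meets, so the lazy quantifier cannot shorten the match from the left; the right-to-left search anchors on "engine" instead.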


 