Positive lookbehind and lookahead failing when using plus sign as separator

I have the following string:

city-Rio de Janeiro+Sao Paulo+Belo Horizonte

and I'm using the following regex to try to capture the city names:

(?<=city\-|\+)(?<city>[a-zA-Z\s+\-]+)(?=\+|$)

Unfortunately, the regex above returns one big match, like this:

Rio de Janeiro+Sao Paulo+Belo Horizonte

If I change the separator in both the source string and the regex, everything works fine, but I would like to use the plus sign as the separator. How can I do that?

It matches that much because a + inside a character class (the square brackets) matches a literal '+', so the class consumes the separators as well. Remove it:

(?<=city-|\+)(?<city>[a-zA-Z\s-]+)(?=\+|$)

and you'd get 3 matches:

  1. Rio de Janeiro
  2. Sao Paulo
  3. Belo Horizonte

And a small C# test on Ideone:

using System;
using System.Text.RegularExpressions;

class Example
{
   static void Main()
   {
      string text = "city-Rio de Janeiro+Sao Paulo+Belo Horizonte";
      // No '+' inside the character class, so each match stops at the separator.
      string pat = @"(?<=city-|\+)(?<city>[a-zA-Z\s-]+)(?=\+|$)";

      Regex r = new Regex(pat);
      Match m = r.Match(text);

      // Walk through every match and print the named group "city".
      while (m.Success)
      {
         Console.WriteLine("found: '" + m.Groups["city"].Value + "'");
         m = m.NextMatch();
      }
   }
}

produced the following output:

found: 'Rio de Janeiro'
found: 'Sao Paulo'
found: 'Belo Horizonte'
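
To see the effect of the stray + in the character class, the question's pattern and the corrected one can be run side by side. This is only a sketch: the class name `CompareClasses` and the helper `Cities` are made up for illustration, and the two patterns are copied from the posts above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CompareClasses
{
    public const string Text = "city-Rio de Janeiro+Sao Paulo+Belo Horizonte";

    // Pattern from the question: '+' is listed inside the class, so it is consumed as data.
    public const string WithPlus = @"(?<=city\-|\+)(?<city>[a-zA-Z\s+\-]+)(?=\+|$)";

    // Corrected pattern: the class no longer contains '+', so a match ends at each separator.
    public const string WithoutPlus = @"(?<=city-|\+)(?<city>[a-zA-Z\s-]+)(?=\+|$)";

    // Hypothetical helper: returns the value of the named group "city" for every match.
    public static string[] Cities(string pattern, string text)
    {
        return Regex.Matches(text, pattern)
                    .Cast<Match>()
                    .Select(m => m.Groups["city"].Value)
                    .ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" | ", Cities(WithPlus, Text)));
        Console.WriteLine(string.Join(" | ", Cities(WithoutPlus, Text)));
    }
}
```

The first line printed should be the single long match from the question, and the second line the three city names separated by " | ".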

Also note that the - does not need to be escaped at the end of a character class or outside a character class.
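
The note about the hyphen can be checked the same way. The sketch below assumes nothing beyond `Regex.IsMatch`; the class name `HyphenDemo` and the helper `Accepts` are hypothetical names chosen for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HyphenDemo
{
    // Hypothetical helper: true when the pattern finds a match in the input.
    public static bool Accepts(string pattern, string input)
    {
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        // Unescaped hyphen at the end of the class is a literal '-'.
        Console.WriteLine(Accepts(@"^[a-z-]$", "-"));      // True
        // The escaped form means the same thing.
        Console.WriteLine(Accepts(@"^[a-z\-]$", "-"));     // True
        // Between two characters the hyphen builds a range, so '-' itself is rejected.
        Console.WriteLine(Accepts(@"^[a-c]$", "-"));       // False
        // Outside a character class the hyphen is an ordinary character.
        Console.WriteLine(Accepts(@"^city-Rio$", "city-Rio")); // True
    }
}
```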
