Regex get index of first character not inside double quotes

I am looking for a regex to do the following:

  • Return the index of the first instance of a given character not inside double quotes (I can guarantee a matching closing double quote will always be present and the character to search for will never itself be a double quote)
  • Allow starting from an int startIndex position

Speed is one of my primary concerns here, so the number of iterations should be as small as possible.

Examples (all examples look for !, but this might not always be the case):

  • !something should return 0
  • ! should also return 0
  • something! should return 9
  • "something!" should fail
  • "some!thing"! should return 12
  • !"!"! should return 0
  • ""! should return 2
  • ""!"" should return 2
  • !something with startIndex == 2 should fail
  • !something! with startIndex == 2 should return 10 (despite starting at position 2, the index of the character in the given string is still 10)

Since this is for .NET, the intention is to use Regex.Match().Index (unless a better alternative is provided).

I suggest a good old for loop instead of regular expressions; let's implement it as an extension method:

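  public static partial class StringExtensions {
    // Returns the index of the first toFind at or after startPosition that lies
    // outside a quoted section, or -1 if there is none.
    public static int IndexOfQuoted(this string value,
                                    char toFind,
                                    int startPosition = 0,
                                    char quotation = '"') {
      if (string.IsNullOrEmpty(value))
        return -1;

      bool inQuotation = false;

      // Always scan from 0 so the quote state is known when startPosition is reached.
      for (int i = 0; i < value.Length; ++i)
        if (inQuotation)
          inQuotation = value[i] != quotation;   // a closing quote ends the quoted section
        else if (value[i] == toFind && i >= startPosition)
          return i;
        else
          inQuotation = value[i] == quotation;   // an opening quote starts one

      return -1;
    }
  }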

You can then call IndexOfQuoted as if it were a method of string:

  string source = "something!";
  int result = source.IndexOfQuoted('!'); 
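
The startIndex cases from the question can be checked the same way. The following is only a minimal sketch: the loop is restated as a plain static method so the snippet compiles on its own, and the class name QuotedSearchDemo is an assumption made for illustration.

```csharp
using System;

public static class QuotedSearchDemo
{
    // Restated from the extension method above (same logic, non-extension form).
    public static int IndexOfQuoted(string value, char toFind,
                                    int startPosition = 0, char quotation = '"')
    {
        if (string.IsNullOrEmpty(value))
            return -1;

        bool inQuotation = false;

        // The scan starts at 0 even when startPosition > 0, because the quote
        // state at startPosition depends on every quote that precedes it.
        for (int i = 0; i < value.Length; ++i)
        {
            if (inQuotation)
                inQuotation = value[i] != quotation;
            else if (value[i] == toFind && i >= startPosition)
                return i;
            else
                inQuotation = value[i] == quotation;
        }

        return -1;
    }

    public static void Main()
    {
        Console.WriteLine(IndexOfQuoted("!something", '!', 2));        // -1
        Console.WriteLine(IndexOfQuoted("!something!", '!', 2));       // 10
        Console.WriteLine(IndexOfQuoted("!\"some!thing\"!", '!', 5));  // 13
    }
}
```

The returned index is always relative to the whole string, not to startPosition, which matches the last example in the question.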

Demo:

  string[] tests = new string[] {
    "!something",
    "!",
    "something!",
    "\"something!\"",
    "\"some!thing\"!",
    "!\"!\"!",
    "\"\"!",
    "\"\"!\"\"",
  };

  string report = string.Join(Environment.NewLine, tests
    .Select(test => $"{test,-20} -> {test.IndexOfQuoted('!')}"));

  Console.Write(report);

Outcome:

!something           -> 0
!                    -> 0
something!           -> 9
"something!"         -> -1
"some!thing"!        -> 12
!"!"!                -> 0
""!                  -> 2
""!""                -> 2

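If you really need a regex version, you could use a pattern as follows. The lookahead only accepts a match when the rest of the string contains balanced double quotes, which (given the guarantee of matching closing quotes) means the matched character is outside any quoted section.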

"(?<searchTerm>!)(?=(?:[^\"]|\"[^\"]*\")*$)"

For example, for the input

var input = new []
    {
    new {Key= "!something", BeginIndex=0},
    new {Key= "!", BeginIndex=0},
    new {Key= "something!", BeginIndex=0},
    new {Key= "\"something!\"", BeginIndex=0},
    new {Key= "\"some!thing\"!", BeginIndex=0},
    new {Key= "!\"!\"!", BeginIndex=0},
    new {Key= "\"\"!", BeginIndex=0},
    new {Key= "\"\"!\"\"", BeginIndex=0},
    new {Key= "!something", BeginIndex=2},
    new {Key= "!something!", BeginIndex=2},
    new {Key= "!\"some!thing\"!", BeginIndex=5}
    };

You can search for the index as follows

var pattern = "(?<searchTerm>!)(?=(?:[^\"]|\"[^\"]*\")*$)";
Regex regex = new Regex(pattern,RegexOptions.Compiled);
foreach(var str in input)
{
    var index = str.Key.GetIndex(regex,str.BeginIndex);
    Console.WriteLine($"String:{str.Key} , Index : {index}");
}

Where GetIndex is defined as

public static class Extension
{
    public static int GetIndex(this string source,Regex regex,int beginIndex=0)
    {
        var match = regex.Match(source);
        while(match.Success)
        {   

            if(match.Groups["searchTerm"].Index >= beginIndex)
                return match.Groups["searchTerm"].Index;

            match = match.NextMatch();
        }
        return -1;
    }
}
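
Because the lookahead only inspects the text to the right of each candidate, the search does not need to begin at position 0. As a possible alternative to the NextMatch loop, here is a sketch that uses the Regex.Match(string, int) overload to start at beginIndex directly. The names RegexStartAt and GetIndexFrom are hypothetical, and the approach relies on the question's guarantee that every opening quote has a matching close.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexStartAt
{
    // Same pattern as in the answer above.
    private static readonly Regex Pattern =
        new Regex("(?<searchTerm>!)(?=(?:[^\"]|\"[^\"]*\")*$)", RegexOptions.Compiled);

    // Hypothetical variant of GetIndex: the engine begins scanning at beginIndex.
    public static int GetIndexFrom(string source, int beginIndex = 0)
    {
        // Regex.Match(string, int) throws for a start position outside the string.
        if (beginIndex < 0 || beginIndex > source.Length)
            return -1;

        Match match = Pattern.Match(source, beginIndex);

        // Match.Index is relative to the whole input, not to beginIndex.
        return match.Success ? match.Index : -1;
    }

    public static void Main()
    {
        Console.WriteLine(GetIndexFrom("!something", 2));         // -1
        Console.WriteLine(GetIndexFrom("!something!", 2));        // 10
        Console.WriteLine(GetIndexFrom("!\"some!thing\"!", 5));   // 13
    }
}
```

This skips the candidates before beginIndex instead of matching and discarding them, though each remaining candidate still triggers a lookahead that runs to the end of the string.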

Output

String:!something , Index : 0
String:! , Index : 0
String:something! , Index : 9
String:"something!" , Index : -1
String:"some!thing"! , Index : 12
String:!"!"! , Index : 0
String:""! , Index : 2
String:""!"" , Index : 2
String:!something , Index : -1
String:!something! , Index : 10
String:!"some!thing"! , Index : 13
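
The pattern above hard-codes !, while the question notes that the character to search for may vary. One way to generalise it, sketched here under the assumption that the character is never a double quote, is to build the pattern with Regex.Escape so that characters such as ? or . are treated literally. The names QuotedPattern and For are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuotedPattern
{
    // Hypothetical helper: builds the answer's pattern for an arbitrary character.
    // Assumes toFind is never a double quote, as stated in the question.
    public static Regex For(char toFind)
    {
        // Escape regex metacharacters so the character is matched literally.
        string escaped = Regex.Escape(toFind.ToString());
        string pattern = "(?<searchTerm>" + escaped + ")(?=(?:[^\"]|\"[^\"]*\")*$)";
        return new Regex(pattern, RegexOptions.Compiled);
    }

    public static void Main()
    {
        Regex regex = For('?');
        Match match = regex.Match("\"why?\" really?");

        // The ? at index 4 is inside quotes, so the first match is at index 13.
        Console.WriteLine(match.Success ? match.Index : -1);  // 13
    }
}
```

Constructing a compiled Regex has a noticeable one-time cost, so in speed-sensitive code it is probably worth caching one instance per search character rather than rebuilding it on every call.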

Hope that helps.
