
Display all possible matches for a regex pattern

I have the following regex pattern to match certain 3-digit phone number exchanges:

(?:2(?:04|[23]6|[48]9|50)|3(?:06|43|65)|4(?:03|1[68]|3[178]|50)|5(?:06|1[49]|79|8[17])|6(?:0[04]|13|39|47)|7(?:0[59]|78|8[02])|8(?:[06]7|19|73)|90[25])

It looks pretty daunting, but it only yields around 40 or 50 numbers. Is there a way in C# to generate all numbers that match this pattern? Offhand, I know I can loop through the numbers 001 through 999 and check each one against the pattern, but is there a cleaner, built-in way to just generate a list or array of matches?

i.e. {"204","226","236",...}

No, there is no off-the-shelf tool in the framework that enumerates all matches for a given regex pattern. With the built-in classes, brute force is the only way to test the pattern.

Update

It is unclear why you are using (?: ), the "match but don't capture" (non-capturing) group. It is used to anchor a match: for example, take the text phone:303-867-5309, where we don't care about the phone: prefix but do want the number.

The pattern used would be

(?:phone\:)(\d{3}-\d{3}-\d{4}) 

which would match the whole line, but the capture returned would be just the phone number 303-867-5309. It is the second entry in the match's groups (group 1), because group 0 always holds the whole match.

So (?: ), as mentioned, is used to anchor a match capture at a specific point, with the text it matches thrown away.
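To make the difference between the whole match and the captured group concrete, here is a minimal sketch using the sample text and pattern from above. It assumes a project that supports top-level statements (C# 9 or later); the variable names are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample input taken from the explanation above.
string input = "phone:303-867-5309";

// (?:phone\:) is matched but not stored; (\d{3}-\d{3}-\d{4}) becomes capture group 1.
Match match = Regex.Match(input, @"(?:phone\:)(\d{3}-\d{3}-\d{4})");

Console.WriteLine(match.Value);            // the whole match, including the "phone:" prefix
Console.WriteLine(match.Groups[1].Value);  // only the captured phone number
Console.WriteLine(match.Groups.Count);     // group 0 (whole match) plus one capture group
```

Replacing (?:phone\:) with (phone\:) would turn the prefix into a capture group of its own, which would push the phone number to Groups[2].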

With that said, I have redone the first part of your pattern (the 2 and 3 series) with comments and a test up to 2000:

using System;
using System.Linq;
using System.Text.RegularExpressions;

string pattern = @"
^                            # Start at beginning of line so no mid number matches erroneously found
   (
       2(04|[23]6|49|[58]0)  # 2 series only match 204, 226, 236, 249, 250, 280
     |                       # Or it is not 2, then match:
       3(06|43|65)           # 3 series only match 306, 343, 365
    )
$                            # Further Anchor it to the end of the string to keep it to 3 numbers";

// RegexOptions.IgnorePatternWhitespace allows us to put the pattern over multiple lines and comment it. Does not
//     affect regex parsing/processing.

var results = Enumerable.Range(0, 2000) // Test to 2000 so we don't get any non 3 digit matches.
                        .Select(num => num.ToString().PadLeft(3, '0'))
                        .Where (num => Regex.IsMatch(num, pattern, RegexOptions.IgnorePatternWhitespace))
                        .ToArray();

Console.WriteLine ("These results found {0}", string.Join(", ", results));

// These results found 204, 226, 236, 249, 250, 280, 306, 343, 365

I took the advice of @LucasTrzesniewski and just looped through the possible values. Since I know I'm dealing with 3-digit numbers, I looped through the strings "000" through "999" and checked for matches like this:

private static void FindRegExMatches(string pattern)
{
    for (var i = 0; i < 1000; i++)
    {
        var numberString = i.ToString().PadLeft(3, '0');
        if (!Regex.IsMatch(numberString, pattern)) continue;

        Console.WriteLine("Found a match: {0}", numberString);
    }
}
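Since the question asked for a list or array rather than console output, the same loop can be sketched so that it returns its matches. This is only one possible variant, not the code from the answers: it reuses the name FindRegExMatches for a hypothetical version that returns a List<string>, wraps the question's pattern in ^ and $ as an added assumption so that only whole 3-character strings match, and assumes top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// The exchange pattern from the question, wrapped in ^...$ (an assumption added here).
string pattern = @"^(?:2(?:04|[23]6|[48]9|50)|3(?:06|43|65)|4(?:03|1[68]|3[178]|50)|5(?:06|1[49]|79|8[17])|6(?:0[04]|13|39|47)|7(?:0[59]|78|8[02])|8(?:[06]7|19|73)|90[25])$";

List<string> matches = FindRegExMatches(pattern);
Console.WriteLine("{0} matches: {1}", matches.Count, string.Join(", ", matches));

// Hypothetical variant of FindRegExMatches that returns the matches instead of printing them.
static List<string> FindRegExMatches(string pattern)
{
    // Build the Regex once rather than re-parsing the pattern on every iteration.
    var regex = new Regex(pattern);

    return Enumerable.Range(0, 1000)                            // 0 .. 999
                     .Select(i => i.ToString().PadLeft(3, '0')) // "000" .. "999"
                     .Where(s => regex.IsMatch(s))
                     .ToList();
}
```

Calling matches.ToArray() gives the array form from the question.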
