
Regular expression to recognize (X, Y, … and Z) in C#?

Given the set of input strings below:

  • one, two, and three
  • one, two, three, and four
  • one, two, three, four and five

(... and so on for the N + 1 cases)

How can I construct a regular expression that recognizes phrases like these for any number of nouns and returns each comma-delimited noun, plus the final noun that follows the "and" conjunction, as a separate capture group? If that is not possible, what approach would you use to parse and capture input such as this? I'm using the C# Regex object for parsing.

Note that the nouns here are just sample data (one, two, three, four, five, etc.), and the spaces following the commas might be omitted. Also, the nouns might be multi-word phrases separated by spaces.

Bonus round: can you also cleanly recognize the non-comma-delimited cases, for example "one" and "one and two", in the same expression with capturing?

Try this regex:

\\b((?!and)\\w+)\\b

Regex Demo: http://regex101.com/r/kC5rR2

You can also check the result at RegexPal.
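
To try the pattern from C# rather than in an online tester, a minimal sketch could look like the one below. The class and method names (`WordMatcher`, `Words`) are made up for illustration. The pattern is written in a verbatim string, so the backslashes are single, and it adds a `\b` after `and` inside the lookahead; that is an adjustment on my part, assumed to be the intent, so that words which merely start with "and" (such as "android") are not skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WordMatcher
{
    // The answer's pattern in a verbatim string. The extra \b after "and"
    // is an assumption, added so only the whole word "and" is excluded.
    private static readonly Regex Word = new Regex(@"\b((?!and\b)\w+)\b");

    // Returns every word in the input except the conjunction "and".
    public static List<string> Words(string input)
    {
        var words = new List<string>();
        foreach (Match m in Word.Matches(input))
        {
            words.Add(m.Groups[1].Value);
        }
        return words;
    }

    public static void Main()
    {
        foreach (string w in Words("one, two, three, and four"))
        {
            Console.WriteLine(w);
        }
    }
}
```

Because `\w+` matches a single word at a time, a multi-word noun such as "thirty one" comes back as two separate matches with this approach.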


Try this. However, I can't get rid of the "," in the final match of the form "two, and three":

(?<word>\w+,* and \w+)|(?<word>(?<=^|,\s?|and )\w+)
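
One possible way around the stray comma is to lean on a .NET-specific feature: a named group that matches several times keeps every match in its `Captures` collection. The sketch below is an alternative pattern, not the one from this answer, and the names `NounList`, `ListPattern` and `Parse` are hypothetical. It assumes nouns never contain commas and that the whole input is a single list.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NounList
{
    // All three parts reuse the group name "noun", so every noun ends up
    // in the same capture collection, in order of appearance.
    private static readonly Regex ListPattern = new Regex(
        @"^\s*(?<noun>[^,]+?)" +                    // first noun
        @"(?:\s*,\s*(?!and\b)(?<noun>[^,]+?))*" +   // ", noun" repeated
        @"(?:\s*,?\s*\band\s+(?<noun>[^,]+?))?" +   // optional "[,] and noun"
        @"\s*$",
        RegexOptions.IgnoreCase);

    // Returns the nouns of the list, or an empty list if the input
    // does not have the expected shape.
    public static List<string> Parse(string input)
    {
        var nouns = new List<string>();
        Match m = ListPattern.Match(input);
        if (!m.Success)
        {
            return nouns;
        }
        foreach (Capture c in m.Groups["noun"].Captures)
        {
            nouns.Add(c.Value);
        }
        return nouns;
    }

    public static void Main()
    {
        foreach (string noun in Parse("one, two, three, four and five"))
        {
            Console.WriteLine(noun);
        }
    }
}
```

With this sketch the commas and the "and" are consumed outside the `noun` group, so they never appear in a capture. The bonus cases "one" and "one and two" go through the same expression, and multi-word nouns such as "thirty one" stay in one piece because a noun is defined as anything up to the next comma or whole-word "and".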

I'd use a simple method instead of Regex, just to keep the code simple and readable for other developers.

The following code shows this method in action using a console app. Hope it helps.

Cheers!

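using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        string input = "one, two, three, four, five, thirty one and six";

        // Get all nouns into a string array
        string[] allNouns = getNouns(input);

        // Print the result
        foreach (string noun in allNouns)
        {
            Console.WriteLine(noun);
        }
        Console.ReadLine();
    }

    private static string[] getNouns(string input)
    {
        // Split on commas; only the last piece can still contain the "and"
        string[] pieces = input.Split(',');
        List<string> nouns = new List<string>();

        for (int i = 0; i < pieces.Length - 1; i++)
        {
            nouns.Add(pieces[i].Trim());
        }

        // Pad with spaces so "and" is found only as a whole word
        // (covers both "four and five" and ", and three")
        string last = " " + pieces[pieces.Length - 1].Trim() + " ";
        int andIndex = last.IndexOf(" and ", StringComparison.OrdinalIgnoreCase);

        if (andIndex >= 0)
        {
            nouns.Add(last.Substring(0, andIndex).Trim());
            nouns.Add(last.Substring(andIndex + " and ".Length).Trim());
        }
        else
        {
            nouns.Add(last.Trim());
        }

        // Drop empty entries, e.g. the text before "and" in ", and three"
        nouns.RemoveAll(string.IsNullOrEmpty);

        return nouns.ToArray();
    }
}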
