
Regular expression to recognize (X, Y, … and Z) in C#?

Given the set of input strings below:

  • one, two, and three
  • one, two, three, and four
  • one, two, three, four and five

(... and so on for the N + 1 cases)

How can I construct a regular expression that recognizes phrases like this for any number of nouns and returns each comma-delimited noun, plus the final noun that follows the "and" conjunction, as a separate capture group? If that is not possible, what approach would you use to parse and capture input such as this? I'm using the C# Regex object for parsing.

Note: the nouns here are just sample data (one, two, three, four, five, etc.), and the spaces following the commas might be omitted. Also, the nouns might be multi-word phrases separated by spaces.

Bonus round: can you also cleanly recognize the non-comma-delimited cases (for example, "one" and "one and two") in the same expression, with capturing?

Try this regex (the backslashes are doubled, as they would be inside a regular C# string literal):

\\b((?!and)\\w+)\\b

Regex Demo : http://regex101.com/r/kC5rR2

You can also check the result at RegexPal
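
For reference, here is a minimal sketch of how the pattern in this answer might be applied with Regex.Matches. The class name Demo and the sample input are made up for illustration. The pattern is written as a verbatim string, so the backslashes are single, and the extra \b after "and" is an assumption added here so that words that merely start with "and" (such as "android") are not skipped.

```csharp
using System;
using System.Text.RegularExpressions;

class Demo
{
    static void Main()
    {
        // Hypothetical sample input, not taken from the answer.
        string input = "one, two, three, four and five";

        // Match every whole word except the standalone word "and".
        // Group 1 holds the word itself.
        foreach (Match m in Regex.Matches(input, @"\b((?!and\b)\w+)\b"))
        {
            Console.WriteLine(m.Groups[1].Value);
        }
    }
}
```

Because \w+ stops at whitespace, a multi-word noun such as "thirty one" comes back as two separate matches, so this approach only fits single-word nouns.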


Try this. However, I can't get rid of the "," in the final match when the input has the form "two, and three":

(?<word>\w+,* and \w+)|(?<word>(?<=^|,\s?|and )\w+)
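
One possible way around the stray comma is to split on the delimiters instead of matching the words. This is a different technique from the pattern above, not a fix to it, and the class name SplitDemo, the method name SplitNouns and the delimiter pattern are all assumptions made for this sketch. The pattern treats a comma (optionally followed by "and") or a standalone "and" as the separator, so everything between separators is returned as one noun.

```csharp
using System;
using System.Text.RegularExpressions;

class SplitDemo
{
    // Hypothetical helper: splits on "," / ", and" / " and ".
    // The group is non-capturing so Regex.Split returns only the nouns.
    static string[] SplitNouns(string input)
    {
        return Regex.Split(
            input.Trim(),
            @"\s*,\s*(?:and\s+)?|\s+and\s+",
            RegexOptions.IgnoreCase);
    }

    static void Main()
    {
        string[] samples =
        {
            "one",
            "one and two",
            "one, two, and three",
            "one,two, thirty one and six"
        };

        foreach (string sample in samples)
        {
            // Join with "|" to make the noun boundaries visible.
            Console.WriteLine(string.Join("|", SplitNouns(sample)));
        }
    }
}
```

Multi-word nouns such as "thirty one" stay intact, and the comma-free cases from the bonus round go through the same pattern. The trade-off is that a noun which itself contains a standalone "and" (for example "rock and roll") would be split in two.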

I'd use a simple method instead of Regex just to keep the code simple and readable to other developers.

The following code shows this method in action in a console app. Hope it helps you.

Cheers!

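using System;

class Program
{
    static void Main(string[] args)
    {
        string input = "one, two, three, four, five, thirty one and six";

        // Get all nouns into a string array
        string[] allNouns = getNouns(input);

        // Print the result
        foreach (string noun in allNouns)
        {
            Console.WriteLine(noun);
        }
        Console.ReadLine();
    }

    private static string[] getNouns(string input)
    {
        string[] nouns = input.Split(',');

        // Only the last comma-separated segment can contain the "and" conjunction
        string last = nouns[nouns.Length - 1].Trim();

        if (last.StartsWith("and ", StringComparison.OrdinalIgnoreCase))
        {
            // Comma before "and" ("two, and three"): drop the leading "and"
            nouns[nouns.Length - 1] = last.Substring(4);
        }
        else
        {
            // No comma before "and" ("five and six"): split on the last standalone " and ".
            // Searching for " and " with surrounding spaces avoids cutting words like "sand".
            int andIndex = last.LastIndexOf(" and ", StringComparison.OrdinalIgnoreCase);
            if (andIndex >= 0)
            {
                Array.Resize(ref nouns, nouns.Length + 1);
                nouns[nouns.Length - 2] = last.Substring(0, andIndex);
                nouns[nouns.Length - 1] = last.Substring(andIndex + 5);
            }
        }

        // Remove the whitespace left around each noun
        for (int i = 0; i < nouns.Length; i++)
        {
            nouns[i] = nouns[i].Trim();
        }

        return nouns;
    }
}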
