Regex to match comma-separated words with final "and" clause
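I need a regular expression that matches the words or phrases in an English-language list, in one of the following forms:
In other words, the regex should let me identify each phrase in a list of English phrases, where all phrases except the last (when there are more than two) are separated by commas, and the final "and" may or may not be preceded by a comma.
Getting the comma-separated matches is easy: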
[^,]+
But I don't know how to handle the optional final "and" separator (the one without a preceding comma).
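To make the problem concrete, here is a minimal sketch of what the comma-only pattern returns for one of the sample inputs. The class and method names (CommaOnlyDemo, MatchNonComma) are hypothetical and used only for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CommaOnlyDemo
{
    // Hypothetical helper: returns every run of non-comma characters.
    public static string[] MatchNonComma(string input) =>
        Regex.Matches(input, @"[^,]+")
             .Cast<Match>()
             .Select(m => m.Value)
             .ToArray();

    public static void Main()
    {
        // Brackets make leading whitespace in each match visible.
        foreach (var piece in MatchNonComma("Some words, more words and some other words"))
            Console.WriteLine("[" + piece + "]");
    }
}
```

The second match comes back as " more words and some other words": the last two phrases stay glued together, and the space after the comma is kept.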
One approach is to split the string on and (optionally preceded by a comma) or on a comma:
// Requires: using System; using System.Text.RegularExpressions;
string[] inp = new string[] {
    "Some words",
    "Some words and some other words",
    "Some words, more words and some other words",
    "Some words, more words, and some other words"
};
foreach (string s in inp) {
    // Split on "and" (preceded by a comma or whitespace) or on a bare comma.
    string[] phrases = Regex.Split(s, @"(?:,\s*|\s+)and\s+|,\s*");
    Console.WriteLine(string.Join("\n", phrases));
}
Output:
Some words
Some words
some other words
Some words
more words
some other words
Some words
more words
some other words
You can use the following pattern with Regex.Split:
\s*(?:(?:,\s*)?\band\s+|,\s*)
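See the regex demo.
Details:
- \s* - zero or more whitespace characters
- (?:(?:,\s*)?\band\s+|,\s*) - either of two alternatives:
  - (?:,\s*)?\band\s+ - an optional comma followed by zero or more whitespace characters, then the whole word "and" followed by one or more whitespace characters
  - | - or
  - ,\s* - a comma followed by zero or more whitespace characters.
See the C# demo: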
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public class Test
{
    public static void Main()
    {
        var texts = new List<string> {
            "Some words",
            "Some words and some other words",
            "Some words, more words and some other words",
            "Some words, more words, and some other words"
        };
        var pattern = @"\s*(?:(?:,\s*)?\band\s+|,\s*)";
        foreach (var text in texts)
        {
            var result = Regex.Split(text, pattern).Where(x => !String.IsNullOrWhiteSpace(x)).ToList();
            Console.WriteLine("'{0}' => ['{1}']", text, string.Join("', '", result));
        }
    }
}
Output:
'Some words' => ['Some words']
'Some words and some other words' => ['Some words', 'some other words']
'Some words, more words and some other words' => ['Some words', 'more words', 'some other words']
'Some words, more words, and some other words' => ['Some words', 'more words', 'some other words']
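The role of \b and the trailing \s+ is easier to see on input where "and" also occurs inside other words. The following is a minimal sketch that reuses the pattern above unchanged; the class and method names (PhraseSplitter, SplitPhrases) and the test sentence are assumptions made for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PhraseSplitter
{
    // Same pattern as in the demo above, built once and reused.
    private static readonly Regex Separator =
        new Regex(@"\s*(?:(?:,\s*)?\band\s+|,\s*)");

    // Hypothetical helper: splits a list-like phrase into its items
    // and drops empty or whitespace-only pieces.
    public static string[] SplitPhrases(string text) =>
        Separator.Split(text)
                 .Where(x => !string.IsNullOrWhiteSpace(x))
                 .ToArray();

    public static void Main()
    {
        var parts = SplitPhrases("sand and stone, android phones and bands");
        Console.WriteLine(string.Join(" | ", parts));
        // prints: sand | stone | android phones | bands
    }
}
```

"sand" and "bands" are left alone because \b requires a word boundary before "and", and "android" is left alone because \s+ requires whitespace after it. Two limitations follow from the pattern as written: it is case-sensitive, so a capitalized "And" is not treated as a separator, and an item that itself contains a standalone "and" (such as "salt and pepper") would be split in two.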
You can try
[Some|some|more]+\s(?:[a-z]+)?\s?words
Hope this helps!