Regex - how to match multiple properly quoted substrings
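I am trying to use a regular expression to extract quote-wrapped strings from a (C#) string that is a comma-separated list of such strings. I need to extract all properly quoted substrings and ignore those that are missing a quote.
For example, given this string
"animal,dog,cat","ecoli, verification,"streptococcus"
I need to extract "animal,dog,cat" and "streptococcus".
I have tried various regex solutions from this forum, but they all seem to either find only the first substring, or incorrectly match "ecoli, verification," and ignore "streptococcus".
Is this solvable?
TIA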
Try this:
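using System;
using System.Text.RegularExpressions;

string input = "\"animal,dog,cat\",\"ecoli, verification,\"streptococcus\"";
// Opening quote, lazily matched non-quote characters, a final character that is not a comma, closing quote.
string pattern = "\"([^\"]+?[^,])\"";
var matches = Regex.Matches(input, pattern);
foreach (Match m in matches)
    Console.WriteLine(m.Groups[1].Value); // group 1 is the text between the quotes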
PS: But I agree with the commenters: fix the source.
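One caveat with the pattern above: because `[^\"]+?` is followed by `[^,]`, it needs at least two characters between the quotes, and the `[^,]` can itself match a quote. Below is a minimal sketch of a stricter variant, assuming values never contain embedded quotes. The class name `StrictQuoted`, the method `Extract` and the altered pattern are illustrative and not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class StrictQuoted
{
    // Hypothetical variant of the pattern above: the last character inside the
    // quotes may be neither a comma nor a quote, and the part before it may be empty.
    private const string Pattern = "\"([^\"]*[^\",])\"";

    // Returns the text between the quotes of every match.
    public static List<string> Extract(string input)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
            result.Add(m.Groups[1].Value);
        return result;
    }

    public static void Main()
    {
        // Sample string from the question, then a list with a one-character value.
        Console.WriteLine(string.Join(" | ",
            Extract("\"animal,dog,cat\",\"ecoli, verification,\"streptococcus\"")));
        Console.WriteLine(string.Join(" | ", Extract("\"a\",\"bc\"")));
    }
}
```

For the sample string this still yields animal,dog,cat and streptococcus. For the second input it yields both a and bc, whereas the original pattern finds only bc.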
I suggest this:
"(?>[^",]*(?>,[^",]+)*)"
Explanation:
"               # Match a starting quote
(?>             # Atomic (non-capturing) group to avoid catastrophic backtracking:
  [^",]*        # - any number of characters except commas or quotes
  (?>           # - optionally followed by another (atomic) group:
    ,           # - which starts with a comma
    [^",]+      # - and contains at least one character besides commas or quotes.
  )*            # - (as said above, that group is optional but may occur many times)
)               # End of the outer atomic group
"               # Match a closing quote
Test it on regex101.com.
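To connect this to the C# code in the question, here is a minimal sketch of how the pattern might be used with `Regex.Matches`. The class name `QuotedFields` and the method `Extract` are made up for illustration. Since the pattern has no capturing group, the sketch strips the surrounding quotes from each match by hand.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QuotedFields
{
    // The pattern from the answer above, written as a C# verbatim string
    // (inside @"..." a literal quote is written as "").
    private static readonly Regex Quoted =
        new Regex(@"""(?>[^"",]*(?>,[^"",]+)*)""");

    // Returns every properly quoted substring, without the surrounding quotes.
    public static List<string> Extract(string input)
    {
        var result = new List<string>();
        foreach (Match m in Quoted.Matches(input))
        {
            // m.Value includes both quotes; drop one character from each end.
            result.Add(m.Value.Substring(1, m.Value.Length - 2));
        }
        return result;
    }

    public static void Main()
    {
        string input = "\"animal,dog,cat\",\"ecoli, verification,\"streptococcus\"";
        foreach (string s in Extract(input))
            Console.WriteLine(s);
    }
}
```

For the sample string from the question this prints animal,dog,cat and streptococcus. Note that the pattern also accepts an empty pair of quotes, so filter out empty results if those should not count.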