Regex - how to match multiple properly quoted substrings
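I am trying to use a regex to extract quote-wrapped strings from a (C#) string that is a comma-separated list of such strings. I need to extract all properly quoted substrings and ignore those that are missing a quotation mark.
For example, given this string
"animal,dog,cat","ecoli, verification,"streptococcus"
I need to extract "animal,dog,cat" and "streptococcus".
I have tried various regex solutions from this forum, but they all seem to either find only the first substring, or incorrectly match "ecoli, verification," and ignore "streptococcus".
Is this solvable?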
TIA
Try this:
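using System;
using System.Text.RegularExpressions;

string input = "\"animal,dog,cat\",\"ecoli, verification,\"streptococcus\"";
// A quote, then lazily one or more non-quote characters, where the character
// just before the closing quote must not be a comma.
string pattern = "\"([^\"]+?[^,])\"";
var matches = Regex.Matches(input, pattern);
foreach (Match m in matches)
    Console.WriteLine(m.Groups[1].Value);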
PS: But I agree with the commenters: fix the source.
I suggest this:
"(?>[^",]*(?>,[^",]+)*)"
Explanation:
" # Match a starting quote
(?> # Match in an atomic (non-capturing) group to avoid catastrophic backtracking:
[^",]* # - any number of characters except commas or quotes
(?> # - optionally followed by another (atomic) group:
, # - which starts with a comma
[^",]+ # - and contains at least one character besides comma or quotes.
)* # - (as said above, that group is optional but may occur many times)
) # End of the outer atomic group
" # Match a closing quote
Test it on regex101.com.
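For completeness, here is a minimal sketch of how the pattern above might be used from C#. The class name `QuotedListDemo` and the helper `ExtractQuoted` are made up for illustration and are not part of the answer. The pattern is written as a verbatim string, so each literal quote is doubled. Because the pattern has no capturing group, the sketch strips the surrounding quotes by hand, which is an assumption about the desired output.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QuotedListDemo
{
    // The suggested pattern as a C# verbatim string: inside @"...",
    // every literal double quote is written as "".
    private static readonly Regex QuotedItem =
        new Regex(@"""(?>[^"",]*(?>,[^"",]+)*)""");

    // Hypothetical helper: returns each properly quoted substring without its quotes.
    public static List<string> ExtractQuoted(string input)
    {
        var results = new List<string>();
        foreach (Match m in QuotedItem.Matches(input))
        {
            // The whole match includes the opening and closing quote; the content
            // itself can never contain a quote, so Trim removes exactly those two.
            results.Add(m.Value.Trim('"'));
        }
        return results;
    }

    public static void Main()
    {
        string input = "\"animal,dog,cat\",\"ecoli, verification,\"streptococcus\"";
        foreach (string item in ExtractQuoted(input))
            Console.WriteLine(item);
        // Prints:
        // animal,dog,cat
        // streptococcus
    }
}
```

The middle item is skipped because, after `ecoli, verification`, the next comma is followed directly by a quote, so the inner group cannot continue and no closing quote is found at that position. The regex engine then retries from the next quote, which opens `"streptococcus"`.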