Using Regex to match quoted string with embedded, non-escaped quotes
I am trying to match strings in the following format with a regex.
string text = "'Emma','The Last Leaf','Gulliver's travels'";
string pattern = @"'(.*?)',?";
foreach (Match match in Regex.Matches(text,pattern,RegexOptions.IgnoreCase))
{
Console.WriteLine(match + " " + match.Index);
Console.WriteLine(match.Groups[1].Captures[0]);
}
This matches "Emma" and "The Last Leaf" correctly, but the third match is "Gulliver", whereas the desired match is "Gulliver's travels". How can I build a regex for patterns like this?
Since the comma (,) is your delimiter, you can try changing your pattern like this. It should work.
string pattern = @"'(.*?)'(?:,|$)";
The way this works is that a single quote only ends a match when it is followed by a comma or the end of the line, so the quote inside Gulliver's is treated as ordinary text.
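As a quick check of that explanation, here is a minimal sketch that runs the suggested pattern against the sample string from the question. It is written as top-level statements (C# 9 or later), and the `titles` list is only an illustrative name, not something from the original post.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

string text = "'Emma','The Last Leaf','Gulliver's travels'";

// A quote only closes an item when a comma or the end of the input follows it.
string pattern = @"'(.*?)'(?:,|$)";

var titles = new List<string>();
foreach (Match match in Regex.Matches(text, pattern))
{
    // Group 1 holds the text between the delimiting quotes.
    titles.Add(match.Groups[1].Value);
}

foreach (string title in titles)
{
    Console.WriteLine(title);
}
// Prints:
// Emma
// The Last Leaf
// Gulliver's travels
```

Note that this still assumes no title contains a quote immediately followed by a comma; such input would be split early.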
I think '(.*?)',|'(.*)' would work as the regex.
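One detail worth keeping in mind with this alternation is that the captured text lands in group 1 for items followed by a comma and in group 2 for the final item. The sketch below, written as top-level statements (C# 9 or later) with the illustrative name `titles`, picks whichever group took part in each match.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

string text = "'Emma','The Last Leaf','Gulliver's travels'";

// First branch: an item followed by a comma (lazy).
// Second branch: the last item, matched greedily up to the final quote.
string pattern = @"'(.*?)',|'(.*)'";

var titles = new List<string>();
foreach (Match match in Regex.Matches(text, pattern))
{
    // Only one of the two groups participates in any given match.
    Group item = match.Groups[1].Success ? match.Groups[1] : match.Groups[2];
    titles.Add(item.Value);
}

Console.WriteLine(string.Join(" | ", titles));
// Prints: Emma | The Last Leaf | Gulliver's travels
```

This works for the sample input because the embedded quote sits in the last item; a quote directly followed by a comma inside an earlier item would still end that item early.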
You may consider using lookbehind/lookahead:
"(?<=^'|',').*?(?='$|',')"
Test with grep:
kent$ echo "'Emma','The Last Leaf','Gulliver's travels'"|grep -Po "(?<=^'|',').*?(?='$|',')"
Emma
The Last Leaf
Gulliver's travels
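The same lookaround pattern can be tried in C#, since the .NET regex engine supports lookbehind with alternatives of different lengths. The following is a minimal sketch using top-level statements (C# 9 or later); `titles` is an illustrative name. Because the lookarounds do not consume the delimiters, the whole match is the title and no capture group is needed.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

string text = "'Emma','The Last Leaf','Gulliver's travels'";

// Lookbehind: the item starts after an opening quote at the start, or after ','.
// Lookahead: the item ends before a closing quote at the end, or before ','.
string pattern = @"(?<=^'|',').*?(?='$|',')";

var titles = new List<string>();
foreach (Match match in Regex.Matches(text, pattern))
{
    // The delimiters are asserted, not consumed, so Value is the bare title.
    titles.Add(match.Value);
}

foreach (string title in titles)
{
    Console.WriteLine(title);
}
// Prints:
// Emma
// The Last Leaf
// Gulliver's travels
```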
You can't. If you have single-quote-delimited strings and Gulliver's contains a single, unescaped quote, there's no way to distinguish it from the end of a string. You could always just split on commas and trim the ' characters from either side, but I'm not sure that's what you want:
string text = "'Emma','The Last Leaf','Gulliver's travels'";
// Split on the comma delimiter, then strip the surrounding single quotes from each piece.
foreach (string s in text.Split(new char[] {','})) {
    Console.WriteLine(s.Trim('\''));
}