
Using Regex to match quoted string with embedded, non-escaped quotes

I am trying to match a string in the following pattern with a regex.

string text = "'Emma','The Last Leaf','Gulliver's travels'";
string pattern = @"'(.*?)',?";

foreach (Match match in Regex.Matches(text, pattern, RegexOptions.IgnoreCase))
{
    Console.WriteLine(match + " " + match.Index);
    Console.WriteLine(match.Groups[1].Captures[0]);
}

This matches "Emma" and "The Last Leaf" correctly, but the third match is "Gulliver", whereas the desired match is "Gulliver's travels". How can I build a regex for patterns like this?

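Since , is your delimiter, you can change your pattern so that the closing quote must be followed by a comma or the end of the line:

string pattern = @"'(.*?)'(?:,|$)";

This works because a quote only ends an item when it is immediately followed by a comma or the end of the line, so the apostrophe in Gulliver's is treated as ordinary text.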
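
As a minimal sketch of how that pattern behaves on the sample input (the class and method names, QuotedListParser and Parse, are made up for illustration and are not part of the original answer):

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class QuotedListParser
{
    // A closing quote only counts when a comma or the end of the input follows it.
    private static readonly Regex ItemPattern = new Regex(@"'(.*?)'(?:,|$)");

    public static List<string> Parse(string text)
    {
        var items = new List<string>();
        foreach (Match match in ItemPattern.Matches(text))
        {
            // Group 1 holds the text between the delimiting quotes.
            items.Add(match.Groups[1].Value);
        }
        return items;
    }
}
```

Calling QuotedListParser.Parse on the text from the question should return Emma, The Last Leaf and Gulliver's travels. An item that itself contains a quote directly followed by a comma would still be split in the wrong place, so this relies on that sequence never appearing inside an item.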

I think '(.*?)',|'(.*)' would work as the regex here.
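
One thing to watch with this alternation is that it has two capture groups, so the item text ends up in group 1 or group 2 depending on which branch matched. A rough sketch of reading the right group, assuming the same sample input (AlternationParser and Parse are hypothetical names):

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class AlternationParser
{
    // First branch: an item followed by a comma. Second branch: the final item.
    private static readonly Regex ItemPattern = new Regex(@"'(.*?)',|'(.*)'");

    public static List<string> Parse(string text)
    {
        var items = new List<string>();
        foreach (Match match in ItemPattern.Matches(text))
        {
            // Only one of the two groups participates in any given match.
            Group captured = match.Groups[1].Success ? match.Groups[1] : match.Groups[2];
            items.Add(captured.Value);
        }
        return items;
    }
}
```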

You may consider using lookbehind/lookahead:

 "(?<=^'|',').*?(?='$|',')"

Test with grep:

kent$  echo "'Emma','The Last Leaf','Gulliver's travels'"|grep -Po "(?<=^'|',').*?(?='$|',')"
Emma
The Last Leaf
Gulliver's travels
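
The same lookaround pattern should also be usable from C#, since .NET regexes support lookbehind and lookahead. A minimal sketch, assuming the input from the question (LookaroundParser and Parse are hypothetical names):

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class LookaroundParser
{
    // The lookbehind anchors on an opening quote (at the start or after ',')
    // and the lookahead on a closing quote (at the end or before ','),
    // so the match itself contains only the item text.
    private static readonly Regex ItemPattern =
        new Regex(@"(?<=^'|',').*?(?='$|',')");

    public static List<string> Parse(string text)
    {
        var items = new List<string>();
        foreach (Match match in ItemPattern.Matches(text))
        {
            // No capture group is needed: the delimiters are never consumed.
            items.Add(match.Value);
        }
        return items;
    }
}
```

Because the quotes and commas sit inside the lookarounds, match.Value is already the bare item and match.Index points at its first character rather than at the opening quote.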

Strictly speaking, you can't: if the strings are delimited by single quotes and Gulliver's contains a single, unescaped quote, there is no way to distinguish that quote from the end of a string. You could always just split on commas and trim the ' characters from either side, but I'm not sure that's what you want:

string text = "'Emma','The Last Leaf','Gulliver's travels'";

foreach (string s in text.Split(',')) {
    Console.WriteLine(s.Trim('\''));
}


 