I need to split a text by commas, but the text also has commas inside brackets, and those need to be ignored.
Input text: Selectroasted peanuts, Sugars (sugar, fancymolasses) ,Hydrogenatedvegetable oil (cottonseed and rapeseed oil),Salt.
Expected output:
My code:
string pattern = @"\s*(?:""[^""]*""|\([^)]*\)|[^, ]+)";
string input = "Selectroasted peanuts,Sugars (sugar, fancymolasses),Hydrogenatedvegetable oil (cottonseed and rapeseed oil),Salt.";
foreach (Match m in Regex.Matches(input, pattern))
{
    Console.WriteLine("{0}", m.Value);
}
The output I am getting:
Please help.
You can use
string pattern = @"(?:""[^""]*""|\([^()]*\)|[^,])+";
string input = "Selectroasted peanuts,Sugars (sugar, fancymolasses),Hydrogenatedvegetable oil (cottonseed and rapeseed oil),Salt.";
foreach (Match m in Regex.Matches(input.TrimEnd(new[] {'!', '?', '.', '…'}), pattern))
{
    Console.WriteLine("{0}", m.Value);
}
// => Selectroasted peanuts
// Sugars (sugar, fancymolasses)
// Hydrogenatedvegetable oil (cottonseed and rapeseed oil)
// Salt
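The input string as first written in the question has stray spaces around some commas (", Sugars" and ") ,Hydrogenatedvegetable"). Because [^,] also matches spaces, those spaces end up inside the matches. A minimal sketch of one way to deal with that, assuming it is acceptable to trim each item; the class name IngredientSplitter and the method Split are hypothetical names used only for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class IngredientSplitter
{
    // Same pattern as in the answer: a quoted string, a non-nested (...) group,
    // or any single char other than a comma, repeated one or more times.
    private static readonly Regex ItemPattern =
        new Regex(@"(?:""[^""]*""|\([^()]*\)|[^,])+");

    // Hypothetical helper: returns the items with surrounding whitespace removed.
    public static List<string> Split(string input)
    {
        return ItemPattern.Matches(input.TrimEnd('.'))
            .Cast<Match>()
            .Select(m => m.Value.Trim())   // drop spaces left around the commas
            .Where(s => s.Length > 0)      // skip items that were only whitespace
            .ToList();
    }

    public static void Main()
    {
        string input = "Selectroasted peanuts, Sugars (sugar, fancymolasses) ,Hydrogenatedvegetable oil (cottonseed and rapeseed oil),Salt.";
        foreach (string item in Split(input))
            Console.WriteLine(item);
    }
}
```

With this input, the sketch prints the same four items as above, without leading or trailing spaces.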
See the C# demo. See the regex demo, too. The pattern matches one or more occurrences of:
- "[^"]*" - a ", then zero or more chars other than ", and then a "
- | - or
- \([^()]*\) - a (, then zero or more chars other than ( and ), and then a ) char
- | - or
- [^,] - a char other than a ,

Note that the .TrimEnd(new[] {'!', '?', '.', '…'}) part in the code snippet is meant to remove the trailing sentence punctuation. If you can afford Salt. (with the trailing dot) in the output, you can remove that part.
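Since \([^()]*\) only matches a parenthesized group that contains no further parentheses, a comma inside nested parentheses can still split an item. If nesting can occur in your data, one alternative is to avoid the regex and count the parenthesis depth by hand. This is a different technique from the pattern above, shown only as a rough sketch: DepthSplitter and SplitTopLevel are hypothetical names, and quoted strings are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class DepthSplitter
{
    // Hypothetical helper: splits on commas that are outside all parentheses.
    public static List<string> SplitTopLevel(string input)
    {
        var items = new List<string>();
        var current = new StringBuilder();
        int depth = 0; // number of currently open '(' groups

        foreach (char c in input)
        {
            if (c == '(') depth++;
            else if (c == ')' && depth > 0) depth--;

            if (c == ',' && depth == 0)
                AddItem(items, current);   // top-level comma ends the item
            else
                current.Append(c);         // everything else belongs to the item
        }
        AddItem(items, current);           // last item has no trailing comma
        return items;
    }

    // Trims the collected text, stores it if non-empty, and resets the buffer.
    private static void AddItem(List<string> items, StringBuilder current)
    {
        string item = current.ToString().Trim();
        if (item.Length > 0) items.Add(item);
        current.Clear();
    }

    public static void Main()
    {
        string input = "Oil (cottonseed, (refined) rapeseed), Salt";
        foreach (string item in SplitTopLevel(input))
            Console.WriteLine(item);
        // => Oil (cottonseed, (refined) rapeseed)
        //    Salt
    }
}
```

An unmatched closing parenthesis is ignored for the depth count here, which is a design choice for tolerance of sloppy input rather than a requirement.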