Regex to split and ignore brackets

I need to split a text on commas, but the text also contains commas inside brackets, and those must be ignored.

Input text: Selectroasted peanuts, Sugars (sugar, fancymolasses) ,Hydrogenatedvegetable oil (cottonseed and rapeseed oil),Salt.

Expected output:

  • Selectroasted peanuts
  • Sugars (sugar, fancymolasses)
  • Hydrogenatedvegetable oil (cottonseed and rapeseed oil)
  • Salt

My code:

string pattern = @"\s*(?:""[^""]*""|\([^)]*\)|[^, ]+)";
string input = "Selectroasted peanuts,Sugars (sugar, fancymolasses),Hydrogenatedvegetable oil (cottonseed and rapeseed oil),Salt.";
foreach (Match m in Regex.Matches(input, pattern))
{
    Console.WriteLine("{0}", m.Value);
}

The output I am getting:

  • Selectroasted
  • peanuts
  • Sugars
  • (sugar, fancymolasses)
  • Hydrogenatedvegetable
  • oil
  • (cottonseed and rapeseed oil)
  • Salt

Please help.

You can use

string pattern = @"(?:""[^""]*""|\([^()]*\)|[^,])+";
string input = "Selectroasted peanuts,Sugars (sugar, fancymolasses),Hydrogenatedvegetable oil (cottonseed and rapeseed oil),Salt."; 
foreach (Match m in Regex.Matches(input.TrimEnd(new[] {'!', '?', '.', '…'}), pattern)) 
{ 
    Console.WriteLine("{0}", m.Value); 
}
// => Selectroasted peanuts
//    Sugars (sugar, fancymolasses)
//    Hydrogenatedvegetable oil (cottonseed and rapeseed oil)
//    Salt

See the C# demo and the regex demo. The pattern matches one or more occurrences of:

  • "[^"]*" - a " char, then zero or more chars other than ", and then a " char (written as ""[^""]*"" inside the C# verbatim string)
  • | - or
  • \([^()]*\) - a ( char, then zero or more chars other than ( and ), and then a ) char
  • | - or
  • [^,] - any char other than a , char.
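
To see this alternation applied to the input text exactly as given in the question (which has stray spaces around some commas), here is a minimal sketch. The class name IngredientSplitter and the method Split are hypothetical names chosen for illustration. The per-match Trim() and the empty-item check are additions that the snippet above does not include.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class IngredientSplitter
{
    // Same pattern as above: a quoted run, a (...) group, or any non-comma char, repeated.
    private static readonly Regex Item =
        new Regex(@"(?:""[^""]*""|\([^()]*\)|[^,])+");

    public static List<string> Split(string text)
    {
        var items = new List<string>();

        // Drop trailing sentence punctuation, as in the answer's snippet.
        string cleaned = text.TrimEnd('!', '?', '.', '…');

        foreach (Match m in Item.Matches(cleaned))
        {
            // Added step: strip the whitespace left around the commas.
            string item = m.Value.Trim();
            if (item.Length > 0)
            {
                items.Add(item);
            }
        }
        return items;
    }
}
```

Calling IngredientSplitter.Split on the question's input text should give the four expected items, with "Sugars (sugar, fancymolasses)" kept as one entry because the (...) alternative is tried before [^,] and consumes the inner comma.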

Note that the .TrimEnd(new[] {'!', '?', '.', '…'}) call in the code snippet removes trailing sentence punctuation. If you can afford Salt. (with the trailing period) in the output, you can drop that call.
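
If an actual split is preferred over matching, one possible alternative (not from the answer above) is Regex.Split with a negative lookahead. This is only a sketch under the assumption that parentheses are balanced and never nested. The class name BracketAwareSplit is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, shown as an alternative technique.
public static class BracketAwareSplit
{
    // Assumption: parentheses are balanced and not nested.
    // A comma is treated as a separator only when the next bracket char
    // after it is not ")", i.e. the comma is not inside a (...) group.
    // The \s* parts also swallow whitespace around the separator.
    private static readonly Regex Separator =
        new Regex(@"\s*,\s*(?![^()]*\))");

    public static string[] Split(string text)
    {
        // Remove outer whitespace and trailing sentence punctuation first.
        string cleaned = text.Trim().TrimEnd('!', '?', '.', '…');
        return Separator.Split(cleaned);
    }
}
```

Unlike the match-based pattern, this version does not handle quoted sections, and consecutive commas would produce empty entries, so it fits only inputs shaped like the one in the question.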
