How to split a comma delimited string with embedded quoted strings?

I have a string and I want to split this string into an array as follows:

string stemp = "a,b,c,\"d,e f\",g,h";
array[0] = a
array[1] = b
array[2] = c
array[3] = d,e f
array[4] = g
array[5] = h

I have tried the following syntax:

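// Splits on every comma, including the one inside the quoted field,
// so "d,e f" ends up broken across two elements.
string[] array = stemp.Split(',');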

This looks like CSV - which is not so simple to parse (when taking escapes into consideration).

I suggest using a CSV parser, such as the TextFieldParser class that lives in the Microsoft.VisualBasic.FileIO namespace.

There are many alternatives, such as FileHelpers.
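
As a minimal sketch of the TextFieldParser suggestion, the following wraps the string in a StringReader and reads a single record. The class and method names (CsvLineSketch, SplitLine) are made up for illustration, and on .NET Framework the project is assumed to reference the Microsoft.VisualBasic assembly.

```csharp
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CsvLineSketch
{
    // Hypothetical helper: parses one comma-delimited line and returns its fields.
    public static string[] SplitLine(string line)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            // Treat "d,e f" as one field and strip the surrounding quotes.
            parser.HasFieldsEnclosedInQuotes = true;
            return parser.ReadFields();
        }
    }
}
```

Calling CsvLineSketch.SplitLine("a,b,c,\"d,e f\",g,h") should give six fields, with "d,e f" kept together as the fourth one.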

Using a CSV parser is probably the right solution but you can also use a regular expression:

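// Requires: using System.Linq; using System.Text.RegularExpressions;
var stemp = @"a,b,c,""d,e f"",g,h";
var regex = new Regex(@"^(?:""(?<item>[^""]*)""|(?<item>[^,]*))(?:,(?:""(?<item>[^""]*)""|(?<item>[^,]*)))*$");
// Every field, quoted or not, is captured into the same named group "item".
var array = regex
  .Match(stemp)
  .Groups["item"]
  .Captures
  .Cast<Capture>()
  .Select(c => c.Value)
  .ToArray();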

Unfortunately, regular expressions tend to be incomprehensible, so here is a short description of the individual parts:

""(?<item>[^""]*)""

This matches "d,e f".

(?<item>[^,]*)

This matches a, b, and so on. Both expressions capture the relevant part into the named group item.

These expressions (let's call them A and B) are combined using an alternation construct and grouped using a non-capturing group:

(?:A|B)

Let's call this new expression C. The entire expression is then (again using a non-capturing group):

^C(?:,C)*$
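
To make the composition concrete, here is a sketch that assembles the same pattern from the parts A, B and C described above. The class and method names (QuotedSplitSketch, Split) are illustrative only. Note that, like the original expression, it does not handle a doubled quote ("") inside a quoted field.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class QuotedSplitSketch
{
    // A: a quoted field; the quotes are matched but not captured.
    const string A = @"""(?<item>[^""]*)""";
    // B: an unquoted field, i.e. any run of characters up to the next comma.
    const string B = @"(?<item>[^,]*)";
    // C: either kind of field, wrapped in a non-capturing group.
    static readonly string C = "(?:" + A + "|" + B + ")";
    // The whole line: one field followed by zero or more ",field" repetitions.
    static readonly Regex Pattern = new Regex("^" + C + "(?:," + C + ")*$");

    public static string[] Split(string line)
    {
        return Pattern.Match(line)
            .Groups["item"]
            .Captures
            .Cast<Capture>()
            .Select(c => c.Value)
            .ToArray();
    }
}
```

Building the pattern from named pieces keeps each part readable on its own, and QuotedSplitSketch.Split(stemp) returns the same six items as the single-string version.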
