
javascript/regex to ignore semicolons in double quotes

I've been stumped for a bit on this one. I have a string that is almost semicolon-delimited; it looks something like this:

one; two; three "four; five;six"; seven

I'd like to split this up using a regex in JavaScript into an array like this (i.e. ignoring any semicolons inside double quotes):

['one','two','three "four; five;six"','seven']

I've tried adapting known working CSV functions, but they can't seem to be adapted to work with the third element ('three "four;five;six"').

It seems like a regex type of problem, but if a solution exists using more than regex, I'm certainly interested!

Update: I should also note that there may be spaces before or after the semicolons in the quoted string. I've updated the example to reflect that.

Assuming you don't allow for escaped quotes inside your quotes (e.g. "this has \"escaped quotes\" inside"), this should work:

var rx = /(?!;|$)[^;"]*(("[^"]*")[^;"]*)*/g;
var str = 'one; two; three "four;five;six"; seven';
var res = str.match(rx);
// res = ['one', ' two', ' three "four;five;six"', ' seven']

Note that you need the negative lookahead (?!;|$) at the beginning of the regex to keep it from matching the empty string. Every other part of the pattern is optional, so without the lookahead the match method also returns an empty string in front of each semicolon and at the end of the input.
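To see the effect of the lookahead, here is a small sketch that runs the same pattern with and without it. The variable names (`withoutLookahead`, `withLookahead`, `noisy`, `clean`) are mine, for illustration only.

```javascript
// Same pattern as in the answer, once without and once with the leading (?!;|$).
const withoutLookahead = /[^;"]*(("[^"]*")[^;"]*)*/g;
const withLookahead = /(?!;|$)[^;"]*(("[^"]*")[^;"]*)*/g;
const input = 'one; two; three "four;five;six"; seven';

// Without the lookahead, the pattern can succeed with a zero-length match,
// so empty strings show up between the real fields.
const noisy = input.match(withoutLookahead);

// With the lookahead, a match may not start at a semicolon or at the end
// of the input, so only the real fields remain.
const clean = input.match(withLookahead);

console.log(noisy);
console.log(clean); // ['one', ' two', ' three "four;five;six"', ' seven']
```

If you prefer to keep the simpler pattern, filtering the result with `noisy.filter(Boolean)` should give the same fields.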

Update:

I think this regular expression should work with escaped quotes as well (although I'd appreciate feedback on its correctness). I've also added \s to the negative lookahead so that whitespace after the preceding semicolon is stripped.

/(?!\s|;|$)[^;"]*("(\\.|[^\\"])*"[^;"]*)*/g
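A quick way to try this pattern is sketched below. The test strings are my own, not from the question; note that in a JavaScript string literal each backslash has to be doubled.

```javascript
const rx = /(?!\s|;|$)[^;"]*("(\\.|[^\\"])*"[^;"]*)*/g;

// The actual string is: one; two; three "four;\"five\";six"; seven
const str = 'one; two; three "four;\\"five\\";six"; seven';
const res = str.match(rx);
console.log(res);
// ['one', 'two', 'three "four;\\"five\\";six"', 'seven']

// Leading whitespace is skipped by the lookahead, but whitespace
// before a semicolon is still part of the match.
const padded = 'a ; b'.match(rx);
console.log(padded); // ['a ', 'b']
```

If trailing whitespace matters, mapping the result through `s => s.trim()` is probably the simplest fix.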

This strips spaces before and after semicolons:

'one; two; three "four;five;six"; seven'.match(/(?!;| |$)([^";]*"[^"]*")*([^";]*[^ ";])?/g)

['one', 'two', 'three "four;five;six"', 'seven']

'one ; two"; three ; "four" ; five ; "six ; seven'.match(/(?!;| |$)([^";]*"[^"]*")*([^";]*[^ ";])?/g)

['one', 'two"; three ; "four" ; five ; "six', 'seven']

It doesn't try to deal with escaped quotes though.
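Since the question is open to non-regex solutions, here is a minimal sketch of a character-by-character scanner that also copes with backslash-escaped quotes. The function name `splitOutsideQuotes` and its `delimiter` parameter are hypothetical, and the escaping rule (a backslash inside quotes protects the next character) is an assumption about the input format.

```javascript
// Hypothetical helper: split on a delimiter, ignoring delimiters inside
// double quotes. Assumes \" inside quotes is an escaped quote.
function splitOutsideQuotes(text, delimiter = ';') {
  const fields = [];
  let current = '';
  let inQuotes = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (ch === '\\' && inQuotes && i + 1 < text.length) {
      // Copy the backslash and the escaped character unchanged.
      current += ch + text[i + 1];
      i++;
    } else if (ch === '"') {
      inQuotes = !inQuotes;
      current += ch;
    } else if (ch === delimiter && !inQuotes) {
      // End of a field: trim surrounding whitespace and start a new one.
      fields.push(current.trim());
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current.trim());
  return fields;
}

const parts = splitOutsideQuotes('one; two; three "four; five;six"; seven');
console.log(parts); // ['one', 'two', 'three "four; five;six"', 'seven']
```

Unlike the regex versions, this keeps empty fields (`'a;;b'` gives `['a', '', 'b']`) and treats everything after an unterminated quote as part of the same field, which may or may not be what you want.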
