JavaScript/RegEx: Split a string by commas but ignore commas within double-quotes
I know similar questions exist, but I could not find this case.
CASE 1: 'a,b,c,d,e'
OUTPUT: ["a", "b", "c", "d", "e"]
CASE 2: 'a,b,"c,d", e'
OUTPUT: ["a", "b", "c,d", "e"]
CASE 3: 'a,,"c,d", e'
OUTPUT: ["a", "", "c,d", "e"]
RegEx that I tried: (".*?"|[^",]+)(?=\s*,|\s*$)
RegEx Link: https://regex101.com/r/xImG4i/1
This regex works for CASE 1 and CASE 2 but fails for CASE 3. Instead, it works when there is a space between the two commas, as in 'a, ,"c,d", e', giving the output ["a", " ", "c,d", "e"]. That is also fine, but it needs to work for CASE 3 as well.
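For reference, a minimal sketch that reproduces the CASE 3 failure with the attempted regex. The variable names `attempt`, `input`, and `tokens` are illustrative and not part of the original post.

```javascript
// The regex from the question, with the global flag so match() returns every token.
const attempt = /(".*?"|[^",]+)(?=\s*,|\s*$)/g;

// CASE 3: an empty field between the first two commas.
const input = 'a,,"c,d", e';
const tokens = input.match(attempt);

// Both alternatives need at least one character, so the empty field
// between the two commas produces no match and only three tokens come back.
console.log(tokens.length, tokens);
```

Both branches of the alternation consume at least one character, which is why an empty field can never be matched by this pattern.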
Thanks in advance.
If lookbehind is supported, you can also match the optional whitespace characters between two commas.
"[^"]*"|[^\s,'"]+(?:\s+[^\s,'"]+)*|(?<=,)\s*(?=,)
// Alternatives: a quoted field, an unquoted field, or the (possibly empty) gap between two commas.
const regex = /"[^"]*"|[^\s,'"]+(?:\s+[^\s,'"]+)*|(?<=,)\s*(?=,)/g;
[
  `'a,b,c,d,e'`,
  `'a,b,"c,d", e'`,
  `'a,,"c,d", e'`,
  ` xz a,, b, c, "d, e, f", g, h`,
  `'a,,"c,d", e'`,
].forEach(s => console.log(s.match(regex)));
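Because the pattern depends on lookbehind, it may help to check for support before using it. The following is a minimal sketch; `supportsLookbehind` is a hypothetical helper name, and it relies on the assumption that an engine without lookbehind throws a SyntaxError when the pattern is compiled.

```javascript
// Hypothetical helper: returns true if this engine can compile a lookbehind assertion.
function supportsLookbehind() {
  try {
    // Built from a string so an unsupported engine fails here at runtime,
    // not while parsing the whole script.
    new RegExp('(?<=,)x');
    return true;
  } catch (err) {
    return false;
  }
}

console.log(supportsLookbehind());
```

If the check returns false, a split-based pattern that uses only a lookahead is an option that avoids lookbehind entirely.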
If you don't want the double quotes in the result, you can use a capture group with matchAll and check for the group in the callback.
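// Group 1 captures the content of a quoted field without its surrounding quotes.
const regex = /"([^"]*)"|[^\s,'"]+(?:\s+[^\s,'"]+)*|(?<=,)\s*(?=,)/g;
[
  `'a,b,c,d,e'`,
  `'a,b,"c,d", e'`,
  `'a,,"c,d", e'`,
  ` xz a,, b, c, "d, e, f", g, h`,
  `'a,,"c,d", e'`,
].forEach(s =>
  // Use ?? rather than a truthiness check so an empty quoted field ("") still yields group 1.
  console.log(Array.from(s.matchAll(regex), m => m[1] ?? m[0]))
);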
An alternate solution that uses a regex for splitting instead of matching:
/,\s*(?=(?:(?:[^"]*"){2})*[^"]*$)/
This regex splits on a comma followed by optional whitespace, but only when the comma is outside double quotes: the lookahead asserts that an even number of double quotes remains between the comma (plus whitespace) and the end of the string.
Code Sample:
const re = /,\s*(?=(?:(?:[^"]*"){2})*[^"]*$)/;
[
  `a,b,"c,d", e`,
  `a,,"c,d", e`,
  ` xz a,, b, c, "d, e, f", g, h`,
  `a,,"c,d", e`,
].forEach(s => {
  // Split on unquoted commas, then strip the leading and trailing double quote from each token.
  const tok = s.split(re).map(e => e.replace(/^"|"$/g, ''));
  console.log(s, '::', tok);
});
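The even-quote-count idea can also be sketched without a regex, as a hand-written scanner that tracks whether the current position is inside quotes. This is a different technique from the lookahead, shown only to illustrate the logic. `splitOutsideQuotes` is a hypothetical name, and the sketch assumes balanced double quotes with no escaped quotes inside fields.

```javascript
// Hypothetical helper (not from the answers above): split on commas that sit
// outside double quotes by tracking quote state instead of using a lookahead.
function splitOutsideQuotes(input) {
  const fields = [];
  let current = '';
  let inQuotes = false;

  for (let i = 0; i < input.length; i++) {
    const ch = input[i];
    if (ch === '"') {
      // Toggle on every double quote; the quote character itself is dropped.
      inQuotes = !inQuotes;
    } else if (ch === ',' && !inQuotes) {
      fields.push(current);
      current = '';
      // Skip whitespace after the separator, mirroring the \s* in the regex.
      while (i + 1 < input.length && /\s/.test(input[i + 1])) i++;
    } else {
      current += ch;
    }
  }
  fields.push(current); // the last field has no trailing comma
  return fields;
}

console.log(splitOutsideQuotes('a,b,"c,d", e')); // ["a", "b", "c,d", "e"]
console.log(splitOutsideQuotes('a,,"c,d", e'));  // ["a", "", "c,d", "e"]
```

Unlike the regex version, this sketch drops every double quote rather than only the leading and trailing one, and it makes a single pass over the string instead of rescanning to the end at each comma.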