JavaScript regex: splitting words in a comma-separated string
I am trying to split a comma-separated string using a regex.
var a = 'hi,mr.007,bond,12:25PM'; //there are no white spaces between commas
var b = /(\S+?),(?=\S|$)/g;
b.exec(a); // does not catch the last item.
Any suggestions for catching all the items?
Use a negated character class:
/([^,]+)/g
will match runs of non-comma characters.
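A minimal sketch, assuming the goal is to stay with exec as in the question: with the g flag, each exec call returns a single match and resumes from the regex's lastIndex, so it has to be called in a loop to see every item. The names input, nonComma, and items are illustrative only.

```javascript
// Collect every item by calling exec repeatedly on a global regex.
const input = 'hi,mr.007,bond,12:25PM';
const nonComma = /([^,]+)/g; // the g flag makes exec resume at lastIndex

const items = [];
let found;
while ((found = nonComma.exec(input)) !== null) {
  items.push(found[1]); // found[1] is the captured run of non-comma characters
}

console.log(items); // ["hi", "mr.007", "bond", "12:25PM"]
```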
< a = 'hi,mr.007,bond,12:25PM'
> "hi,mr.007,bond,12:25PM"
< b=/([^,]+)/g
> /([^,]+)/g
< a.match(b)
> ["hi", "mr.007", "bond", "12:25PM"]
Why not just use .split?
>'hi,mr.007,bond,12:25PM'.split(',')
["hi", "mr.007", "bond", "12:25PM"]
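One difference may matter when choosing between split and a [^,]+ match: split keeps empty fields, while the + quantifier needs at least one character and so skips them. A small sketch with a made-up input (withGap, viaSplit, and viaMatch are illustrative names):

```javascript
// Compare how split and a negated-class match treat an empty field.
const withGap = 'hi,,bond'; // hypothetical input with an empty middle field

const viaSplit = withGap.split(',');      // keeps the empty string between the commas
const viaMatch = withGap.match(/[^,]+/g); // skips it: + requires one or more characters

console.log(viaSplit); // ["hi", "", "bond"]
console.log(viaMatch); // ["hi", "bond"]
```

If field positions matter, split is probably the safer choice.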
If you must use regex for some reason:
str.match(/(\S+?)(?:,|$)/g)
["hi,", "mr.007,", "bond,", "12:25PM"]
(note the inclusion of commas).
If you are parsing a CSV file, some of your values may have double quotes around them, so you may need something a little more complicated. For example:
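If the trailing commas are unwanted, one option is to read the capture group instead of the whole match. A sketch, assuming a runtime with String.prototype.matchAll (ES2020); str is defined here because the snippet above leaves it implicit, and withoutCommas is an illustrative name.

```javascript
// Keep only the captured group (the item) and drop the delimiter.
const str = 'hi,mr.007,bond,12:25PM';

// (\S+?) lazily captures the item; (?:,|$) consumes the comma or the end of input
const withoutCommas = Array.from(
  str.matchAll(/(\S+?)(?:,|$)/g),
  (match) => match[1]
);

console.log(withoutCommas); // ["hi", "mr.007", "bond", "12:25PM"]
```

Unlike the pattern in the question, this one accepts the end of the string in place of a comma, which is why the last item is caught.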
Pattern splitCommas = java.util.regex.Pattern.compile("(?:^|,)((?:[^\",]|\"[^\"]*\")*)");
Matcher m = splitCommas.matcher("11,=\"12,345\",ABC,,JKL");
while (m.find()) {
System.out.println( m.group(1));
}
or in Groovy:
java.util.regex.Pattern.compile('(?:^|,)((?:[^",]|"[^"]*")*)')
.matcher("11,=\"12,345\",ABC,,JKL")
.iterator()
.collect { it[1] }
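This code handles, as the sample input shows, an empty column (the ,, between ABC and JKL) and a value containing double quotes with a comma inside them (="12,345").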
The pattern consists of:
(?:^|,)
matches the start of the line or the comma after the previous column, but does not add that to the group
((?:[^",]|"[^"]*")*)
matches the value of the column, and consists of:
[^",]
is a character that's not a comma or a quote
"[^"]*"
is a double quote followed by zero or more other characters, ending in another double quote
(?:[^",]|"[^"]*")
those two are or-ed together, using a non-collecting group
(?:[^",]|"[^"]*")*
the * repeats the above any number of times
((?:[^",]|"[^"]*")*)
a collecting group around the whole thing, which collects the column's value
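Since the question is about JavaScript, the same pattern appears to carry over unchanged, because it only uses syntax common to both regex flavors. A sketch, assuming an ES2020 runtime for matchAll; csvField, line, and columns are illustrative names, and the input is the sample from the Java example.

```javascript
// JavaScript port of the CSV column pattern: group 1 holds each column's value.
const csvField = /(?:^|,)((?:[^",]|"[^"]*")*)/g;
const line = '11,="12,345",ABC,,JKL';

const columns = Array.from(line.matchAll(csvField), (match) => match[1]);

console.log(columns); // ["11", "=\"12,345\"", "ABC", "", "JKL"]
```

The comma inside the quotes stays part of the second column, and the empty fourth column comes back as an empty string.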
Handling escaped double quotes is left as an exercise for the reader.
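One possible take on that exercise, offered as a sketch rather than a full CSV parser. It assumes the common CSV convention (as in RFC 4180) that a double quote inside a quoted value is written as two double quotes. Under that assumption the pattern already spans such a value, because "a""b" reads as two adjacent quoted runs, so only the unwrapping step is new. parseCsvLine is a hypothetical helper name.

```javascript
// Hypothetical helper: split one CSV line and unescape fully quoted values.
function parseCsvLine(line) {
  const field = /(?:^|,)((?:[^",]|"[^"]*")*)/g;
  return Array.from(line.matchAll(field), (match) => {
    const raw = match[1];
    // Only unwrap values that are quoted from the first character to the last
    const quoted = raw.length >= 2 && raw.startsWith('"') && raw.endsWith('"');
    // Strip the outer quotes, then turn each doubled quote into a single one
    return quoted ? raw.slice(1, -1).replace(/""/g, '"') : raw;
  });
}

console.log(parseCsvLine('a,"he said ""hi"", ok",b'));
// ["a", "he said \"hi\", ok", "b"]
```

Values that are only partly quoted, such as ="12,345", are returned untouched. Quoted values that span several lines are outside what this sketch covers.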