Splitting string by regular expression

I have the following code snippet:

var colorText = "red,blue,green,yellow";
var colors3 = colorText.split(/[^\,]+/);
alert(colors3); // ["", ",", ",", ",", ""]

I don't understand what's going on here. As far as I understand, the regular expression will match any commas at the beginning of a string, and it matches 1 or more of them. What happens when we provide this regular expression as the argument to split? Surely, if we just tried to match the regex against colorText, we'd get no match, because the starting character is not a comma. But how does the regex provided to split lead to an array of commas with an empty string on each side?

Why do you need a regex when you can simply do split(',')?

var colorText = "red,blue,green,yellow";
var colors3 = colorText.split(',');
console.log(colors3);

If you want to select everything but the commas, then using match may be a better idea.

var colorText = ",red,blue,green,yellow";
var colors3 = colorText.match(/[^\,]+/g);
console.log(colors3);

As explained in the MDN web docs, [^xyz] is:

A negated or complemented character set. That is, it matches anything that is not enclosed in the brackets.

Your regex /[^\,]+/ will match any sequence of one or more characters that contains no comma.

So your regex will match these sequences in colorText:

  • red
  • blue
  • green
  • yellow

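and the split function will split colorText at those sequences, treating each one as a separator and discarding it. What remains is the text between the matches: the empty string before red, the three commas, and the empty string after yellow.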
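To make that concrete, here is a minimal sketch that lists where the pattern matches and then shows what split leaves behind. The variable names `pattern`, `separators` and `pieces` are only illustrative, and the `g` flag is added because `matchAll` requires it.

```javascript
const colorText = "red,blue,green,yellow";
const pattern = /[^\,]+/g; // same character class as in the question, plus the g flag

// Each match is a run of non-comma characters; split() uses these runs as separators.
const separators = [...colorText.matchAll(pattern)].map(m => `${m[0]}@${m.index}`);
console.log(separators); // ["red@0", "blue@4", "green@9", "yellow@15"]

// Everything that is NOT matched ends up in the result, including the
// empty strings before the first match and after the last one.
const pieces = colorText.split(/[^\,]+/);
console.log(pieces); // ["", ",", ",", ",", ""]
```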

However, if you want to split your string at each comma, use this:

colors = colorText.split(',');

If you would like to avoid empty items when splitting, you could use String#match instead of String#split, with a regular expression that matches all characters except commas.

var regex = /[^,]+/g;
console.log(",red,blue,green,yellow,".match(regex));
console.log("red,blue,green,yellow".match(regex));

So, my goal was not to separate the words in the string by comma. I found this code in a book and wanted to understand it. The mistake I made was that I thought that the ^ matched the beginning of a string, while in fact it means "anything but" inside of square brackets. Now I understand that the regular expression matches one or more characters that are not a comma, and that's what tells split() where to cut, so each list element is whatever lies between the matches. The first and last elements are empty strings because that's what is at the left and right side of the first and last words, respectively.
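The two meanings of ^ described above can be compared side by side. This is a small sketch; `anchored` and `negated` are hypothetical names chosen for the illustration.

```javascript
const colorText = "red,blue,green,yellow";

// Outside brackets, ^ anchors the match to the start of the input.
const anchored = /^,+/;
console.log(anchored.test(colorText));   // false: the string starts with "r"
console.log(colorText.split(anchored));  // ["red,blue,green,yellow"] (nothing to split on)

// Inside brackets, a leading ^ negates the class: "any character except a comma".
const negated = /[^,]+/;
console.log(colorText.match(negated)[0]); // "red"
```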

You have to remove the caret ^ from var colors3 = colorText.split(/[^\,]+/); so that it splits on the commas:

var colorText = "red,blue,green,yellow";
var colors3 = colorText.split(/[\,]+/);
console.log(colors3);
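For the string in the question, a comma class followed by + gives the same result as split(','). The two differ only when commas repeat, as this sketch shows. The input `messy` is a made-up example that does not come from the thread.

```javascript
// Hypothetical input with repeated commas, chosen to show the difference.
const messy = "red,,blue,green,,,yellow";

const byString = messy.split(",");  // every single comma is a separator
const byRegex = messy.split(/,+/);  // a run of commas counts as one separator

console.log(byString); // ["red", "", "blue", "green", "", "", "yellow"]
console.log(byRegex);  // ["red", "blue", "green", "yellow"]
```

Leading or trailing commas still produce empty strings at the ends with either form.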
