
JavaScript regular expression - removing comments

This example is from the book Eloquent JavaScript. There is a short explanation in the book, but it is really hard to follow. Can anyone explain it from a beginner's perspective? I am having a hard time working out which slash is for what.

function stripComments(code) {
  return code.replace(/\/\/.*|\/\*[^]*\*\//g, "");
}

Comments can take two forms:

// this is a comment
/* this is a comment */

Unfortunately, both / and * have a special meaning here: / delimits a regex literal and * is a quantifier, so both must be escaped with a backslash to be matched literally.

So we start with an empty match expression (schematic only: in real JavaScript source, //g would be read as a line comment):

//g

We set it to match the first form: // followed by any number of characters. That would be //.* except that the slashes have to be escaped:

/\/\/.*/g
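As a small illustration of this first pattern on its own, the sketch below runs it against a two-line string. The variable names and the sample text are made up for the example.

```javascript
// The first alternative on its own: two escaped slashes, then the rest of the line.
const lineComment = /\/\/.*/g;

// Hypothetical sample input with one line comment per line.
const sample = "let x = 1; // set x\nlet y = 2; // set y";

// "." does not match line terminators, so each match stops at the end of its line.
const found = sample.match(lineComment);
console.log(found); // [ '// set x', '// set y' ]
```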

The other form, /* followed by anything followed by */, would be /*[^]**/, but we have to escape the literal slashes and asterisks:

\/\*[^]*\*\/
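To see why [^] is used here rather than a dot, the following sketch wraps this second pattern in delimiters and compares it with a dot-based variant. The names and the sample string are assumptions made for the example.

```javascript
// The second alternative on its own, wrapped in delimiters.
const blockComment = /\/\*[^]*\*\//;

// Hypothetical sample input: a block comment that spans two lines.
const source = "a = 1; /* first line\nsecond line */ b = 2;";

// [^] matches any character, including "\n", so the match spans both lines.
const hit = source.match(blockComment)[0];
console.log(JSON.stringify(hit)); // "/* first line\nsecond line */"

// For comparison, "." stops at the line break, so this variant finds nothing.
const dotVersion = /\/\*.*\*\//;
console.log(dotVersion.test(source)); // false
```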

The two forms are then combined with a | character, which denotes "or":

\/\/.*|\/\*[^]*\*\/

and inserted into the empty regex:

/\/\/.*|\/\*[^]*\*\//g

The first and last slashes are delimiters.
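One way to see which slashes are delimiters is to build the same pattern with the RegExp constructor, where there are no delimiters at all. This is only a sketch; the variable names and the input string are invented for the example.

```javascript
// Regex literal: the outer slashes are delimiters, so inner slashes are written "\/".
const literal = /\/\/.*|\/\*[^]*\*\//g;

// Constructor form: no delimiters, so "/" is written plainly.
// The backslashes are doubled only because this is a string literal.
const built = new RegExp("//.*|/\\*[^]*\\*/", "g");

// Hypothetical input containing one comment of each form.
const input = "a(); // one\nb(); /* two */";

console.log(JSON.stringify(input.replace(literal, ""))); // "a(); \nb(); "
console.log(input.replace(built, "") === input.replace(literal, "")); // true
```

Only the asterisks still need escaping in the constructor form, because * is a quantifier in the regex syntax itself.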

g at the end is a modifier (flag). Modifiers enable options such as case-insensitive (i) and global (g) searching; g performs a global match, finding all matches rather than stopping after the first.
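The effect of the g modifier can be sketched by running the same replacement with and without it. The input string and variable names are assumptions for the example.

```javascript
// Hypothetical input with two line comments.
const code = "a(); // first\nb(); // second";

// Without g, replace stops after the first match.
const withoutG = code.replace(/\/\/.*/, "");
console.log(JSON.stringify(withoutG)); // "a(); \nb(); // second"

// With g, every match is replaced.
const withG = code.replace(/\/\/.*/g, "");
console.log(JSON.stringify(withG)); // "a(); \nb(); "
```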

| means OR.

\/\/.* has some escaped characters and can be read as // followed by any characters up to the end of the line.

\/\*[^]*\*\/ also has some escaped characters and can be read as /*any characters*/.

Note: both / and * must be escaped because they have a special meaning in the regex syntax. So \/ means /, \* means *, and .* means any character, zero or more times.

Since the goal of the code is to remove comments, every comment like // xxxx or /* xxx */ is replaced by an empty string.
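A quick way to check this is to call the function from the question on a couple of short inputs. The function body is copied from the question; the input strings are chosen only for illustration.

```javascript
// Copied from the question.
function stripComments(code) {
  return code.replace(/\/\/.*|\/\*[^]*\*\//g, "");
}

// A block comment in the middle of an expression is removed.
console.log(stripComments("1 + /* 2 */3"));   // 1 + 3

// A line comment is removed up to the end of the line.
console.log(stripComments("x = 10;// ten!")); // x = 10;
```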

/ --> start of regex

\/ --> escaped "/" character

\/ --> escaped "/" character

.* --> any characters except line breaks (possibly none) --> this covers the case // abck

| --> OR

\/ --> escaped "/" character

\* --> escaped "*" character

[^]* --> any characters (across lines, so even \n and \r)

\* --> escaped "*" character

\/ --> escaped "/" character --> this covers the case /* aasd\nasdasd */

/ --> end of regex

g --> global modifier

Let's break it down with one token per line:

/    # Start a new regex

# This group of tokens matches comments in the form:
# // this is a comment

\/   # An escaped forward slash
\/   # An escaped forward slash
.*   # Any character except a line break, zero or more times

|    # OR. This means "match either the previous or the next group of tokens".

# This group of tokens matches comments in the form:
# /* 
#  This is a comment, which could include some new lines
# */

\/   # An escaped forward slash
\*   # An escaped asterisk
[^]* # Any character at all (including newlines), zero or more times
\*   # An escaped asterisk
\/   # An escaped forward slash

/    # Finish the current regex.
g    # This regex can match multiple times against a given input
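One consequence of the [^]* token is worth seeing in action: it is greedy, so with two block comments in the input it runs to the last */ and also removes the code between them. The sketch below shows this, followed by a non-greedy variant using [^]*?. The name stripCommentsLazy is hypothetical and not part of the question.

```javascript
// Copied from the question.
function stripComments(code) {
  return code.replace(/\/\/.*|\/\*[^]*\*\//g, "");
}

// [^]* is greedy: it runs to the LAST "*/" in the input, so the "+" is lost.
console.log(stripComments("1 /* a */+/* b */ 1")); // 1  1

// Assumption for illustration: a non-greedy variant ([^]*?) that stops at the first "*/".
function stripCommentsLazy(code) {
  return code.replace(/\/\/.*|\/\*[^]*?\*\//g, "");
}

console.log(stripCommentsLazy("1 /* a */+/* b */ 1")); // 1 + 1
```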
