
Is it impossible to indicate the start of the input string using a multiline regex pattern in JavaScript?

In JavaScript, the multiline flag changes the meaning of ^ and $:

Treat beginning and end characters (^ and $) as working over multiple lines (i.e., match the beginning or end of each line (delimited by \n or \r), not only the very beginning or end of the whole input string)

Well, with those characters out of the picture, are there any other ways to mark the start of the input string? How can I indicate that I want my multiline pattern to match only at the start of the input string?

I know I can check the index of the match using the index property of the return value of exec, but is there a way to prevent the regex engine from searching the entire string in the first place?

No, ^ is the only built-in assertion that tests for beginning-of-input and, as you said, its behavior is modified by the multiline flag.

You could put some unique string at the beginning of the string you're testing and also at the beginning of your regular expression, but that's a bit of a hack. In the normal case, as you said, you'd test the index of the returned match.
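The index test mentioned above could look roughly like this. It is only a sketch: the helper name matchAtStart and the sample pattern are made up for illustration, and the engine still scans the whole string before the result is filtered.

```javascript
// Hypothetical helper: returns the match only if it begins at index 0.
function matchAtStart(regex, text) {
  // Work on a copy without g/y so lastIndex state never affects the result.
  const flags = regex.flags.replace(/[gy]/g, '');
  const match = new RegExp(regex.source, flags).exec(text);
  return match !== null && match.index === 0 ? match : null;
}

// With the m flag, ^ and $ match at every line boundary.
const pattern = /^Line \d$/m;

console.log(matchAtStart(pattern, 'Line 1\nLine 2')); // match found at index 0
console.log(matchAtStart(pattern, 'intro\nLine 2'));  // null: the match starts at index 6
```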

No, JavaScript doesn't support the absolute anchors (\A, \Z and \z) like most other flavors do. But are you sure you need them? The beginning of the string is usually the only place where you must use an anchor. Each subsequent line is automatically "anchored" by the newline preceding it.

I suggest you drop the multiline flag and make sure you explicitly consume all the newlines. I know that's kinda vague; if you were to supply a code sample and/or tell us what problem you're trying to solve, we might be able to do better.
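As a rough illustration of that suggestion, here is a sketch that matches a two-line header without the m flag. The "name:/version:" input format is invented for the example; the point is only that ^ stays tied to the start of the input while the line breaks are consumed explicitly.

```javascript
// No m flag: ^ can only match at the very start of the input.
// Line breaks are matched explicitly with \r?\n instead of relying on ^ and $.
const header = /^name: (\w+)\r?\nversion: (\d+)(?:\r?\n|$)/;

const ok = header.exec('name: demo\nversion: 3\nbody');
console.log(ok && ok.slice(1)); // [ 'demo', '3' ]

// The same two lines appear here, but not at the start of the input.
const shifted = header.exec('junk\nname: demo\nversion: 3\n');
console.log(shifted); // null
```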

In 2021, with the growing adoption of the ECMAScript 2018+ standard, which supports the lookbehind construct in a RegExp, you can now use an "absolute", unambiguous start-of-string position in a multiline mode regex if you use

/^(?<![^])/gm

Although the g flag allows multiple matching and m allows ^ to match any line start position, the ^(?<![^]) pattern only matches

  • ^ - start of a line that
  • (?<![^]) - has no char immediately to the left of it. [^] matches any char in an ECMAScript regex.

Alternatives:

/^(?<![\w\W])/gm
/^(?<!.)/gsm

where [\w\W] matches any char including line breaks (a construct synonymous with [\s\S] and [\d\D]), and the s flag makes . match line breaks as well.

See the JavaScript demo:

const regex = /^(?<![^])/gm; // g - multiple matching enabled!
const text = 'Line 1\nLine 2\nLine 3';
console.log(text.replace(regex, 'INSERTION...')); // Only the first line affected
// => INSERTION...Line 1
// Line 2
// Line 3
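As a quick sanity check, the three variants can be compared against a plain ^ under the same flags. This is a sketch that assumes a runtime with lookbehind and String.prototype.matchAll support; the variable names are illustrative.

```javascript
const variants = [/^(?<![^])/gm, /^(?<![\w\W])/gm, /^(?<!.)/gsm];
const sample = 'Line 1\nLine 2\nLine 3';

// Count how many positions each pattern matches in the sample.
const counts = variants.map((re) => [...sample.matchAll(re)].length);
console.log(counts); // [ 1, 1, 1 ]

// A plain ^ with the m flag matches at the start of every line.
const plainCount = [...sample.matchAll(/^/gm)].length;
console.log(plainCount); // 3
```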
