
Javascript multiline regex lookahead OR the rest of string / very end of string

I'm having trouble capturing all groups in the following markdown text:

### AAA
text for group AAA

### BBB
text for group BBB

### CCC
text for group CCC

The regex I'm currently using is:

/(^###\s[\s\S]*?(?=^###\s))/gm

It uses a positive lookahead to know when to cut off each group. As a result, though, it always fails to capture the last group (there's no lookahead match).
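To make the failure concrete, here is a minimal sketch that runs the pattern above against the sample text. The variable names (`sampleText`, `lookaheadOnly`, `found`) are illustrative and not part of the original question.

```javascript
// Sample input mirroring the question's markdown (names here are illustrative).
const sampleText = [
  "### AAA",
  "text for group AAA",
  "",
  "### BBB",
  "text for group BBB",
  "",
  "### CCC",
  "text for group CCC",
].join("\n");

// The question's pattern: each group must be followed by another "### " line.
const lookaheadOnly = /(^###\s[\s\S]*?(?=^###\s))/gm;

// match() with the g flag returns every full match, or null if there are none.
const found = sampleText.match(lookaheadOnly) || [];

console.log(found.length); // 2
console.log(found.map((g) => g.split("\n")[0])); // [ '### AAA', '### BBB' ]
```

The CCC group is never returned because nothing after it satisfies `(?=^###\s)`, so the lazy `[\s\S]*?` runs to the end of the input and the attempt fails.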

The newline characters won't be consistent, so I can't rely on those, and JavaScript regex doesn't have \Z (which I believe is "very end of string" in other languages, as opposed to $, which is end of line in multiline mode).

How can I capture all groups here?

You can use a negative lookahead instead of a positive one: the content of a group is then every character that does not start a ### sequence.

^###\s((?!###)[\s\S])*
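A quick sketch of how the negative-lookahead pattern behaves, assuming the `g` and `m` flags from the question. One caveat worth checking against your data: `(?!###)` stops at any `###`, including one in the middle of a line. The `lineAnchored` variant below is a hypothetical adjustment, not part of the answer, that only treats a `### ` at the start of a line as a boundary.

```javascript
// The answer's pattern, with the g and m flags assumed from the question.
const tempered = /^###\s((?!###)[\s\S])*/gm;

// Hypothetical variant (not from the answer): only "### " at a line start ends a group.
const lineAnchored = /^###\s(?:(?!^###\s)[\s\S])*/gm;

const plainDoc =
  "### AAA\ntext for group AAA\n\n### BBB\ntext for group BBB\n\n### CCC\ntext for group CCC";
const inlineDoc = "### AAA\nuses ### inline\n### BBB\nmore";

// All three groups are found, including the last one.
const plainGroups = plainDoc.match(tempered) || [];
console.log(plainGroups.length); // 3

// A "###" inside the body cuts the first group short...
const cutShort = inlineDoc.match(tempered) || [];
console.log(JSON.stringify(cutShort[0])); // "### AAA\nuses "

// ...while the line-anchored variant keeps the whole body.
const keptWhole = inlineDoc.match(lineAnchored) || [];
console.log(JSON.stringify(keptWhole[0])); // "### AAA\nuses ### inline\n"
```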

Try it on regex101!

JavaScript regex has no \z construct. Instead, you can assert end of string (EOS) with a negative lookahead for any character, (?![\S\s]), placed as an alternative inside your existing lookahead.

(^###[^\S\r\n][\S\s]*?(?=^###[^\S\r\n]|(?![\S\s])))

Formatted:

 (                             # (1 start)
      ^ \#\#\# [^\S\r\n] 
      [\S\s]*? 
      (?=
           ^ \#\#\# [^\S\r\n]  
        |  (?! [\S\s] )
      )
 )                             # (1 end)
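The formatted layout above is for reading only, since JavaScript has no free-spacing mode. As a minimal sketch, the compact form can be used as a regex literal with the `g` and `m` flags. The sample input deliberately mixes `\r\n`, `\r`, and `\n` to reflect the inconsistent newlines mentioned in the question; the variable names are illustrative.

```javascript
// Compact form of the pattern: a group ends at the next "### " line or at end of string.
const withEndOfString = /(^###[^\S\r\n][\S\s]*?(?=^###[^\S\r\n]|(?![\S\s])))/gm;

// Illustrative input with mixed line endings and a trailing newline.
const mixedNewlines =
  "### AAA\r\ntext for group AAA\r\n\r\n" +
  "### BBB\rtext for group BBB\n" +
  "### CCC\ntext for group CCC\n";

const sections = mixedNewlines.match(withEndOfString) || [];

// First line of each section, splitting on any of the three newline styles.
const headings = sections.map((s) => s.split(/\r\n|\r|\n/)[0]);

console.log(headings); // [ '### AAA', '### BBB', '### CCC' ]
```

Here `[^\S\r\n]` matches whitespace other than a carriage return or line feed, so the heading marker must be followed by a space or tab on the same line, and `(?![\S\s])` succeeds only where no character remains, which is what lets the last group match.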
