JavaScript regex capture groups not working

I am trying to capture one or two pieces of information per line. On regexr my expression works and captures as it should, but when I run it, it captures from only a single string (using the same data as in regexr) and returns null for the rest.

I have tried building the expression here.

When I switch to the JS flavor, the color overlays suggest the capturing groups are not working, but the side pane shows them working correctly. Even the simplest capturing group seems not to work.

What am I missing?

Input is:

<@U0BUPU9QQ> 49
50
<@U0BUPU9QQ>
<@U0BUPU9QQ> noget 49 noget andet tekst 5 40
<@U0BUPU9QQ> noget andet tekst 5 40
<@U0BUPU9QQ|mn> has joined the channel

Output:

Should be the ID inside the <> (without the @) and the last group of digits in the line; if there is no ID, then only the digits.

Ignore the group highlighting on regex101 for the JS flavor: if the groups appear in the MATCH INFORMATION pane on the right, they are matched and captured correctly.

In JS, here is code that fetches the capture groups (note that m[1] is the text of the first capture group, m[2] is the text of the second, and so on):

    var re = /^(?:<@([A-Z0-9]+)>)?.*\b([0-9]+)/gm;
    var str = '<@U0BUPU9QQ> 49\n50\n<@U0BUPU9QQ>\n<@U0BUPU9QQ> noget 49 noget andet tekst 5 40\n<@U0BUPU9QQ> noget andet tekst 5 40\n<@U0BUPU9QQ|mn> has joined the channel';
    var m;
    // With the g flag, exec() resumes from re.lastIndex, so the loop visits each match in turn.
    while ((m = re.exec(str)) !== null) {
        document.write(m[1] + "<br/>" + m[2] + "<br/><br/>");
    }
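The loop above writes its results with document.write, which only exists in a browser. As a rough sketch of the same idea for other environments, the matches can be collected into an array with String.prototype.matchAll. The helper name extractIdAndNumber and the { id, number } result shape are assumptions made for illustration; they are not part of the original answer.

```javascript
// Hypothetical helper: returns one { id, number } object per matching line.
function extractIdAndNumber(text) {
  const re = /^(?:<@([A-Z0-9]+)>)?.*\b([0-9]+)/gm;
  const results = [];
  for (const m of text.matchAll(re)) {
    // m[1] is undefined when the optional <@ID> prefix did not take part in the match.
    results.push({ id: m[1] ?? null, number: m[2] });
  }
  return results;
}

const input = [
  '<@U0BUPU9QQ> 49',
  '50',
  '<@U0BUPU9QQ>',
  '<@U0BUPU9QQ> noget 49 noget andet tekst 5 40',
  '<@U0BUPU9QQ> noget andet tekst 5 40',
  '<@U0BUPU9QQ|mn> has joined the channel',
].join('\n');

console.log(extractIdAndNumber(input));
```

On the sample input this yields four entries (49, 50, 40 and 40); the second has a null id because the line 50 has no <@...> prefix.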

Notes on the regex itself:

  • ^ - Start matching at the beginning of a line (due to the m modifier)
  • (?:<@([A-Z0-9]+)>)? - an optional (due to the ? quantifier) non-capturing group matching
    • <@ - the literal characters <@
    • ([A-Z0-9]+) - (Capture group 1) 1 or more uppercase letters or digits
    • > - a closing angle bracket
  • .* - 0 or more characters other than a newline (as many as possible)
  • \b([0-9]+) - (Capture group 2) 1 or more digits preceded by a word boundary
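To see how these pieces interact, each sample line can be tested on its own. This is a small sketch with illustrative variable names (lineRe, lines, matched); it drops the g and m flags because every string is a single line. Two of the lines do not match at all: the only digits in <@U0BUPU9QQ> and <@U0BUPU9QQ|mn> has joined the channel sit inside the ID, directly after a letter, so \b cannot match in front of them.

```javascript
// Same pattern as in the answer, without g/m since each string is one line.
const lineRe = /^(?:<@([A-Z0-9]+)>)?.*\b([0-9]+)/;

const lines = [
  '<@U0BUPU9QQ> 49',
  '50',
  '<@U0BUPU9QQ>',
  '<@U0BUPU9QQ> noget 49 noget andet tekst 5 40',
  '<@U0BUPU9QQ> noget andet tekst 5 40',
  '<@U0BUPU9QQ|mn> has joined the channel',
];

// true where the line contains a digit run that starts at a word boundary.
const matched = lines.map((line) => lineRe.test(line));

lines.forEach((line, i) => {
  const m = lineRe.exec(line);
  console.log(JSON.stringify(line), '->', matched[i] ? [m[1], m[2]] : 'no match');
});
```

If lines that carry an ID but no trailing number should still be reported, the pattern would need adjusting, for example by making the digit part optional.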

You can adjust the regex to your requirements. As written, it matches the ID (the characters inside the optional <@...>) and the last digit sequence on a line. If you need the first digit sequence, use lazy matching (.*?) instead of the greedy .*.
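The difference between the greedy and lazy variants can be checked on one of the sample lines. This is a minimal sketch; the names greedy, lazy, lastNumber and firstNumber are chosen here for illustration.

```javascript
const line = '<@U0BUPU9QQ> noget 49 noget andet tekst 5 40';

// Greedy .* runs to the end of the line and backtracks, so group 2 is the last digit run.
const greedy = /^(?:<@([A-Z0-9]+)>)?.*\b([0-9]+)/;
// Lazy .*? grows one character at a time, so group 2 is the first digit run.
const lazy = /^(?:<@([A-Z0-9]+)>)?.*?\b([0-9]+)/;

const lastNumber = greedy.exec(line)[2];
const firstNumber = lazy.exec(line)[2];

console.log(lastNumber);  // 40
console.log(firstNumber); // 49
```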

Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you repost this content, please credit this site or the original URL.
