
JavaScript RegExp: loop over all matches

I'm trying to do something similar to Stack Overflow's rich text editor. Given this text:

[Text Example][1]

[1][http://www.example.com]

I want to loop over each [string][int] that is found, which I do this way:

    var Text = "[Text Example][1]\n[1][http://www.example.com]";
    // Find resource links
    var arrMatch = null;
    var rePattern = new RegExp("\\[(.+?)\\]\\[([0-9]+)\\]", "gi");
    while (arrMatch = rePattern.exec(Text)) {
        console.log("ok");
    }

This works great: it logs 'ok' for each [string][int]. What I need to do, though, is for each match found, replace the initial match with components of the second match.

So in the loop, $2 would represent the int part originally matched, and I would run this regexp (pseudocode):

    while (arrMatch = rePattern.exec(Text)) {
        var FindIndex = arrMatch[2]; // the $2 group; this would be 1 in our example
        var reLink = new RegExp("\\[" + FindIndex + "\\]\\[(.+?)\\]", "g");

        // Replace original match now with hyperlink
    }

This would match:

[1][http://www.example.com]

The end result for the first example would be:

<a href="http://www.example.com" rel="nofollow">Text Example</a>

Edit

I've gotten as far as this now:

    var Text = "[Text Example][1]\n[1][http://www.example.com]";
    // Find resource links
    var reg = new RegExp("\\[(.+?)\\]\\[([0-9]+)\\]", "gi");
    var result;
    while ((result = reg.exec(Text)) !== null) {
        var LinkText = result[1];
        var Match = result[0];
        // Pass the matched string directly: it contains regex metacharacters ([ and ])
        Text = Text.replace(Match, '<a href="#">' + LinkText + '</a>');
    }
    console.log(Text);

I agree with Jason that it'd be faster/safer to use an existing Markdown library, but you're looking for String.prototype.replace (also, use RegExp literals):

    var Text = "[Text Example][1]\n[1][http://www.example.com]";
    var rePattern = /\[(.+?)\]\[([0-9]+)\]/gi;

    console.log(Text.replace(rePattern, function(match, text, urlId) {
        // return an appropriately-formatted link
        return `<a href="${urlId}">${text}</a>`;
    }));

I managed to do it in the end with this:

    var Text = "[Text Example][1]\n[1][http://www.example.com]";
    // Find resource links
    var reg = new RegExp("\\[(.+?)\\]\\[([0-9]+)\\]", "gi");
    var result;
    while (result = reg.exec(Text)) {
        var LinkText = result[1];
        var Match = result[0];
        var LinkID = result[2];
        // Look up the reference definition [LinkID][url] in the same text
        var FoundURL = new RegExp("\\[" + LinkID + "\\]\\[(.+?)\\]", "g").exec(Text);
        Text = Text.replace(Match, '<a href="' + FoundURL[1] + '" rel="nofollow">' + LinkText + '</a>');
    }
    console.log(Text);

Here we're using the exec method. Combined with a while loop, it retrieves every match along with the position of each matched string.

    var input = "A 3 numbers in 333";
    var regExp = /\b(\d+)\b/g, match;
    while (match = regExp.exec(input))
      console.log("Found", match[1], "at", match.index);
    // → Found 3 at 2
    //   Found 333 at 15
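
As a possible alternative to the exec loop above, engines that support ES2020 also provide String.prototype.matchAll, which returns an iterator of match objects with the same groups and index. The following is only a minimal sketch reusing the same input; the `found` array is a name introduced here for illustration.

```javascript
// Sketch: same search as the exec loop, using matchAll instead.
// matchAll requires the g flag and throws a TypeError without it.
const input = "A 3 numbers in 333";
const found = [];
for (const match of input.matchAll(/\b(\d+)\b/g)) {
  // Each item has the same shape as an exec result: groups plus .index
  found.push({ value: match[1], index: match.index });
  console.log("Found", match[1], "at", match.index);
}
```

Unlike exec, matchAll does not depend on the lastIndex state of a shared regex object, which removes one source of subtle bugs when the same pattern is reused.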

Use a back-reference to restrict the match, so that the code matches if your text is:

[Text Example][1]\n[1][http://www.example.com]

and does not match if your text is:

[Text Example][1]\n[2][http://www.example.com]

    var re = /\[(.+?)\]\[([0-9]+)\s*.*\s*\[(\2)\]\[(.+?)\]/gi;
    var str = '[Text Example][1]\n[1][http://www.example.com]';
    var subst = '<a href="$4">$1</a>';

    var result = str.replace(re, subst);
    console.log(result);

\number is used inside a regex to refer back to a captured group, and $number is used in the same way in the replacement string of the replace function to refer to a group's result.
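
To make the distinction concrete, here is a small sketch on an unrelated input (a sentence with doubled words). The input string and the variable names are made up for this illustration.

```javascript
// \1 inside the pattern: must repeat whatever group 1 captured.
const doubled = /\b(\w+) \1\b/g;
// $1 inside the replacement string: inserts what group 1 captured.
const fixedSentence = "this is is a test test".replace(doubled, "$1");
console.log(fixedSentence); // "this is a test"
```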

Another way to iterate over all matches, without relying on the subtleties of exec and match, is to call the string replace function with the regex as the first parameter and a function as the second one. When used like this, the function receives the whole match as the first parameter, the captured groups as the following parameters, and then the match index and the whole input string:

    var text = "[Text Example][1]\n[1][http://www.example.com]";
    // Find resource links
    var rePattern = new RegExp("\\[(.+?)\\]\\[([0-9]+)\\]", "gi");
    text.replace(rePattern, function(match, g1, g2, index){
        // Do whatever
    })

You can even iterate over all groups of each match using the arguments object available inside the callback (when it is not an arrow function), skipping the first entry (the whole match) and the trailing entries (the index and the input string).
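
A minimal sketch of that idea follows, using rest parameters rather than arguments. It assumes the pattern has no named groups, in which case the last two values passed to the callback are the match index and the input string; the names `source`, `pattern` and `seen` are introduced here for illustration.

```javascript
const source = "[Text Example][1]\n[1][http://www.example.com]";
const pattern = /\[(.+?)\]\[([0-9]+)\]/g;
const seen = [];

source.replace(pattern, function (match, ...rest) {
  // With no named groups, rest = [g1, g2, ..., offset, wholeString]
  const groups = rest.slice(0, -2);
  const offset = rest[rest.length - 2];
  seen.push({ match, groups, offset });
  return match; // leave the text unchanged; we only iterate
});

console.log(seen);
```

The return value of replace is discarded here, so this form is only worth using when the callback's side effects are the point.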

This format is based on Markdown. There are several JavaScript ports available. If you don't want the whole syntax, then I recommend borrowing the portions related to links.

I know it's old, but since I stumbled upon this post, I want to straighten things out.

First of all, your way of approaching this problem is too complicated, and when the solution to a supposedly simple problem becomes too complicated, it is time to stop and think about what went wrong. Second, your solution is very inefficient: you first find what you want to replace, and then, for each match, you search the same text again for the referenced link information. So the computational complexity eventually becomes O(n^2).

It is very disappointing to see so many upvotes on something wrong, because people who come here learn mostly from the accepted solution, assume it is a legitimate answer, and use this concept in their projects, which then become very badly implemented products.

The approach to this problem is pretty simple. All you need to do is find all referenced links in the text, save them in a dictionary, and only then search for the placeholders to replace, using the dictionary. That's it. It is so simple! And in this case you get a complexity of just O(n).

So this is how it goes:

    const text = `
    [2][https://en.wikipedia.org/wiki/Scientific_journal][5][https://en.wikipedia.org/wiki/Herpetology]
    The Wells and Wellington affair was a dispute about the publication of three papers in the Australian Journal of [Herpetology][5] in 1983 and 1985. The publication was established in 1981 as a [peer-reviewed][1] [scientific journal][2] focusing on the study of [3][https://en.wikipedia.org/wiki/Amphibian][amphibians][3] and [reptiles][4] ([herpetology][5]). Its first two issues were published under the editorship of Richard W. Wells, a first-year biology student at Australia's University of New England. Wells then ceased communicating with the journal's editorial board for two years before suddenly publishing three papers without peer review in the journal in 1983 and 1985. Coauthored by himself and high school teacher Cliff Ross Wellington, the papers reorganized the taxonomy of all of Australia's and New Zealand's [amphibians][3] and [reptiles][4] and proposed over 700 changes to the binomial nomenclature of the region's herpetofauna.
    [1][https://en.wikipedia.org/wiki/Academic_peer_review]
    [4][https://en.wikipedia.org/wiki/Reptile]
    `;

    const linkRefs = {};
    const linkRefPattern = /\[(?<id>\d+)\]\[(?<link>[^\]]+)\]/g;
    const linkPlaceholderPattern = /\[(?<text>[^\]]+)\]\[(?<refid>\d+)\]/g;

    // The callback receives (match, g1, g2, offset, string, groups);
    // the destructuring skips the first five and keeps the named-groups object.
    const parsedText = text
        // Pass 1: record every [id][url] definition and remove it from the text
        .replace(linkRefPattern, (...[,,,,,ref]) => (linkRefs[ref.id] = ref.link, ''))
        // Pass 2: turn every [text][id] placeholder into a hyperlink
        .replace(linkPlaceholderPattern, (...[,,,,,placeholder]) => `<a href="${linkRefs[placeholder.refid]}">${placeholder.text}</a>`)
        .trim();

    console.log(parsedText);
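
For readers who find the destructuring above terse, the same two-pass idea can be sketched as a small function with positional callback parameters. The name `renderLinks`, the use of a Map, and the choice to leave placeholders with unknown ids untouched are all assumptions made for this illustration, not part of the answer above.

```javascript
// Hypothetical helper: resolve [text][id] placeholders against [id][url] definitions.
function renderLinks(text) {
  const refs = new Map();

  // Pass 1: collect and strip reference definitions such as [1][http://...]
  const withoutRefs = text.replace(/\[(\d+)\]\[([^\]]+)\]/g, (match, id, url) => {
    refs.set(id, url);
    return "";
  });

  // Pass 2: replace placeholders such as [Text Example][1];
  // a placeholder whose id was never defined is left as it was.
  return withoutRefs
    .replace(/\[([^\]]+)\]\[(\d+)\]/g, (match, label, id) =>
      refs.has(id) ? `<a href="${refs.get(id)}">${label}</a>` : match
    )
    .trim();
}

console.log(renderLinks("[Text Example][1]\n[1][http://www.example.com]"));
```

Note that this sketch does no HTML escaping of the label or the URL, so it should not be applied to untrusted input as written.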
