
In JavaScript, I'm having trouble figuring out how to make a regex replace only the capture and not the match

The following function is meant to remove articles (the part of speech) from the text at random. Eventually the percentages will be user-adjustable and the regex more sophisticated (better word-boundary handling, etc.). It does replace, at roughly 50/50, but it also squashes the surrounding spaces, which are matched but not captured. I think I'm being really bone-headed here, but I can't figure out the proper syntax. Can anyone help?

function posArticles(t) {
   var text = t;
   var re = / (a|the|an) /g;    
   var rArray;

   text = text.replace(re, function(_, m) {
       if (Math.floor(Math.random()*101) < 50) return '';
       else return m;
   });

   return text;
}

I realize that this has to do with the positional/optional arguments to the anonymous function, but I can't figure out which is the match, which is the capture, and so forth.

There are numerous ways you could do this, but I think your best bet is to use \b, a zero-width match for a "word boundary". That guarantees that you're getting "the" and not "there" or whatever, but doesn't match the spaces around it.

Thus, use re = /\b([Aa]n?|[Tt]he)\b/g; (keep the g flag from your original pattern so that every occurrence is considered, not just the first).
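A minimal sketch of the word-boundary approach, for illustration only. The function name `dropArticles` and the injected `rng` parameter are assumptions that are not in the original post; `rng` exists only so the random choice can be made deterministic when testing. The `i` flag stands in for the `[Aa]`/`[Tt]` character classes.

```javascript
// Hypothetical helper: `rng` defaults to Math.random but can be replaced in tests.
function dropArticles(text, rng = Math.random) {
  // \b is zero-width, so the match is the article itself and never the spaces.
  const re = /\b(an?|the)\b/gi;
  // Returning '' deletes the article; returning the match keeps it untouched.
  return text.replace(re, (match) => (rng() < 0.5 ? '' : match));
}

// Example call: forcing rng to 0 drops every article, leaving "there" alone.
const sample = dropArticles('the cat sat on a mat there', () => 0);
```

Because the spaces are no longer part of the match, a deleted article leaves both of its neighbouring spaces behind, so you may want to collapse doubled spaces afterwards.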

"I realize that this has to do with the positional/optional arguments to the anonymous function, but I can't figure out which is the match, which is the capture, and so forth."

The first argument passed to your callback function is the whole match (i.e. _ = ' the '). The next arguments are your captured groups (m = 'the'). Whatever the callback returns replaces the whole match, so if you include spaces in your expression, they will be replaced too.
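To see the argument order for yourself, you can record what the callback receives. This is only an illustrative sketch: the `seen` array and the sample string are made up for the demonstration, and the regex is the one from the question.

```javascript
// Collect the arguments the replace callback is called with.
const seen = [];
const input = ' see the cat ';

const output = input.replace(/ (a|the|an) /g, (match, group, offset) => {
  // match  = the whole matched text, spaces included
  // group  = the first capture group, without the spaces
  // offset = index in the input where the match starts
  seen.push({ match, group, offset });
  return match; // returning the whole match leaves the text unchanged
});
```

After the capture groups, the callback also receives the match offset and the full input string.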

if (Math.floor(Math.random()*101) < 50) return ' ';

Return a space instead of an empty string :)
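Putting that together, here is a hedged sketch of the question's function with the space preserved. Note that in the original code the `else` branch returns `m`, which also drops the surrounding spaces, so this version returns the whole match when the article is kept. The name `posArticlesFixed` and the `rng` parameter are assumptions added for illustration and testing.

```javascript
// Hypothetical variant of posArticles; `rng` is injectable for deterministic tests.
function posArticlesFixed(t, rng = Math.random) {
  const re = / (a|the|an) /g;
  return t.replace(re, (match) => {
    // Drop the article but keep one separating space.
    if (rng() < 0.5) return ' ';
    // Keep the article together with the spaces that were matched around it.
    return match;
  });
}

// Example call: forcing rng to 0 removes every article.
const cleaned = posArticlesFixed('see the cat on a mat', () => 0);
```

Since each match consumes the spaces on both sides, two articles in a row share a space and the second one will not be matched; the word-boundary pattern avoids that limitation.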
