
Regex to match partial words (JavaScript)

I would like to craft a case-insensitive regex (for JavaScript) that matches street names, even if each word has been abbreviated. For example:

n univ av should match N University Ave

king blv should match Martin Luther King Jr. Blvd

ne 9th should match both NE 9th St and 9th St NE

Bonus points (JK) for a "replace" regex that wraps the matched text with <b> tags.

You have:

"n univ av"

You want:

"\bn.*\buniv.*\bav.*"

So you do:

var regex = new RegExp("n univ av".replace(/(\S+)/g, function(s) { return "\\b" + s + ".*" }).replace(/\s+/g, ''), "gi");

Voilà!
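For anyone who wants to try this against the examples from the question, here is a minimal sketch of the same idea wrapped in a function. The names escapeRegex and buildOrderedRegex are hypothetical (they are not in the answer above), and the metacharacter escaping is an addition. The "g" flag is left out so that repeated test() calls do not depend on lastIndex.

```javascript
// Hypothetical helper: escape regex metacharacters so a term is matched literally.
function escapeRegex(term) {
  return term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: turns "n univ av" into /\bn.*\buniv.*\bav.*/i,
// i.e. every term must start a word and the terms must appear in order.
function buildOrderedRegex(search) {
  var pattern = search.trim().split(/\s+/).map(function (term) {
    return "\\b" + escapeRegex(term) + ".*";
  }).join("");
  return new RegExp(pattern, "i");
}

console.log(buildOrderedRegex("n univ av").test("N University Ave"));
console.log(buildOrderedRegex("king blv").test("Martin Luther King Jr. Blvd"));
console.log(buildOrderedRegex("ne 9th").test("NE 9th St"));
console.log(buildOrderedRegex("ne 9th").test("9th St NE"));
```

Note that the last call is expected to report no match: this pattern enforces the order of the terms, so it covers "NE 9th St" but not "9th St NE".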

But I'm not done, I want my bonus points. So we change the pattern to:

var regex = new RegExp("n univ av".replace(/(\S+)/g, function(s) { return "\\b(" + s + ")(.*)" }).replace(/\s+/g, ''), "gi");

And then:

var matches = regex.exec("N University Ave");

Now we have:

  • matches[0] => the entire match (not useful here)
  • matches[odd] => one of the search terms as it was matched
  • matches[even] => the additional text that was not in the original search string

So, we can write:

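// Assumes regex.exec() found a match, i.e. matches is not null.
var result = '';
for (var i = 1; i < matches.length; i++)
{
  if (i % 2 === 1)
    result += '<b>' + matches[i] + '</b>';  // odd groups: the matched search terms
  else
    result += matches[i];                   // even groups: the text in between
}

function highlightPartial(subject, search) {
  // Regex metacharacters that must be escaped so each term is matched literally.
  var special = /[.*+?^${}()|[\]\\]/g;
  var parts   = search.trim().split(/\s+/).map(function(s) {
    // "\\b" is a word boundary; a plain "\b" in a string literal is a backspace character.
    return "\\b" + s.replace(special, "\\$&");
  });
  var re = new RegExp("(" + parts.join("|") + ")", "gi");
  subject = subject.replace(re, function(match, text) {
    return "<b>" + text + "</b>";
  });
  return subject;
}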

var result = highlightPartial("N University Ave", "n univ av");
// ==> "<b>N</b> <b>Univ</b>ersity <b>Av</b>e"

Side note: this implementation does not pay attention to match order, so:

var result = highlightPartial("N University Ave", "av univ n");
// ==> "<b>N</b> <b>Univ</b>ersity <b>Av</b>e"

If that's a problem, a more elaborate sequential approach would become necessary, something that I have avoided here by using a replace() callback function.
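In case the order does matter, one possible shape for such a sequential approach is sketched below. This is only an illustration, not part of the answer above: highlightInOrder and escapeRegex are hypothetical names, and returning null for a missing or out-of-order term is an assumption about the desired behavior. Each term is searched for starting where the previous match ended, using lastIndex on a "g" regex.

```javascript
// Hypothetical helper: escape regex metacharacters so a term is matched literally.
function escapeRegex(term) {
  return term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: highlights the terms only if they occur in the given order.
// Returns null when a term is missing or appears out of order (an assumption).
function highlightInOrder(subject, search) {
  var terms = search.trim().split(/\s+/);
  var result = "";
  var pos = 0; // index in subject just after the previous match
  for (var i = 0; i < terms.length; i++) {
    var re = new RegExp("\\b" + escapeRegex(terms[i]), "gi");
    re.lastIndex = pos; // with the "g" flag, exec() starts searching here
    var m = re.exec(subject);
    if (m === null) return null;
    // Copy the unmatched text, then the wrapped match.
    result += subject.slice(pos, m.index) + "<b>" + m[0] + "</b>";
    pos = m.index + m[0].length;
  }
  return result + subject.slice(pos);
}

console.log(highlightInOrder("N University Ave", "n univ av"));
// ==> "<b>N</b> <b>Univ</b>ersity <b>Av</b>e"
console.log(highlightInOrder("N University Ave", "av univ n"));
// ==> null
```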

Simple:

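var pattern = "n univ av".replace(/\s+/g, "|");   // "g" flag: replace every run of spaces, not just the first
var rx      = new RegExp(pattern, "gi");
var matches = "N University Ave".match(rx);       // JavaScript has no rx.Matches(); use String.prototype.match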

Or something along these lines.

If these are your search terms:

  1. n univ av
  2. king blv
  3. ne 9th

It sounds like your algorithm should be something like this:

  1. Split the search by whitespace (results in an array of search terms): input.split(/\s+/)
  2. Attempt to match each term within your input: /term/i
  3. For each matched input, replace each term with the term wrapped in <b> tags: input.replace(/(term)/gi, "<b>$1</b>")

Note: You'll probably want to take precautions to escape regex metacharacters.
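The three steps above, plus the escaping, could be sketched roughly as follows. The names highlightTerms and escapeRegex are hypothetical. Two things here are assumptions beyond the steps as written: each term is anchored with \b so that it only matches at the start of a word (as the question's examples suggest), and null is returned when any term fails to match.

```javascript
// Hypothetical helper: escape regex metacharacters so a term is matched literally.
function escapeRegex(term) {
  return term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper implementing the three steps.
function highlightTerms(input, search) {
  // Step 1: split the search string into escaped terms.
  var terms = search.trim().split(/\s+/).map(escapeRegex);

  // Step 2: every term must match somewhere in the input, in any order.
  var allMatch = terms.every(function (term) {
    return new RegExp("\\b" + term, "i").test(input);
  });
  if (!allMatch) return null;

  // Step 3: wrap all terms in one pass, so that a later term can never
  // match inside the <b> tags inserted for an earlier one.
  var re = new RegExp("\\b(" + terms.join("|") + ")", "gi");
  return input.replace(re, "<b>$1</b>");
}

console.log(highlightTerms("NE 9th St", "ne 9th"));
// ==> "<b>NE</b> <b>9th</b> St"
console.log(highlightTerms("9th St NE", "ne 9th"));
// ==> "<b>9th</b> St <b>NE</b>"
console.log(highlightTerms("N University Ave", "king blv"));
// ==> null
```

Because the terms are checked independently, this version accepts them in any order, which is what the "ne 9th" example needs.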
