
Matching only characters in sequence of a word from a given string

I am trying to find the closest match between a word and a given string. For example, I would like to have:

"jonston" x "john"  => "jo" //only "jo" is the part that matches
"joshua" x "john" => "jo" 
"mark" x "marta"    => "mar"

As you can see, I only want to retrieve the characters that match in sequence from the start of both strings. That is why joshua and john have only jo in common and not joh, even though both contain the letter h.

I've tried this with a regular expression, using the following:

"john".match(/["joshua"]+/) //=> outputs ["joh"] and not ["jo"]

Is there any way I could match only the leading characters that are the same in both strings?

I will be using JavaScript for the implementation.

I hope that makes sense.

Thanks in advance.

var a = "john";
var b = "joshua";
var x = "";

for (var i = 0; i < a.length; i++) {
    // Stop at the first position where the two strings differ.
    if (a[i] !== b[i]) break;
    x += a[i];
}

console.log(x);

DEMO: http://jsfiddle.net/jMuDm/

Yet another solution:

if(typeof String.prototype.commonFirstChars !== 'function') {
    String.prototype.commonFirstChars = function(s) {
        var common = "";
        for(var i=0; i<this.length; i++) {
            // Stop at the first character that differs.
            if(this[i] !== s[i]) {
                return common;
            }
            common += this[i];
        }
        // The whole string is a prefix of s, so return all of it.
        return common;
    };
}

You can use it like this:

var commonFirstChars = "john".commonFirstChars("joshua");
// "john".commonFirstChars("joshua") === "joshua".commonFirstChars("john")

This will return:

jo

var initLCS = function(a, b) {
    // Advance i while both strings have the same character at index i.
    for (var i = 0; i < a.length && a[i] == b[i]; i++);
    return a.slice(0, i);
};


initLCS("jonston", "john") // jo
initLCS("jonston", "j111") // j
initLCS("xx", "yy") // ""

If you insist on using regular expressions, it goes like this:

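var initLCS = function(a, b) {

    function makeRe(x) {
        // Escape regex metacharacters so each character is matched literally.
        return x.length ? "(" + x.shift().replace(/[.*+?^${}()|[\]\\]/g, "\\$&") + makeRe(x) + ")?" : "";
    }

    var re = new RegExp('^' + makeRe(b.split("")), "g");
    return a.match(re)[0];
};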

This creates an expression like /^(j(o(h(n)?)?)?)?/g from the second string and applies it to the first one. Not that it makes much sense; it's just for the heck of it.
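To make the shape of that generated expression easier to inspect, here is a minimal sketch that builds only the pattern string. The names buildPrefixPattern and escapeRegExp are hypothetical and not part of the answer above, and the escaping step is an added assumption so that characters such as "." are treated literally.

```javascript
// Hypothetical helper (not from the original answer): escape regex
// metacharacters so every character of the word is matched literally.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build the nested optional-group pattern for a word,
// working from the last character outwards.
function buildPrefixPattern(word) {
    var body = "";
    for (var i = word.length - 1; i >= 0; i--) {
        body = "(" + escapeRegExp(word[i]) + body + ")?";
    }
    return "^" + body;
}

console.log(buildPrefixPattern("john")); // ^(j(o(h(n)?)?)?)?
console.log(new RegExp(buildPrefixPattern("john")).exec("jonston")[0]); // jo
```

Each nested group is optional, so the match stops extending as soon as a character of the first string differs from the corresponding character of the second.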

You cannot really do this with a regex. Why don't you just loop through both strings and compare the characters at each index? You can collect characters until you reach an index where the two strings differ.
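The loop described above could be sketched roughly as follows. The function name commonPrefix is an assumption made for illustration, and the sketch simply walks both strings until the characters at the same index differ or the shorter string ends.

```javascript
// Hypothetical helper name: returns the leading characters shared by a and b.
function commonPrefix(a, b) {
    // Never read past the end of the shorter string.
    var limit = Math.min(a.length, b.length);
    var i = 0;
    // Advance while the characters at the same index are equal.
    while (i < limit && a[i] === b[i]) {
        i++;
    }
    return a.slice(0, i);
}

console.log(commonPrefix("jonston", "john")); // jo
console.log(commonPrefix("joshua", "john"));  // jo
console.log(commonPrefix("mark", "marta"));   // mar
```

Bounding the loop by the shorter length keeps the comparison well defined when one string is a complete prefix of the other.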

I'd do this in a recursive function like this:

EDIT: Updated example to make it more readable.

var testWords = [
    ['ted', 'terminator'],
    ['joe', 'john'],
    ['foo', 'bar']
];

var matches = testWords.map(function(wordPair) {
    return (function matchChars(word1, word2, matches) {
        matches = matches || '';
        // Stop when either word is exhausted or the leading characters differ.
        if (!word1 || !word2 || word1[0] !== word2[0]) {
            return [wordPair[0], wordPair[1], matches];
        }

        return matchChars(word1.slice(1), word2.slice(1), matches + word1[0]);
    }(wordPair[0], wordPair[1]));
});


console.log(matches.map(function(match) { return match.join(', '); }).join('\n'));

Fiddle (updated): http://jsfiddle.net/VU5QT/2/
