
How to find indices of all occurrences of one string in another in JavaScript?

I'm trying to find the positions of all occurrences of a string in another string, case-insensitive.

For example, given the string:

I learned to play the Ukulele in Lebanon.

and the search string le, I want to obtain the array:

[2, 25, 27, 33]

Both strings will be variables, i.e., I can't hard-code their values.

I figured that this was an easy task for regular expressions, but after struggling for a while to find one that would work, I've had no luck.

I found this example of how to accomplish this using .indexOf(), but surely there has to be a more concise way to do it?

var str = "I learned to play the Ukulele in Lebanon."
var regex = /le/gi, result, indices = [];
while ( (result = regex.exec(str)) ) {
    indices.push(result.index);
}

UPDATE

I failed to spot in the original question that the search string needs to be a variable. I've written another version to deal with this case that uses indexOf, so you're back to where you started. As pointed out by Wrikken in the comments, to do this for the general case with regular expressions you would need to escape special regex characters, at which point I think the regex solution becomes more of a headache than it's worth.

function getIndicesOf(searchStr, str, caseSensitive) {
    var searchStrLen = searchStr.length;
    if (searchStrLen == 0) {
        return [];
    }
    var startIndex = 0, index, indices = [];
    if (!caseSensitive) {
        str = str.toLowerCase();
        searchStr = searchStr.toLowerCase();
    }
    while ((index = str.indexOf(searchStr, startIndex)) > -1) {
        indices.push(index);
        startIndex = index + searchStrLen;
    }
    return indices;
}

var indices = getIndicesOf("le", "I learned to play the Ukulele in Lebanon.");

document.getElementById("output").innerHTML = indices + "";
 <div id="output"></div>
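
For comparison, here is a rough sketch of what the regex route might look like once the search string is escaped. The helper names escapeRegExp and getIndicesWithRegex are made up for this illustration and are not part of the answer above; the escaping pattern is the commonly used one for regex metacharacters.

```javascript
// Hypothetical helper: escapes characters that have special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical name: collects the start index of every non-overlapping match.
function getIndicesWithRegex(searchStr, str, caseSensitive) {
  if (searchStr.length === 0) {
    return []; // an empty pattern would match everywhere without advancing
  }
  var regex = new RegExp(escapeRegExp(searchStr), caseSensitive ? "g" : "gi");
  var indices = [];
  var result;
  while ((result = regex.exec(str)) !== null) {
    indices.push(result.index);
  }
  return indices;
}

console.log(getIndicesWithRegex("le", "I learned to play the Ukulele in Lebanon.")); // [2, 25, 27, 33]
console.log(getIndicesWithRegex(".", "a.b.c")); // [1, 3]
```

Without the escaping step, a search string such as "." would be treated as "any character" and match at every position.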

Here is a regex-free version:

function indexes(source, find) {
  if (!source) {
    return [];
  }
  // if find is empty string return all indexes.
  if (!find) {
    // or shorter arrow function:
    // return source.split('').map((_,i) => i);
    return source.split('').map(function(_, i) { return i; });
  }
  var result = [];
  for (var i = 0; i < source.length; ++i) {
    // For a case-insensitive search, lower-case both sides:
    // if (source.substring(i, i + find.length).toLowerCase() == find.toLowerCase()) {
    if (source.substring(i, i + find.length) == find) {
      result.push(i);
    }
  }
  return result;
}

indexes("I learned to play the Ukulele in Lebanon.", "le")

EDIT: if you want 'aaaa' and 'aa' to yield [0, 2] (non-overlapping matches only), use this version:

function indexes(source, find) {
  if (!source) {
    return [];
  }
  if (!find) {
      return source.split('').map(function(_, i) { return i; });
  }
  var result = [];
  var i = 0;
  while(i < source.length) {
    if (source.substring(i, i + find.length) == find) {
      result.push(i);
      i += find.length;
    } else {
      i++;
    }
  }
  return result;
}

You sure can do this!

//make a regular expression out of your needle
var needle = 'le'
var re = new RegExp(needle,'gi');
var haystack = 'I learned to play the Ukulele';

var results = new Array();//this is the results you want
while (re.exec(haystack)){
  results.push(re.lastIndex);
}

Edit: learn to spell RegExp

Also, I realized this isn't exactly what you want, as lastIndex tells us the end of the needle, not the beginning, but it's close: you could push re.lastIndex - needle.length into the results array...
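
A minimal sketch of that adjustment, assuming the same needle and haystack as above. It subtracts the length of the text actually matched (match[0]) rather than needle.length, which should be equivalent for a plain string needle:

```javascript
var needle = "le";
var re = new RegExp(needle, "gi");
var haystack = "I learned to play the Ukulele";

var starts = [];
var match;
while ((match = re.exec(haystack)) !== null) {
  // lastIndex points just past the match, so step back by the matched length
  starts.push(re.lastIndex - match[0].length);
}

console.log(starts); // [2, 25, 27]
```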

Edit: adding link

@Tim Down's answer uses the results object from RegExp.exec(), and all my JavaScript resources gloss over its use (apart from giving you the matched string). So when he uses result.index, that's some sort of unnamed match object. The MDC description of exec actually describes this object in decent detail.
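
As a quick illustration of what that result object carries, here is a small sketch; the variable names are only for this example:

```javascript
var sentence = "I learned to play the Ukulele in Lebanon.";
var result = /le/gi.exec(sentence);

console.log(result[0]);    // the matched text: "le"
console.log(result.index); // where the match starts: 2
console.log(result.input); // the string that was searched
```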

One-liner using String.prototype.matchAll (ES2020):

[...sourceStr.matchAll(new RegExp(searchStr, 'gi'))].map(a => a.index)

Using your values:

const sourceStr = 'I learned to play the Ukulele in Lebanon.';
const searchStr = 'le';
const indexes = [...sourceStr.matchAll(new RegExp(searchStr, 'gi'))].map(a => a.index);
console.log(indexes); // [2, 25, 27, 33]

If you're worried about doing a spread and a map() in one line, I ran it against a for...of loop for a million iterations (using your strings). The one-liner averages 1420ms while the for...of averages 1150ms on my machine. That's not an insignificant difference, but the one-liner will work fine if you're only doing a handful of matches.
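
The benchmark code itself isn't shown, so the following is only a guess at the shape of the for...of variant being compared, using the same strings:

```javascript
const sourceStr = "I learned to play the Ukulele in Lebanon.";
const searchStr = "le";

const indexes = [];
// matchAll returns an iterator, so it can be consumed directly without spreading
for (const match of sourceStr.matchAll(new RegExp(searchStr, "gi"))) {
  indexes.push(match.index);
}

console.log(indexes); // [2, 25, 27, 33]
```

As with the one-liner, searchStr is used as a regex pattern here, so special characters in it would need escaping first.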

See matchAll on caniuse

If you just want to find the position of all matches, I'd like to point you to a little hack:

var haystack = 'I learned to play the Ukulele in Lebanon.',
    needle = 'le',
    splitOnFound = haystack.split(needle).map(function (culm) {
        return this.pos += culm.length + needle.length;
    }, {pos: -needle.length}).slice(0, -1); // {pos: ...} is the object used as `this`

console.log(splitOnFound);

It might not be applicable if you have a RegExp with variable length, but for some it might be helpful.

This is case sensitive. For case insensitivity, call String.prototype.toLowerCase on both strings first.
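
A sketch of the case-insensitive variant, written with a plain loop instead of the `this` trick to keep it easy to follow. It assumes lower-casing does not change the string's length, which holds for ordinary ASCII text but not for every Unicode character:

```javascript
var haystack = "I learned to play the Ukulele in Lebanon.";
var needle = "le";

var lowerNeedle = needle.toLowerCase();
var parts = haystack.toLowerCase().split(lowerNeedle);
var positions = [];
var pos = 0;

// every part except the last one is followed by one occurrence of the needle
for (var k = 0; k < parts.length - 1; k++) {
  pos += parts[k].length;    // skip the text before the occurrence
  positions.push(pos);       // the occurrence starts here
  pos += lowerNeedle.length; // skip the occurrence itself
}

console.log(positions); // [2, 25, 27, 33]
```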

I am a bit late to the party (by almost 10 years, 2 months), but one way for future coders is to do it using a while loop and indexOf():

let haystack = "I learned to play the Ukulele in Lebanon.";
let needle = "le";
let pos = 0; // Position Ref
let result = []; // Final output of all index's.
let hayStackLower = haystack.toLowerCase();

// Loop to check all occurrences 
while (hayStackLower.indexOf(needle, pos) != -1) {
  result.push(hayStackLower.indexOf(needle , pos));
  pos = hayStackLower.indexOf(needle , pos) + 1;
}

console.log("Final ", result); // Returns all indexes or empty array if not found

Here is a simple code snippet:

function getIndexOfSubStr(str, searchToken, preIndex, output) {
    // Note: match() treats searchToken as a regular expression pattern.
    var result = str.match(searchToken);
    if (result) {
        var matchEnd = result.index + searchToken.length;
        output.push(result.index + preIndex);
        // Carry the consumed length forward so later indices refer to the original string.
        getIndexOfSubStr(str.substring(matchEnd), searchToken, preIndex + matchEnd, output);
    }
    return output;
}

var str = "my name is 'xyz' and my school name is 'xyz' and my area name is 'xyz' ";
var searchToken = "my";
var preIndex = 0;

console.log(getIndexOfSubStr(str, searchToken, preIndex, []));

I would recommend Tim's answer. However, this comment by @blazs states: "Suppose searchStr=aaa and that str=aaaaaa. Then instead of finding 4 occurences your code will find only 2 because you're making skips by searchStr.length in the loop." That is true of Tim's code, specifically this line: startIndex = index + searchStrLen; Tim's code cannot find an occurrence that overlaps the previous match. So, I've modified Tim's answer:

function getIndicesOf(searchStr, str, caseSensitive) {
    var startIndex = 0, index, indices = [];
    if (searchStr.length == 0) {
        return indices; // an empty search string would never yield -1 and would loop forever
    }
    if (!caseSensitive) {
        str = str.toLowerCase();
        searchStr = searchStr.toLowerCase();
    }
    while ((index = str.indexOf(searchStr, startIndex)) > -1) {
        indices.push(index);
        startIndex = index + 1;
    }
    return indices;
}

var str = prompt("Enter a string.");
var searchStr = prompt("What do you want to search for in the string?");
var indices = getIndicesOf(searchStr, str);

document.getElementById("output").innerHTML = indices + "";
 <div id="output"></div>

Changing it to + 1 instead of + searchStrLen allows index 1 to appear in the indices array when str is aaaaaa and searchStr is aaa.
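
To make the difference concrete, here is a small sketch where the step size is a parameter. The helper name indicesWithStep is invented for this comparison and is not part of either answer:

```javascript
// Hypothetical helper: `step` decides how far to move past each hit.
// step = 1 allows overlapping matches; step = searchStr.length skips them.
function indicesWithStep(searchStr, str, step) {
  var indices = [];
  if (searchStr.length === 0 || step < 1) {
    return indices; // guard against a loop that never advances
  }
  var index = str.indexOf(searchStr);
  while (index > -1) {
    indices.push(index);
    index = str.indexOf(searchStr, index + step);
  }
  return indices;
}

console.log(indicesWithStep("aaa", "aaaaaa", 1)); // [0, 1, 2, 3]
console.log(indicesWithStep("aaa", "aaaaaa", 3)); // [0, 3]
```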

PS: If anyone would like comments in the code to explain how it works, please say so, and I'll be happy to respond to the request.

Following up on @jcubic's answer: his solution caused some confusion in my case.
For example, var result = indexes('aaaa', 'aa') returns [0, 1, 2] instead of [0, 2].
So I updated his solution slightly, as below, to match my case:

function indexes(text, subText, caseSensitive) {
    var _source = text;
    var _find = subText;
    if (caseSensitive != true) {
        _source = _source.toLowerCase();
        _find = _find.toLowerCase();
    }
    var result = [];
    if (_find.length == 0) {
        return result;  // an empty subText would never advance i
    }
    for (var i = 0; i < _source.length;) {
        if (_source.substring(i, i + _find.length) == _find) {
            result.push(i);
            i += _find.length;  // found a subText, skip to next position
        } else {
            i += 1;
        }
    }
    return result;
}

Thanks for all the replies. I went through all of them and came up with a function that gives the first and last index of each occurrence of the 'needle' substring. I am posting it here in case it helps someone.

Please note, it is not the same as the original request, which asked only for the beginning of each occurrence. It suits my use case better because you don't need to keep the needle length.

function findRegexIndices(text, needle, caseSensitive){
  var needleLen = needle.length,
    reg = new RegExp(needle, caseSensitive ? 'g' : 'gi'),
    indices = [],
    result;

  while ( (result = reg.exec(text)) ) {
    indices.push([result.index, result.index + needleLen]);
  }
  return indices
}
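
If the needle is a pattern whose matches can vary in length, the end index could be taken from the match itself rather than from needle.length. A sketch under that assumption (findRanges is a made-up name, and the search is always case-insensitive here):

```javascript
// Hypothetical helper: returns [start, end) pairs for every match of `pattern`.
function findRanges(text, pattern) {
  var reg = new RegExp(pattern, "gi");
  var ranges = [];
  var result;
  while ((result = reg.exec(text)) !== null) {
    if (result[0].length === 0) {
      reg.lastIndex++; // step over empty matches so the loop always advances
      continue;
    }
    ranges.push([result.index, result.index + result[0].length]);
  }
  return ranges;
}

// "le?" matches an "l" optionally followed by "e", so match lengths differ
console.log(findRanges("I learned to play the Ukulele in Lebanon.", "le?"));
// [[2, 4], [14, 15], [25, 27], [27, 29], [33, 35]]
```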

Check this solution, which can also find repeated (overlapping) occurrences of the same string. Let me know if something is missing or not right.

function indexes(source, find) {
    if (!source) {
        return [];
    }
    if (!find) {
        return source.split('').map(function(_, i) { return i; });
    }
    source = source.toLowerCase();
    find = find.toLowerCase();
    var result = [];
    var i = 0;
    while (i < source.length) {
        if (source.substring(i, i + find.length) == find)
            result.push(i++);
        else
            i++;
    }
    return result;
}

console.log(indexes('aaaaaaaa', 'aaaaaa'));
console.log(indexes('aeeaaaaadjfhfnaaaaadjddjaa', 'aaaa'));
console.log(indexes('wordgoodwordgoodgoodbestword', 'wordgood'));
console.log(indexes('I learned to play the Ukulele in Lebanon.', 'le'));

Here's my code (using the search and slice methods):

let s = "I learned to play the Ukulele in Lebanon";
let sub = 0;
let matchingIndex = [];
let index = s.search(/le/i);

while (index >= 0) {
    matchingIndex.push(index + sub);
    sub = sub + (s.length - s.slice(index + 1).length);
    s = s.slice(index + 1);
    index = s.search(/le/i);
}

console.log(matchingIndex);

This is what I usually use to get a string's index, optionally by its position.

I pass the following parameters:

search: the string to search in

find: the string to find

position ('all' by default): which occurrence of the find string in the search string to return

(if 'all', it returns the complete array of indexes)

(if 'last', it returns the last position)

function stringIndex (search, find, position = "all") {
    
    var currIndex = 0, indexes = [], found = true;
    
    while (found) {        
        var searchIndex = search.indexOf(find);
        if (searchIndex > -1) {
            currIndex += searchIndex + find.length; 
            search = search.substr (searchIndex + find.length);
            indexes.push (currIndex - find.length);
        } else found = false; //no other string to search for - exit from while loop   
    }
    
    if (position == 'all') return indexes;
    if (position > indexes.length -1) return [];
    
    position = (position == "last") ? indexes.length -1 : position;
    
    return indexes[position];        
}

//Example:
    
var myString = "Joe meets Joe and together they go to Joe's house";
console.log ( stringIndex(myString, "Joe") ); //0, 10, 38
console.log ( stringIndex(myString, "Joe", 1) ); //10
console.log ( stringIndex(myString, "Joe", "last") ); //38
console.log ( stringIndex(myString, "Joe", 5) ); //[]

Hi friends, this is just another way of finding the indexes of a matching phrase, using reduce and a helper method. Of course RegExp is more convenient, and perhaps it is internally implemented somewhat like this. I hope you find it useful.

function findIndexesOfPhraseWithReduce(text, phrase) {
    // Convert the text to an array of characters so it can be sliced and reduced.
    const arrayOfText = [...text];

    /* Takes the array of characters, the search phrase, and a start index
       (supplied by reduce). It computes the end index from the phrase length,
       slices and joins the characters in that range, and compares the result
       with the phrase, returning true or false. */
    function isMatch(array, phrase, start) {
        const end = start + phrase.length;
        return (array.slice(start, end).join('')).toLowerCase() === phrase.toLowerCase();
    }

    /* Reduce over the array of characters, testing each position with isMatch:
       the phrase is compared with the characters that start at the current
       index and span the length of the phrase. */
    return arrayOfText.reduce((acc, item, index) => isMatch(arrayOfText, phrase, index) ? [...acc, index] : acc, []);
}

console.log(findIndexesOfPhraseWithReduce("I learned to play the Ukulele in Lebanon.", "le"));

const findAllOccurrences = (str, substr) => {
  str = str.toLowerCase();
  
  let result = [];

  let idx = str.indexOf(substr)
  
  while (idx !== -1) {
    result.push(idx);
    idx = str.indexOf(substr, idx+1);
  }
  return result;
}

console.log(findAllOccurrences('I learned to play the Ukulele in Lebanon', 'le'));

function countInString(searchFor,searchIn){

 var results=0;
 var a=searchIn.indexOf(searchFor)

 while(a!=-1){
   searchIn=searchIn.slice(a*1+searchFor.length);
   results++;
   a=searchIn.indexOf(searchFor);
 }

return results;

}

The code below will do the job for you:

function indexes(source, find) {
  var result = [];
  for (var i = 0; i < source.length; ++i) {
    // For a case-insensitive search, lower-case both sides:
    // if (source.substring(i, i + find.length).toLowerCase() == find.toLowerCase()) {
    if (source.substring(i, i + find.length) == find) {
      result.push(i);
    }
  }
  return result;
}

indexes("hello, how are you", "ar")

Use String.prototype.match.

Here is an example from the MDN docs itself:

var str = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';
var regexp = /[A-E]/gi;
var matches_array = str.match(regexp);

console.log(matches_array);
// ['A', 'B', 'C', 'D', 'E', 'a', 'b', 'c', 'd', 'e']
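
Note that with the g flag, match returns only the matched strings, not their positions. A short sketch contrasting it with matchAll on the sentence from the question (variable names are just for this example):

```javascript
const text = "I learned to play the Ukulele in Lebanon.";

// match with the g flag: matched substrings only, no index information
const matched = text.match(/le/gi);
console.log(matched); // ['le', 'le', 'le', 'Le']

// matchAll yields full match objects, each with an index property
const positions = Array.from(text.matchAll(/le/gi), (m) => m.index);
console.log(positions); // [2, 25, 27, 33]
```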

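Notice: the technical posts on this site are published under the CC BY-SA 4.0 license. If you republish them, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.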

 