Find when words appear next to each other in an array
I have a huge array of strings (words) that I'm analyzing for patterns.
I want to create a function that combines words which appear next to each other more than once.
Given the following array:
let array = ["john", "smith", "says", "that", "a", "lock", "smith", "can", "open", "the", "lock", "unlike", "john", "smith"]
Desired result:
["john smith", "says", "that", "a", "lock", "smith", "can", "open", "the", "lock", "unlike", "john smith"]
Ideally the function identifies more than just 2-word combinations (i.e. it also identifies when the combination "white", "house", "press", "secretary" appears more than once).
I'm struggling with the logic and don't have much to show. I've also looked for a solution in a library like underscore.js, without luck.
Build a "dictionary" of all words and their immediate successors. Then loop through the original array and, for each element, check whether the word has more than one recorded successor and whether those successors are all the same word. If so, combine the two words and skip the immediate successor.
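var arr = ["john", "smith", "says", "that", "a", "lock", "smith", "can", "open", "the", "lock", "unlike", "john", "smith"];

function combineCommon(arr) {
    // Map each word to the list of words that immediately follow it.
    // Object.create(null) avoids clashes with inherited keys such as "constructor".
    var dictionary = Object.create(null);
    for (var a = 0; a < arr.length - 1; a++) {
        var A = arr[a];
        if (dictionary[A] == void 0) {
            dictionary[A] = [];
        }
        dictionary[A].push(arr[a + 1]);
    }

    var res = [];
    for (var index = 0; index < arr.length; index++) {
        var element = arr[index];
        // A word that only occurs in last position has no entry, so fall back to an empty list.
        var successors = dictionary[element] || [];
        var pass = false;
        // Merge only if a next word exists here and the word was always followed by the same word.
        if (index < arr.length - 1 && successors.length > 1) {
            if (successors.some(function(a) { return a != successors[0]; }) == false) {
                pass = true;
            }
        }
        if (pass) {
            res.push(arr[index] + " " + successors[0]);
            index++; // skip the successor that was just merged
        } else {
            res.push(arr[index]);
        }
    }
    return res;
}

console.log(combineCommon(arr));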
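To see why only "john" gets merged for the sample input, it can help to look at the successor dictionary itself. The sketch below is illustrative only: `buildSuccessors` is a hypothetical helper name that mirrors the first loop of the answer, not something the answer defines.

```javascript
// Hypothetical helper: map each word to the list of words that directly follow it.
function buildSuccessors(words) {
  var successors = Object.create(null);
  for (var i = 0; i < words.length - 1; i++) {
    var word = words[i];
    if (!successors[word]) {
      successors[word] = [];
    }
    successors[word].push(words[i + 1]);
  }
  return successors;
}

var sample = ["john", "smith", "says", "that", "a", "lock", "smith", "can", "open", "the", "lock", "unlike", "john", "smith"];
var successors = buildSuccessors(sample);

console.log(successors.john);  // [ 'smith', 'smith' ]  -> always the same successor, so merged
console.log(successors.smith); // [ 'says', 'can' ]     -> different successors, left alone
console.log(successors.lock);  // [ 'smith', 'unlike' ] -> different successors, left alone
```

One consequence of this design: a pair is merged only if its first word is followed by the same word every time. If "john" also appeared once before some other word, "john smith" would no longer be combined, even though the pair itself repeats.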
You could count the pairs, then check those counts when reassembling the result.
var array = ["john", "smith", "says", "that", "a", "lock", "foo", "bar", "baz", "smith", "can", "open", "foo", "bar", "baz", "the", "lock", "unlike", "john", "smith"],
    count = Object.create(null),
    result;

// Count every pair of adjacent words.
array.forEach(function (a, i, aa) {
    var key = aa.slice(i, i + 2).join(' ');
    count[key] = (count[key] || 0) + 1;
});

result = array.reduce(function (r, a, i, aa) {
    var key = aa.slice(i, i + 2).join(' ');
    if (count[key] > 1) {
        a = key;   // this word starts a repeated pair: emit the pair
    } else if (count[aa.slice(i - 1, i + 1).join(' ')] > 1) {
        a = [];    // this word ended a repeated pair: emit nothing
    }
    return r.concat(a);
}, []);

console.log(result);
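The question also asks about phrases longer than two words. One possible way to extend the counting idea is to count every phrase of 2 up to some maximum number of words, then scan the array and always prefer the longest phrase that occurs more than once. This is only a sketch under stated assumptions: `combineRepeatedPhrases` and its `maxLen` parameter are hypothetical names, words are assumed to contain no spaces, and overlapping occurrences (as in "a a a") are counted separately.

```javascript
// Sketch: merge repeated phrases of up to maxLen words, preferring the longest match.
// Assumes words contain no spaces, since phrases are keyed by their space-joined text.
function combineRepeatedPhrases(words, maxLen) {
  var counts = new Map();

  // Count every phrase of 2..maxLen consecutive words.
  for (var len = 2; len <= maxLen; len++) {
    for (var i = 0; i + len <= words.length; i++) {
      var key = words.slice(i, i + len).join(" ");
      counts.set(key, (counts.get(key) || 0) + 1);
    }
  }

  var result = [];
  var pos = 0;
  while (pos < words.length) {
    var merged = false;
    // Try the longest phrase first so that longer repeats win over their sub-phrases.
    for (var n = Math.min(maxLen, words.length - pos); n >= 2; n--) {
      var phrase = words.slice(pos, pos + n).join(" ");
      if (counts.get(phrase) > 1) {
        result.push(phrase);
        pos += n; // skip the words that were just merged
        merged = true;
        break;
      }
    }
    if (!merged) {
      result.push(words[pos]);
      pos++;
    }
  }
  return result;
}

var words = ["white", "house", "press", "secretary", "spoke", "to", "the", "white", "house", "press", "secretary"];
console.log(combineRepeatedPhrases(words, 4));
// [ 'white house press secretary', 'spoke', 'to', 'the', 'white house press secretary' ]
```

With `maxLen` set to 2 this reduces to merging repeated pairs only. The counting pass costs roughly `maxLen` passes over the array, so a small `maxLen` is probably the sensible choice for a huge input.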
Please check this.
var data = ["john", "smith", "says", "that", "a", "lock", "smith", "can", "open", "the", "lock", "unlike", "john", "smith"];
var result = [];
var n = data.length;
var k = 0;

// Outer main loop: walk through every word.
for (var i = 0; i < n; i++) {
    var next_word = data[i + 1]; // undefined for the last word
    var temp_word = "";
    var flag = 0;

    // Inner loop: count how often the pair (data[i], next_word) occurs in the array.
    // john == john && smith == smith
    // smith == john && smith == smith
    // ...
    for (var j = 0; j < n - 1; j++) {
        if (data[j] == data[i] && data[j + 1] == next_word) {
            flag++;
            temp_word = data[i] + ' ' + next_word;
        }
    }

    if (flag > 1) {
        // The same word sequence was found more than once: store the combined pair.
        result[k++] = temp_word;
        i++; // skip the next word so it is not added twice
    } else {
        // No repeated sequence: copy the word to the result as it is.
        result[k++] = data[i];
    }
}

console.log(result);