How should I represent data for effective searching and comparing strings
I have two arrays of length 300. They look like this (JSON representation):
[
[
["word1",0.000199],
["word2",0.000102],
...
["word15",0.000102]
],
...
[
["anotherword1",0.0032199],
["anotherword2",0.032302],
...
["anotherword15",0.0320102]
]
]
And I have this brute-force algorithm:
for(var i = 0; i < 300; i++)
{
for(var j = 0; j < 15; j++)
{
for(var ii = i + 1; ii < 300; ii++)
{
for(var jj = 0; jj < 15; jj++)
{
for(var jjj = 0; jjj < 15; jjj++)
{
if(new_keywords[i][j][0] === new_keywords[ii][jj][0] && new_keywords[ii][jj][0] === state_keywords[i][jjj][0])
{
console.log(0);
}
}
}
}
}
}
I need to search for the same words in those arrays, and if the words are the same, I sum the values, divide the sum by 3, and replace that value in the state_keywords array. So for each word that occurs more than once in the arrays, I get the mean of its values.
Now... my approach is very bad because I now have about 300 million iterations, and that is crazy. I need a better representation of my arrays in JavaScript, something like a lexicographical tree or a kd-tree.
Thank you.
EDIT:
Here is http://jsfiddle.net/dD7yB/1/ with an example.
EDIT2:
I'm sorry if I'm not clear enough. So, what exactly I'm doing:
state_keywords ... Indexes are from 0 to 299 and they represent themes ... new_keywords array arrives, they may be different ... state_keywords array at the same theme index. I need to do this as efficiently as possible, because I need to do it every second, so it must be FAST.
EDIT3:
Now I use this code:
var i, j, jj, l;
for(i = 0; i < 300; i++)
{
for(j = 0; j < 15; j++)
{
l = new_keywords[i].length;
for(jj = 0; jj < l; jj++)
{
if(state_keywords[i][j][0] === new_keywords[i][jj][0])
{
state_keywords[i][j][1] = (state_keywords[i][j][1] + new_keywords[i][jj][1]) / 2;
}
}
}
}
which is much faster than the previous one.
Why don't you make those arrays into objects with the strings as keys to the values? Then you can just look up the words directly and get the values.
var wordlists = [
{
"word1":0.000199,
"word2":0.000102,
...
"word15":0.000102
},
...
{
"anotherword1":0.0032199,
"anotherword2":0.032302,
...
"anotherword15":0.0320102
}
]
and then look up a value with:
wordlists[0]["word2"] //0.000102
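To apply the same idea to the averaging step from EDIT3, one option is to build a lookup for each theme's incoming words and then walk the state entries once. The following is only a rough sketch: toLookup and mergeKeywords are hypothetical helper names, both arrays are assumed to have the same number of themes, and each word is assumed to appear at most once per theme. It uses a Map rather than a plain object so that words such as "constructor" cannot collide with inherited properties.

```javascript
// Hypothetical helper: turn one theme's [word, value] pairs into a Map.
function toLookup(pairs) {
  const lookup = new Map();
  for (const [word, value] of pairs) {
    lookup.set(word, value);
  }
  return lookup;
}

// Hypothetical helper: for every theme index, average the value of each
// state word with the value of the same word in the incoming list.
// Assumes stateKeywords and newKeywords have the same number of themes.
function mergeKeywords(stateKeywords, newKeywords) {
  for (let i = 0; i < stateKeywords.length; i++) {
    const incoming = toLookup(newKeywords[i]);
    for (const entry of stateKeywords[i]) {
      const value = incoming.get(entry[0]);
      if (value !== undefined) {
        entry[1] = (entry[1] + value) / 2; // same update as in EDIT3
      }
    }
  }
  return stateKeywords;
}

// Made-up sample data with a single theme.
const state = [[["word1", 0.5], ["word2", 0.25]]];
const fresh = [[["word2", 0.75], ["word3", 1.0]]];
mergeKeywords(state, fresh);
console.log(state[0]); // "word2" is averaged, "word1" is left unchanged
```

With 15 words per theme, this does roughly one lookup per state word instead of an inner loop over all incoming words, so the work per theme grows with the sum of the two list lengths rather than their product.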