Joining two collections efficiently?
I have run into an issue where I am trying to join two arrays similar to the ones below:
var participants = [
{id: 1, name: "abe"},
{id:2, name:"joe"}
];
var results = [
[
{question: 6, participantId: 1, answer:"test1"},
{question: 6, participantId: 2, answer:"test2"}
],
[
{question: 7, participantId: 1, answer:"test1"},
{question: 7, participantId: 2, answer:"test2"}
]
];
Using nested loops:
_.each(participants, function(participant) {
var row = [];
var rowIndex = 2;
return _.each(results, function(result) {
return _.each(result, function(subResult) {
var data;
data = _.find(subResult, function(part) {
return part.participantId === participant.id;
});
row[rowIndex] = data.answer;
return rowIndex++;
});
});
});
This works OK as long as the arrays are small, but once they get larger I run into huge performance problems. Is there a faster way to combine two arrays in this way?
This is a slimmed-down version of my real dataset/code. Please let me know if anything doesn't make sense.
FYI
My end goal is to create a collection of rows for each participant containing their answers. Something like:
[
["abe","test1","test1"],
["joe","test2","test2"]
]
The performance gain* does not come from the for loops, so you can change them to _ iteration if they gross you out:
var o = Object.create(null);
for( var i = 0, len = participants.length; i < len; ++i ) {
o[participants[i].id] = [participants[i].name];
}
for( var i = 0, len = results.length; i < len; ++i ) {
var innerResult = results[i];
for( var j = 0, len2 = innerResult.length; j < len2; ++j) {
o[innerResult[j].participantId].push(innerResult[j].answer);
}
}
//The rows are in o but you can get an array of course if you want:
var result = [];
for( var key in o ) {
result.push(o[key]);
}
*Well, if _ uses native .forEach, then that is easily an order of magnitude slower than a for loop, but your real problem right now is the 4 nested loops, so you might not even need the additional 10x after fixing that.
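The lookup-table idea above can also be wrapped in a reusable function. The following is only a sketch: the name joinAnswers is hypothetical, and it assumes an ES2015 runtime so that Map can be used to keep the rows in participant order. Results that point at an unknown participant are skipped rather than throwing, which is a choice made for this example and not something the original code does.

```javascript
// Hypothetical helper; the name joinAnswers is illustrative only.
// Assumes an ES2015 runtime (Map, Array.from).
function joinAnswers(participants, results) {
  // One pass over participants: id -> row that starts with the name.
  var rowsById = new Map();
  participants.forEach(function (p) {
    rowsById.set(p.id, [p.name]);
  });

  // One pass over every result: append the answer to the matching row.
  results.forEach(function (questionResults) {
    questionResults.forEach(function (r) {
      var row = rowsById.get(r.participantId);
      if (row) { // skip results for unknown participants (an assumption)
        row.push(r.answer);
      }
    });
  });

  // Map preserves insertion order, so rows come out in participant order.
  return Array.from(rowsById.values());
}

var participants = [
  { id: 1, name: "abe" },
  { id: 2, name: "joe" }
];
var results = [
  [
    { question: 6, participantId: 1, answer: "test1" },
    { question: 6, participantId: 2, answer: "test2" }
  ],
  [
    { question: 7, participantId: 1, answer: "test1" },
    { question: 7, participantId: 2, answer: "test2" }
  ]
];

var rows = joinAnswers(participants, results);
console.log(JSON.stringify(rows));
// prints [["abe","test1","test1"],["joe","test2","test2"]]
```

The work is proportional to the number of participants plus the total number of results, instead of the product of the two.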
Here is a solution using ECMAScript 5 methods:
JavaScript
var makeRows1 = (function () {
"use strict";
function reduceParticipants(previous, participant) {
previous[participant.id] = [participant.name];
return previous;
}
function reduceResult(previous, subResult) {
previous[subResult.participantId].push(subResult.answer);
return previous;
}
function filterParticipants(participant) {
return participant;
}
return function (participants, results) {
var row = participants.reduce(reduceParticipants, []);
results.forEach(function (result) {
result.reduce(reduceResult, row);
});
return row.filter(filterParticipants);
};
}());
This will not be as fast as using raw for loops, like @Esailija's answer, but it's not as slow as you may think. It's certainly faster than using Underscore, like your example or the answer given by @Maroshii.
Anyway, here is a jsFiddle of all three answers that demonstrates that they all give the same result. It uses quite a large data set; I don't know how it compares to the size you are using. The data is generated with the following:
JavaScript
function makeName() {
var text = "",
possible = "abcdefghijklmnopqrstuvwxy",
i;
for (i = 0; i < 5; i += 1) {
text += possible.charAt(Math.floor(Math.random() * possible.length));
}
return text;
}
var count,
count2,
index,
index2,
participants = [],
results = [];
for (index = 0, count = 1000; index < count; index += 4) {
participants.push({
id: index,
name: makeName()
});
}
for (index = 0, count = 1000; index < count; index += 1) {
results[index] = [];
for (index2 = 0, count2 = participants.length; index2 < count2; index2 += 1) {
results[index].push({
question: index,
participantId: participants[index2].id,
answer: "test" + index
});
}
}
Finally, we have a jsperf that compares these three methods, run on the generated data set.
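If the jsFiddle or jsperf pages are not available, a similar check can be approximated locally. The sketch below is not the jsperf setup: sameRows, timeIt, joinWithLookup and joinWithScan are hypothetical names, the two join functions are stand-ins rather than the exact answers, and Date.now() gives only coarse timings, so treat the numbers as a rough indication.

```javascript
// Hypothetical helpers for a rough local comparison (not the jsperf setup).
function sameRows(a, b) {
  // Adequate for arrays of strings; compares content and order.
  return JSON.stringify(a) === JSON.stringify(b);
}

function timeIt(fn, runs) {
  var start = Date.now(), last, i;
  for (i = 0; i < runs; i += 1) {
    last = fn();
  }
  return { ms: Date.now() - start, result: last };
}

// Stand-in 1: build an id -> row lookup, then make one pass over the results.
function joinWithLookup(participants, results) {
  var byId = Object.create(null);
  var rows = participants.map(function (p) {
    var row = [p.name];
    byId[p.id] = row;
    return row;
  });
  results.forEach(function (questionResults) {
    questionResults.forEach(function (r) {
      byId[r.participantId].push(r.answer);
    });
  });
  return rows;
}

// Stand-in 2: rescan every result for every participant.
function joinWithScan(participants, results) {
  return participants.map(function (p) {
    var row = [p.name];
    results.forEach(function (questionResults) {
      questionResults.forEach(function (r) {
        if (r.participantId === p.id) {
          row.push(r.answer);
        }
      });
    });
    return row;
  });
}

// Small synthetic data set: 50 participants, 50 questions.
var sampleParticipants = [], sampleResults = [], p, q;
for (p = 0; p < 50; p += 1) {
  sampleParticipants.push({ id: p, name: "user" + p });
}
for (q = 0; q < 50; q += 1) {
  sampleResults[q] = sampleParticipants.map(function (person) {
    return { question: q, participantId: person.id, answer: "test" + q };
  });
}

var lookupRun = timeIt(function () {
  return joinWithLookup(sampleParticipants, sampleResults);
}, 20);
var scanRun = timeIt(function () {
  return joinWithScan(sampleParticipants, sampleResults);
}, 20);

console.log("same result:", sameRows(lookupRun.result, scanRun.result));
console.log("lookup ms:", lookupRun.ms, "scan ms:", scanRun.ms);
```

Checking that the outputs match before comparing timings guards against benchmarking an implementation that is fast only because it is wrong.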
Haven't tested it with large amounts of data, but here's an approach:
var groups = _.groupBy(_.flatten(results),'participantId');
var result =_.reduce(groups,function(memo,group) {
var user = _.find(participants,function(p) { return p.id === group[0].participantId; });
var arr = _.pluck(group,'answer');
arr.unshift(user.name);
memo.push(arr);
return memo ;
},[]);
The number of groups is the number of arrays you will end up with, so iterating over the groups does not multiply the work the way nested _.each(_.each(_.each calls do, which can be quite expensive.
Again, this should be tested.
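One thing to watch in the grouping approach is the _.find call inside the reduce, which scans participants once per group. A possible variation is to index the names by id first. The sketch below is written in plain JavaScript so it runs without Underscore; groupAnswers is a hypothetical name, and the grouping is done by hand as a stand-in for _.groupBy and _.pluck, not as a description of how Underscore implements them.

```javascript
// Hypothetical helper; plain-JS stand-in for the groupBy/pluck approach.
function groupAnswers(participants, results) {
  // Index names once so no per-group search over participants is needed.
  var nameById = Object.create(null);
  participants.forEach(function (p) {
    nameById[p.id] = p.name;
  });

  // Group answers by participantId, remembering first-seen order.
  var answersById = Object.create(null);
  var order = [];
  results.forEach(function (questionResults) {
    questionResults.forEach(function (r) {
      if (!answersById[r.participantId]) {
        answersById[r.participantId] = [];
        order.push(r.participantId);
      }
      answersById[r.participantId].push(r.answer);
    });
  });

  // Build one row per group: the name first, then that group's answers.
  return order.map(function (id) {
    return [nameById[id]].concat(answersById[id]);
  });
}

var people = [
  { id: 1, name: "abe" },
  { id: 2, name: "joe" }
];
var answers = [
  [
    { question: 6, participantId: 1, answer: "test1" },
    { question: 6, participantId: 2, answer: "test2" }
  ],
  [
    { question: 7, participantId: 1, answer: "test1" },
    { question: 7, participantId: 2, answer: "test2" }
  ]
];

console.log(JSON.stringify(groupAnswers(people, answers)));
// prints [["abe","test1","test1"],["joe","test2","test2"]]
```

As with the grouping approach it is based on, a participant with no results produces no row here.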