Javascript: how to merge large arrays of objects efficiently

I have two arrays like this:

let a = [{id: 1, content: 10},{id: 2, content: 20},{id: 3, content: 30}]
let b = [{id: 1, content: 11},{id: 2, content: 21}]

where a is actually huge (~100k objects) and b is about 50 objects, and all the ids of the objects in b are found in a. I want to merge these arrays such that if an object in a and an object in b share the same id, the object in a gets replaced by the object in b. So {id: 1, content: 11} from b would replace {id: 1, content: 10} from a. The output in this example would be

[{id: 1, content: 11},{id: 2, content: 21},{id: 3, content: 30}]

where the first two objects got replaced, but the third didn't because it wasn't in b.

What I Tried:

let func = (a,b) => {
    let aa = a.reduce((acc,cur) => ({
        ...acc,
        [cur.id]: cur
    }), {})
    let bb = b.reduce((acc,cur) => ({
        ...acc,
        [cur.id]: cur
    }), {})
    let merged = {...aa,...bb}
    return Object.keys(merged).map(key => merged[key])
}

So I convert each array (a and b) into an object (aa and bb), indexed by its id, then merge the two objects, then convert the result back to an array.

The Problem:

The line let aa = a.reduce... takes a very long time.

Is there a more efficient approach to this problem?

From my above comment...

"... The performance killer is ... making excessive usage of rest parameter, spread syntax and destructuring assignment for each single iteration step. I suggest keeping it as simple as possible, thus a lookup based approach with references only, since from the OP's example code there is no need for merging. It is more an updating replacement of existing item references."
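To make the cost concrete: spreading ...acc inside the reducer copies every key collected so far on each step, so building the index over n items performs on the order of n² key copies. Below is a minimal sketch of the same index built by mutating one accumulator object instead. The function names indexBySpread and indexByMutation are made up for illustration and do not appear in the question.

```javascript
// Quadratic: every step copies all previously collected keys into a new object.
function indexBySpread(items) {
  return items.reduce((acc, cur) => ({ ...acc, [cur.id]: cur }), {});
}

// Linear: one accumulator object, one property write per item.
function indexByMutation(items) {
  return items.reduce((acc, cur) => {
    acc[cur.id] = cur;
    return acc;
  }, {});
}

const sample = [
  { id: 1, content: 10 },
  { id: 2, content: 20 },
  { id: 3, content: 30 },
];

// Both calls build the same id-indexed object; only the cost differs.
console.log(indexBySpread(sample));
console.log(indexByMutation(sample));
```

With only three items the difference is invisible; it only shows up at sizes like the ~100k objects from the question.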

And because of that I would choose a lookup based approach where one would...

  1. create an object based lookup from the smaller source data-structure based on an item's id and

  2. finally map the larger target structure by looking up whether to update/replace the current item based on such an item's id.

Thus, to create the lookup one fully iterates the smaller array exactly once, and to create the updated structure one fully iterates the larger array, also exactly once, with no additional cost other than the object based lookup, which contributes close to nothing.

And everything would be based on references instead of shallow copies or structured clones.

In addition one would implement both tasks as function statements in order to take a little more advantage of the JIT compiler's runtime optimization.

Further code based optimization could be done but should depend on the measured performance of the implementation provided here.

function aggregateIdBasedLookup(lookup, item) {
    lookup[item.id] = item;
    return lookup;
}
function updateItemFromBoundLookup(item) {
    return this[item.id] ?? item;
}

const largeTargetStructure = [
    { id: 1, content: 10 },
    { id: 2, content: 20 },
    { id: 3, content: 30 },
];
const smallerSourceStructure = [
    { id: 1, content: 11 },
    { id: 2, content: 21 },
];

const sourceLookup = smallerSourceStructure
    .reduce(aggregateIdBasedLookup, Object.create(null));

const largeUpdatedStructure = largeTargetStructure
    .map(updateItemFromBoundLookup, sourceLookup);

console.log({
    // largeTargetStructure,
    // smallerSourceStructure,
    // sourceLookup,
    largeUpdatedStructure,
});
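As a variation on the same idea, here is a sketch that uses a Map for the lookup, which also keeps numeric and string ids distinct. The helper name replaceById and the synthetic test data are assumptions for illustration and are not part of the answer above.

```javascript
// Hypothetical helper: returns a new array in which each item of `target`
// is replaced by the item of `source` that has the same id, if there is one.
function replaceById(target, source) {
  const lookup = new Map();
  for (const item of source) {
    lookup.set(item.id, item);
  }
  // One pass over the large array; unmatched items are kept by reference.
  return target.map(item => lookup.get(item.id) ?? item);
}

// Synthetic data roughly matching the sizes mentioned in the question.
const big = Array.from({ length: 100000 }, (_, i) => ({ id: i + 1, content: (i + 1) * 10 }));
const small = Array.from({ length: 50 }, (_, i) => ({ id: i + 1, content: (i + 1) * 10 + 1 }));

console.time('replaceById');
const updated = replaceById(big, small);
console.timeEnd('replaceById');

console.log(updated.slice(0, 3), updated.length);
```

Timings will vary by machine and engine, so the console.time output is only a rough indication.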

You're going to have to iterate over the smaller array no matter what method you use. findIndex() returns the index of the first matching element and stops iterating through the larger array at that point. So, it seems like the simplest approach is:

let a = [{id: 1, content: 10},{id: 2, content: 20},{id: 3, content: 30}]
let b = [{id: 1, content: 11},{id: 2, content: 21}]

for (let i = 0; i < b.length; i++) {
    let thisID = b[i].id;
    let thisContent = b[i].content;
    a[a.findIndex(x => x.id === thisID)].content = thisContent;
}

console.log(a)
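One caveat: findIndex() returns -1 when nothing matches, and assigning to a[-1].content then throws a TypeError. The question states that every id in b exists in a, but if that cannot be relied on, a guarded sketch might look like the following. The function name updateContentInPlace and the extra id 99 entry are made up for illustration.

```javascript
// Hypothetical helper: updates `content` in place for every id found in `target`,
// and silently skips source items whose id has no match.
function updateContentInPlace(target, source) {
  for (const { id, content } of source) {
    const index = target.findIndex(x => x.id === id);
    if (index !== -1) {
      target[index].content = content;
    }
  }
  return target;
}

const target = [{ id: 1, content: 10 }, { id: 2, content: 20 }, { id: 3, content: 30 }];
// The last entry has an id that does not exist in `target`.
const source = [{ id: 1, content: 11 }, { id: 2, content: 21 }, { id: 99, content: 0 }];

console.log(updateContentInPlace(target, source));
```

Like the loop above, this mutates the existing objects instead of swapping in the objects from the smaller array.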
