
How can I perform an inner join with two object arrays in JavaScript?

I have two object arrays:

var a = [
  {id: 4, name: 'Greg'},
  {id: 1, name: 'David'},
  {id: 2, name: 'John'},
  {id: 3, name: 'Matt'},
]

var b = [
  {id: 5, name: 'Mathew', position: '1'},
  {id: 6, name: 'Gracia', position: '2'},
  {id: 2, name: 'John', position: '2'},
  {id: 3, name: 'Matt', position: '2'},
]

I want to do an inner join for these two arrays a and b, and create a third array like this (if the position property is not present, then it becomes null):

var result = [
  {id: 4, name: 'Greg', position: null},
  {id: 1, name: 'David', position: null},
  {id: 5, name: 'Mathew', position: '1'},
  {id: 6, name: 'Gracia', position: '2'},
  {id: 2, name: 'John', position: '2'},
  {id: 3, name: 'Matt', position: '2'},
]

My approach:

function innerJoinAB(a,b) {
    a.forEach(function(obj, index) {
        // Search through objects in first loop
        b.forEach(function(obj2,i2){
        // Find objects in 2nd loop
        // if obj1 is present in obj2 then push to result.
        });
    });
}

But the time complexity is O(N^2). How can I do it in O(N)? My friend told me that we can use reducers and Object.assign.

I'm not able to figure this out. Please help.

I don't know how reduce would help here, but you could use a Map to accomplish the same task in O(n):

const a = [
  {id: 4, name: 'Greg'},
  {id: 1, name: 'David'},
  {id: 2, name: 'John'},
  {id: 3, name: 'Matt'}
];

const b = [
  {id: 5, name: 'Mathew', position: '1'},
  {id: 6, name: 'Gracia', position: '2'},
  {id: 2, name: 'John', position: '2'},
  {id: 3, name: 'Matt', position: '2'}
];

var m = new Map();

// Insert all entries keyed by ID into the Map, filling in placeholder
// 'position' since the Array 'a' lacks 'position' entirely:
a.forEach(function(x) { x.position = null; m.set(x.id, x); });

// For values in 'b', insert them if missing, otherwise, update existing values:
b.forEach(function(x) {
    var existing = m.get(x.id);
    if (existing === undefined)
        m.set(x.id, x);
    else
        Object.assign(existing, x);
});

// Extract resulting combined objects from the Map as an Array
var result = Array.from(m.values());

console.log(JSON.stringify(result));

Because Map accesses and updates are O(1) on average (hash collisions and rehashing can make individual operations slower), the whole join is O(n+m), where n and m are the lengths of a and b respectively. The naive solution you gave would be O(n*m), using the same meaning for n and m.
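The expected result in the question keeps unmatched rows from both arrays, so the code above is closer to a full outer join. For comparison, here is a minimal sketch of a strict inner join built on the same Map idea. The helper name innerJoinById is hypothetical, and the sketch assumes that ids are unique within each array.

```javascript
// Hypothetical helper (not from the answer above): a strict inner join on `id`.
function innerJoinById(left, right) {
  // Index the left array by id in one pass: O(n).
  const byId = new Map(left.map((item) => [item.id, item]));
  const joined = [];
  // One pass over the right array: O(m), with O(1) average lookups.
  for (const item of right) {
    if (byId.has(item.id)) {
      // Merge into a fresh object so that neither input is mutated.
      joined.push({ ...byId.get(item.id), ...item });
    }
  }
  return joined;
}

const a = [
  {id: 4, name: 'Greg'},
  {id: 1, name: 'David'},
  {id: 2, name: 'John'},
  {id: 3, name: 'Matt'}
];

const b = [
  {id: 5, name: 'Mathew', position: '1'},
  {id: 6, name: 'Gracia', position: '2'},
  {id: 2, name: 'John', position: '2'},
  {id: 3, name: 'Matt', position: '2'}
];

const inner = innerJoinById(a, b);
console.log(JSON.stringify(inner));
// [{"id":2,"name":"John","position":"2"},{"id":3,"name":"Matt","position":"2"}]
```

Only John and Matt appear in both arrays, so only those two rows survive. The rows follow the order of the second array.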

Here is one way to solve it:

const a = [
  {id: 4, name: 'Greg'},
  {id: 1, name: 'David'},
  {id: 2, name: 'John'},
  {id: 3, name: 'Matt'},
];

const b = [
  {id: 5, name: 'Mathew', position: '1'},
  {id: 6, name: 'Gracia', position: '2'},
  {id: 2, name: 'John', position: '2'},
  {id: 3, name: 'Matt', position: '2'},
];

// Keep the items of 'a' whose id does not appear in 'b'.
// Note that this nested scan is O(n*m).
const r = a.filter(({ id: idv }) => b.every(({ id: idc }) => idv !== idc));

// Append them to 'b' and fill in a null 'position' where it is missing.
const newArr = b.concat(r).map((v) => v.position ? v : { ...v, position: null });

console.log(JSON.stringify(newArr));

Reducing the time complexity unavoidably costs extra memory.

var a = [
  {id: 4, name: 'Greg'},
  {id: 1, name: 'David'},
  {id: 2, name: 'John'},
  {id: 3, name: 'Matt'},
]

var b = [
  {id: 5, name: 'Mathew', position: '1'},
  {id: 6, name: 'Gracia', position: '2'},
  {id: 2, name: 'John', position: '2'},
  {id: 3, name: 'Matt', position: '2'},
]

var s = new Set();
var result = [];

b.forEach(function(e) {
    result.push(Object.assign({}, e));
    s.add(e.id);
});

a.forEach(function(e) {
    if (!s.has(e.id)) {
        var temp = Object.assign({}, e);
        temp.position = null;
        result.push(temp);
    }
});

console.log(result);

Update

As @Blindman67 mentioned: "You do not reduce the problems complexity by moving a search into the native code." I consulted the ECMAScript® 2016 Language Specification about the internal procedure of Set.prototype.has() and Map.prototype.get(). As written there, both iterate through all the elements they hold. Note, however, that these steps only describe the observable semantics: the same specification requires Set and Map to be implemented with hash tables or other mechanisms that provide, on average, sublinear access times.

Set.prototype.has ( value )

The following steps are taken:

    Let S be the this value.
    If Type(S) is not Object, throw a TypeError exception.
    If S does not have a [[SetData]] internal slot, throw a TypeError exception.
    Let entries be the List that is the value of S's [[SetData]] internal slot.
    Repeat for each e that is an element of entries,
        If e is not empty and SameValueZero(e, value) is true, return true.
    Return false. 

http://www.ecma-international.org/ecma-262/7.0/#sec-set.prototype.has

Map.prototype.get ( key )

The following steps are taken:

    Let M be the this value.
    If Type(M) is not Object, throw a TypeError exception.
    If M does not have a [[MapData]] internal slot, throw a TypeError exception.
    Let entries be the List that is the value of M's [[MapData]] internal slot.
    Repeat for each Record {[[Key]], [[Value]]} p that is an element of entries,
        If p.[[Key]] is not empty and SameValueZero(p.[[Key]], key) is true, return p.[[Value]].
    Return undefined. 

http://www.ecma-international.org/ecma-262/7.0/#sec-map.prototype.get

Perhaps we can use a plain Object, whose properties can be accessed directly by name, like a hash table or associative array. For example:

var a = [
  {id: 4, name: 'Greg'},
  {id: 1, name: 'David'},
  {id: 2, name: 'John'},
  {id: 3, name: 'Matt'},
]

var b = [
  {id: 5, name: 'Mathew', position: '1'},
  {id: 6, name: 'Gracia', position: '2'},
  {id: 2, name: 'John', position: '2'},
  {id: 3, name: 'Matt', position: '2'},
]

var s = {};
var result = [];

b.forEach(function(e) {
    result.push(Object.assign({}, e));
    s[e.id] = true;
});

a.forEach(function(e) {
    if (!s[e.id]) {
        var temp = Object.assign({}, e);
        temp.position = null;
        result.push(temp);
    }
});

console.log(result);
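One caveat with a plain object as the lookup table: property keys are coerced to strings, and an object literal also inherits keys from Object.prototype. The short sketch below illustrates both effects. The variable names are made up for illustration. It matters only if ids can mix numbers and strings or can collide with names such as "constructor".

```javascript
// Plain objects coerce property keys to strings, so 1 and '1' collide.
const seenObject = {};
seenObject[1] = true;
const objectHit = seenObject['1'] === true;

// A Map compares keys with SameValueZero, so 1 and '1' stay distinct.
const seenMap = new Map();
seenMap.set(1, true);
const mapHit = seenMap.has('1');

// An object literal inherits properties from Object.prototype,
// so a lookup such as s['constructor'] is truthy even when nothing was stored.
const plain = {};
const inheritedHit = Boolean(plain['constructor']);

// Object.create(null) produces an object with no prototype, which avoids that.
const bare = Object.create(null);
const bareHit = Boolean(bare['constructor']);

console.log(objectHit, mapHit, inheritedHit, bareHit);
// true false true false
```

With numeric ids like the ones in the question, none of this changes the result.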

You do not reduce the problem's complexity by moving a search into native code. The search must still be done.

Also, having to set an undefined property to null is one of the many reasons I dislike using null.

So without the null, the solution would look like this:

var a = [
  {id: 4, name: 'Greg',position: '7'},
  {id: 1, name: 'David'},
  {id: 2, name: 'John'},
  {id: 3, name: 'Matt'},
]

var b = [
  {id: 5, name: 'Mathew', position: '1'},
  {id: 6, name: 'Gracia', position: '2'},
  {id: 2, name: 'John', position: '2'},
  {id: 3, name: 'Matt', position: '2'},
]


function join (indexName, ...arrays) {
    const map = new Map();
    arrays.forEach((array) => {
        array.forEach((item) => {
            map.set(
                item[indexName],
                Object.assign(item, map.get(item[indexName]))
            );
        })
    })
    return [...map.values()];
}

It is called with:

const joinedArray = join("id", a, b);

Joining with a default is a little more complex, but it should prove handy, as it can join any number of arrays and automatically set missing properties to a provided default.

Testing for the defaults is done after the join to save a little time.

function join (indexName, defaults, ...arrays) {
    const map = new Map();
    arrays.forEach((array) => {
        array.forEach((item) => {
            map.set(
                item[indexName], 
                Object.assign( 
                    item, 
                    map.get(item[indexName])
                )
            );
        })
    })
    return [...map.values()].map(item => Object.assign({}, defaults, item));
}

To use:

const joinedArray = join("id", {position : null}, a, b);
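Note that the join function above writes into the input objects through Object.assign(item, ...). If the inputs need to stay untouched, a non-mutating variant could look like the sketch below. The name joinBy is hypothetical, and letting later arrays win on conflicting properties is an assumption of this sketch, not something stated in the answer.

```javascript
// Hypothetical non-mutating variant; `joinBy` is an illustrative name.
function joinBy(indexName, defaults, ...arrays) {
  const map = new Map();
  for (const array of arrays) {
    for (const item of array) {
      const key = item[indexName];
      // Build a fresh object each time: defaults first, then whatever was
      // merged so far, then the current item. Later arrays win on conflicts
      // (an assumption of this sketch).
      map.set(key, { ...defaults, ...map.get(key), ...item });
    }
  }
  return [...map.values()];
}

const a = [
  {id: 4, name: 'Greg'},
  {id: 1, name: 'David'},
  {id: 2, name: 'John'},
  {id: 3, name: 'Matt'},
];

const b = [
  {id: 5, name: 'Mathew', position: '1'},
  {id: 6, name: 'Gracia', position: '2'},
  {id: 2, name: 'John', position: '2'},
  {id: 3, name: 'Matt', position: '2'},
];

const joined = joinBy('id', {position: null}, a, b);
console.log(joined.map((r) => `${r.id}:${r.name}:${r.position}`).join(', '));
// 4:Greg:null, 1:David:null, 2:John:2, 3:Matt:2, 5:Mathew:1, 6:Gracia:2
```

The rows come out in first-seen order because a Map keeps the insertion position of a key when its value is replaced.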

You could add...

    arrays.shift().forEach((item) => {  // first array is a special case.
        map.set(item[indexName], item);
    });

...at the start of the function to save a little time, but I feel it's more elegant without the extra code.

If you drop the null criteria (many in the community say that using null is bad), then there's a very simple solution:

let a = [1, 2, 3];
let b = [2, 3, 4];

a.filter(x => b.includes(x)) 

// [2, 3]
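The filter/includes one-liner compares primitive values and rescans b for every element of a, so it is O(n*m). A rough sketch of the same intersection for the object arrays from the question, matching on id through a Set, might look like this. The variable names are illustrative only.

```javascript
const a = [
  {id: 4, name: 'Greg'},
  {id: 1, name: 'David'},
  {id: 2, name: 'John'},
  {id: 3, name: 'Matt'},
];

const b = [
  {id: 5, name: 'Mathew', position: '1'},
  {id: 6, name: 'Gracia', position: '2'},
  {id: 2, name: 'John', position: '2'},
  {id: 3, name: 'Matt', position: '2'},
];

// Collect the ids of b once: O(m).
const idsInB = new Set(b.map((item) => item.id));

// Keep the items of a whose id also occurs in b: O(n) with average O(1) lookups.
const common = a.filter((item) => idsInB.has(item.id));

console.log(JSON.stringify(common));
// [{"id":2,"name":"John"},{"id":3,"name":"Matt"}]
```

This only selects the matching rows of a. It does not merge in the extra properties from b.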

Here is an attempt at a more generic version of a join, which accepts N arrays of objects and merges them based on a primary id key.

If performance is critical, you are better off using a specific version like the one provided by ShadowRanger, which doesn't need to dynamically build a list of all property keys.

This implementation assumes that any missing properties should be set to null and that every object in each input array has the same properties (though properties can differ between arrays).

var a = [
  {id: 4, name: 'Greg'},
  {id: 1, name: 'David'},
  {id: 2, name: 'John'},
  {id: 3, name: 'Matt'},
];

var b = [
  {id: 5, name: 'Mathew', position: '1'},
  {id: 600, name: 'Gracia', position: '2'},
  {id: 2, name: 'John', position: '2'},
  {id: 3, name: 'Matt', position: '2'},
];

console.log(genericJoin(a, b));

function genericJoin(...input) {
    // Get all possible keys
    let template = new Set();
    input.forEach(arr => {
        if (arr.length) {
            Object.keys(arr[0]).forEach(key => {
                template.add(key);
            });
        }
    });

    // Merge arrays
    input = input.reduce((a, b) => a.concat(b));

    // Merge items with duplicate ids
    let result = new Map();
    input.forEach(item => {
        result.set(item.id, Object.assign((result.get(item.id) || {}), item));
    });

    // Convert the map back to an array of objects
    // and set any missing properties to null
    return Array.from(result.values(), item => {
        template.forEach(key => {
            // Only fill in properties that are really missing, so that
            // falsy values such as 0 or '' are not overwritten.
            if (item[key] === undefined) {
                item[key] = null;
            }
        });
        return item;
    });
}

Here's a generic O(n*m) solution, where n is the number of records and m is the number of keys. This will only work for valid object keys. You can convert any value to base64 and use that if you need to.

const join = ( keys, ...lists ) =>
    lists.reduce(
        ( res, list ) => {
            list.forEach( ( record ) => {
                let hasNode = keys.reduce(
                    ( idx, key ) => idx && idx[ record[ key ] ],
                    res[ 0 ].tree
                )
                if( hasNode ) {
                    const i = hasNode.i
                    Object.assign( res[ i ].value, record )
                    res[ i ].found++
                } else {
                    let node = keys.reduce( ( idx, key ) => {
                        if( idx[ record[ key ] ] )
                            return idx[ record[ key ] ]
                        else
                            idx[ record[ key ] ] = {}
                        return idx[ record[ key ] ]
                    }, res[ 0 ].tree )
                    node.i = res[ 0 ].i++
                    res[ node.i ] = {
                        found: 1,
                        value: record
                    }
                }
            } )
            return res
        },
        [ { i: 1, tree: {} } ]
         )
         .slice( 1 )
         .filter( node => node.found === lists.length )
         .map( n => n.value )

join( [ 'id', 'name' ], a, b )

This is essentially the same as Blindman67's answer, except that it adds an index object to identify records to join. The records are stored in an array, and the index stores the position of the record for the given key set and the number of lists it has been found in.

Each time the same key set is encountered, the node is found in the tree, the element at its index is updated, and the number of times it has been found is incremented.

Then the idx object is removed from the array with the slice, and any elements that weren't found in every list are removed. This makes it an inner join; you could remove this filter to get a full outer join.

Finally, each element is mapped to its value, and you have the merged array.
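To make the role of the found counter easier to see, here is a simplified single-key sketch of the same counting idea, with a Map standing in for the key tree. The name joinWithCount is hypothetical. Like the code above, it assumes that a key occurs at most once per list, since otherwise the counter would overcount.

```javascript
// Hypothetical single-key simplification of the counting approach.
function joinWithCount(key, ...lists) {
  const index = new Map(); // key value -> { found, value }
  for (const list of lists) {
    for (const record of list) {
      const entry = index.get(record[key]);
      if (entry) {
        // Seen before: merge the record and count one more list.
        Object.assign(entry.value, record);
        entry.found += 1;
      } else {
        // First sighting: store a copy so the inputs stay untouched.
        index.set(record[key], { found: 1, value: { ...record } });
      }
    }
  }
  const entries = [...index.values()];
  return {
    // Inner join: only records that were found in every list.
    inner: entries.filter((e) => e.found === lists.length).map((e) => e.value),
    // Full outer join: every record, with no filter on the counter.
    outer: entries.map((e) => e.value),
  };
}

const a = [
  {id: 4, name: 'Greg'},
  {id: 1, name: 'David'},
  {id: 2, name: 'John'},
  {id: 3, name: 'Matt'},
];

const b = [
  {id: 5, name: 'Mathew', position: '1'},
  {id: 6, name: 'Gracia', position: '2'},
  {id: 2, name: 'John', position: '2'},
  {id: 3, name: 'Matt', position: '2'},
];

const { inner, outer } = joinWithCount('id', a, b);
console.log(inner.map((r) => r.id), outer.length);
// [ 2, 3 ] 6
```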
