Get difference between two arrays (including duplicates)

I see a lot of posts about how to get the difference and symmetric difference of arrays in JavaScript, but I haven't found anything on how to find the difference including duplicates.

For example:

let original = [1];
let updated = [1, 1, 2];

difference(updated, original);
// Expect: [1, 2]

Is there an elegant way to do this? I'm open to solutions using plain JavaScript or lodash.

Thanks!

UPDATE

To clarify, any number of duplicates should be supported. Another example:

let original = [1, 1];
let updated = [1, 1, 1, 1, 1, 2];

difference(updated, original);
// Expect: [1, 1, 1, 2]

UPDATE 2

I realized that there may be some confusion about the original requirements. It is true that any number of duplicates should be supported, but the order of the elements should not affect the output.

Example:

let original = [1, 1, 2];
let updated = [1, 2, 1, 1, 1];

difference(updated, original);
// Expect: [1, 1]

I would suggest this solution, which avoids a time complexity of O(n²):

function difference(a, b) {
    return [...b.reduce(
        (acc, v) => acc.set(v, (acc.get(v) || 0) - 1),
        a.reduce(
            (acc, v) => acc.set(v, (acc.get(v) || 0) + 1),
            new Map()
        )
    )].reduce(
        (acc, [v, count]) => acc.concat(Array(Math.abs(count)).fill(v)),
        []
    );
}

let original = [1, 1];
let updated = [1, 1, 1, 1, 1, 2];
let res = difference(updated, original);
console.log(res);

Explanation

This solution creates a Map with a key for every distinct value of the first array (a), whose value is the number of occurrences of that key. Then b is added to that Map in the same way, except that its occurrences count negatively. If a count ends up being zero, that key does not appear in the final result. In general, the number of occurrences of each key in the final result is the absolute value of its count in the Map.
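To make the signed counts concrete, here is a minimal sketch that builds only the Map described above and prints its entries. The helper name signedCounts is hypothetical and not part of the answer's code.

```javascript
// Hypothetical helper (not in the original answer): returns the Map of signed counts.
function signedCounts(a, b) {
  const counts = new Map();
  for (const v of a) counts.set(v, (counts.get(v) || 0) + 1); // values of a count up
  for (const v of b) counts.set(v, (counts.get(v) || 0) - 1); // values of b count down
  return counts;
}

// With a shorter first array, the counts come out negative.
console.log([...signedCounts([1], [1, 1, 2])]); // [[1, -1], [2, -1]]
```

Because the final step takes the absolute value of each count, swapping the two arguments only flips the signs in this Map and leaves the result unchanged.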

Details

The code starts with:

new Map()

This is the initial value of the accumulator of the inner reduce. That reduce iterates over a and updates the count of the corresponding key in the Map. The final result of this reduce is thus a Map.

This Map then becomes the initial accumulator value for the outer reduce. That reduce iterates over b and decreases the count in the Map.

This updated Map is spread into an array with the spread syntax. This array consists of 2-element sub-arrays, which are key/value pairs. Note that the value in this case is a count, which could be positive, zero or negative.

This array is then iterated with the final reduce. Each count is used to create an array of that many elements (in absolute value) of the corresponding value. All of these are concatenated into one array, which is the return value of the function.
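The stages described above can also be written as plain loops with a named intermediate for each step. This is only a sketch for reading along; the name differenceUnrolled is hypothetical, and the logic is meant to mirror the reduce chain rather than replace it.

```javascript
// Hypothetical name; mirrors the three reduce stages as explicit loops.
function differenceUnrolled(a, b) {
  // Stage 1: count occurrences in a (inner reduce).
  const counts = new Map();
  for (const v of a) counts.set(v, (counts.get(v) || 0) + 1);

  // Stage 2: subtract occurrences in b (outer reduce).
  for (const v of b) counts.set(v, (counts.get(v) || 0) - 1);

  // Stage 3: spread the Map into [value, count] pairs and expand each pair.
  const pairs = [...counts];
  const result = [];
  for (const [v, count] of pairs) {
    for (let i = 0; i < Math.abs(count); i++) result.push(v);
  }
  return result;
}

console.log(differenceUnrolled([1, 1, 1, 1, 1, 2], [1, 1])); // [1, 1, 1, 2]
console.log(differenceUnrolled([1, 2, 1, 1, 1], [1, 1, 2])); // [1, 1]
```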

Follow-up Question

In comments you explained that you actually needed something different, where the two arrays do not play the same role. The first array should be returned, but with the elements of the second array removed from it.

You could use this code for that:

function difference2(a, b) {
    return a.filter(function(v) {
        return !this.get(v) || !this.set(v, this.get(v) - 1);
    }, b.reduce(
        (acc, v) => acc.set(v, (acc.get(v) || 0) + 1),
        new Map()
    ));
}

let original = [1, 1, 2];
let updated = [1, 1];
let res = difference2(original, updated);
console.log(res);
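The filter callback above relies on two details: the Map of counts is passed as the second argument of filter (so it becomes this inside the callback), and Map.prototype.set returns the Map itself, which is always truthy. A sketch of the same one-sided difference with those steps spelled out follows; the name differenceOneSided is hypothetical.

```javascript
// Hypothetical name; intended to behave like a filter with a Map passed as thisArg.
function differenceOneSided(a, b) {
  // Count how many times each value of b may still cancel a value of a.
  const remaining = new Map();
  for (const v of b) remaining.set(v, (remaining.get(v) || 0) + 1);

  return a.filter((v) => {
    const n = remaining.get(v) || 0;
    if (n === 0) return true;  // nothing left to cancel: keep v
    remaining.set(v, n - 1);   // cancel one occurrence of v
    return false;
  });
}

console.log(differenceOneSided([1, 2, 1, 1, 1], [1, 1, 2])); // [1, 1]
console.log(differenceOneSided([1, 1], [1, 1, 2]));          // []
```

Unlike the symmetric version, the second call returns an empty array: values that occur only in the second array are ignored.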

function count(n, arr) {
    return arr.filter(a => a == n).length;
}

function diffBetween(arr, arr2) {
    // declared locally so the function does not create global variables
    const diff = [];
    new Set(arr.concat(arr2)).forEach(a => {
        for (let x = 0; x < Math.abs(count(a, arr) - count(a, arr2)); x++)
            diff.push(a);
    });
    return diff;
}

console.log(diffBetween([1], [1, 1, 2]));
console.log(diffBetween([1, 1], [1, 1, 1, 1, 1, 2]));
console.log(diffBetween([1, 1, 3, 4], [1, 2, 3]));

How does this work for you?

EDIT:

// trincot's code
function difference(a, b) {
    return [...b.reduce(
        (acc, v) => acc.set(v, (acc.get(v) || 0) - 1),
        a.reduce(
            (acc, v) => acc.set(v, (acc.get(v) || 0) + 1),
            new Map()
        )
    )].reduce(
        (acc, [v, count]) => acc.concat(Array(Math.abs(count)).fill(v)),
        []
    );
}

// My code
function count(n, arr) {
    return arr.filter(a => a == n).length;
}

// My code
function diffBetween(arr, arr2) {
    const diff = [];
    new Set(arr.concat(arr2)).forEach(a => {
        for (let x = 0; x < Math.abs(count(a, arr) - count(a, arr2)); x++)
            diff.push(a);
    });
    return diff;
}

const in1 = [1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,2];
const in2 = [1,2,3,4,5,6,1,2,3,4,5,6,7,1,1,1,1,1,1,2,2,2,2,2,2,2];

let start = (new Date).getTime();
let a = difference(in1, in2);
let end = (new Date).getTime();
console.log("trincot done", end - start, "msec");

start = (new Date).getTime();
a = diffBetween(in1, in2);
end = (new Date).getTime();
console.log("stardust done", end - start, "msec");

@trincot's solution above is consistently faster in my testing, so it is clearly the better choice for large enough datasets.
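A single millisecond-resolution measurement on a few hundred elements can be noisy, so a comparison like this is easier to read with larger inputs and repeated runs. The following is only a sketch under assumed test data: the function names, the timeIt helper and the array sizes are all hypothetical, and the timings it prints will vary by machine.

```javascript
// Counting approach: one pass per array, using a Map of signed counts.
function countingDifference(a, b) {
  const counts = new Map();
  for (const v of a) counts.set(v, (counts.get(v) || 0) + 1);
  for (const v of b) counts.set(v, (counts.get(v) || 0) - 1);
  const out = [];
  for (const [v, c] of counts) {
    for (let i = 0; i < Math.abs(c); i++) out.push(v);
  }
  return out;
}

// Filter-based approach: rescans both arrays for every distinct value.
function filterDifference(a, b) {
  const count = (n, arr) => arr.filter((x) => x === n).length;
  const out = [];
  new Set(a.concat(b)).forEach((v) => {
    const times = Math.abs(count(v, a) - count(v, b));
    for (let i = 0; i < times; i++) out.push(v);
  });
  return out;
}

// Hypothetical timing helper: runs fn several times and returns the best time in ms.
function timeIt(fn, runs = 3) {
  let best = Infinity;
  for (let i = 0; i < runs; i++) {
    const start = Date.now();
    fn();
    best = Math.min(best, Date.now() - start);
  }
  return best;
}

// Assumed test data: many distinct values, so the repeated rescans become expensive.
const big1 = Array.from({ length: 2000 }, (_, i) => i % 500);
const big2 = Array.from({ length: 1200 }, (_, i) => i % 750);

console.log("counting:", timeIt(() => countingDifference(big1, big2)), "msec");
console.log("filter:  ", timeIt(() => filterDifference(big1, big2)), "msec");
```

The number of distinct values matters here: the filter-based version does work proportional to (distinct values) × (total length), while the counting version stays proportional to the total length.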

So, I'd:

  • Iterate over the updated array. For each element, check whether it is present in the original array. If it is, remove it from the original array (note: the function below works on a copy of the original array, so the caller's array is not affected); otherwise, push the element to the differences array. At the end, return the differences array.

This code is written to work in a wide range of browsers, so I didn't use Array.prototype.indexOf or other newer ECMAScript methods.

function difference(updated, original) {
  var i, b, found, upd;
  /* copy original array */
  var degradation = [];
  for (i = 0; i < original.length; ++i)
    degradation[i] = original[i];

  var diff = [];
  /* only the items of updated are examined */
  for (i = 0; i < updated.length; ++i) {
    upd = updated[i];
    /* find updated item in degradation */
    for (b = 0, found = false; b < degradation.length; ++b) {
      /* the "b in degradation" check skips slots already removed by delete */
      if (b in degradation && degradation[b] === upd) {
        /* remove item from degradation */
        delete degradation[b];
        found = true;
        break;
      }
    }
    if (!found)
      diff.push(upd);
  }
  return diff;
}
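
Where Array.prototype.indexOf is available, the same idea (consume one matching item of the original per item of the updated array) can be sketched more compactly. This is a hedged variant rather than the answer's code, and the name differenceWithIndexOf is hypothetical.

```javascript
// Hypothetical name; assumes Array.prototype.indexOf and splice are available (ES5+).
function differenceWithIndexOf(updated, original) {
  var remaining = original.slice(); // copy, so the caller's array is untouched
  var diff = [];
  for (var i = 0; i < updated.length; i++) {
    var at = remaining.indexOf(updated[i]);
    if (at === -1) {
      diff.push(updated[i]);   // no unmatched copy left in original
    } else {
      remaining.splice(at, 1); // consume one matching item
    }
  }
  return diff;
}

console.log(differenceWithIndexOf([1, 1, 2], [1]));             // [1, 2]
console.log(differenceWithIndexOf([1, 2, 1, 1, 1], [1, 1, 2])); // [1, 1]
```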
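
Array.prototype.Diff = function (secondArray) {
    var mergedArray = this.concat(secondArray);
    var mergedString = mergedArray.toString();
    var finalArray = [];

    for (var i = 0; i < mergedArray.length; i++) {
        // Keep a value only the first time it is still found in the merged string...
        if (mergedString.match(mergedArray[i])) {
            finalArray.push(mergedArray[i]);
            // ...then strip every occurrence of it from the string.
            mergedString = mergedString.replace(new RegExp(mergedArray[i], "g"), "");
        }
    }
    // Note: matching is textual, so each distinct value is returned once
    // (duplicate counts are not preserved) and values such as 1 and 11 can collide.
    return finalArray;
};

var original = [1];
var updated = [1, 1, 2];

console.log(original.Diff(updated));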

I like the prototype way. You can save the prototype function above in a JS file and import it in any page you want; it can then be used as a built-in method on the object (Array in this case).
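For comparison, a method-style helper that keeps duplicate counts could be sketched as follows. This is an assumption-laden variant, not the code from this answer: the method name diffWithDuplicates is hypothetical, it returns the items of the receiver that have no unmatched counterpart in the other array, and it is defined as non-enumerable so that for...in loops over arrays do not pick it up.

```javascript
// Hypothetical method name; non-enumerable so for...in loops over arrays skip it.
Object.defineProperty(Array.prototype, "diffWithDuplicates", {
  value: function (other) {
    var remaining = other.slice(); // copy of the other array; items are consumed from it
    return this.filter(function (v) {
      var at = remaining.indexOf(v);
      if (at === -1) return true;  // no unmatched copy in other: keep v
      remaining.splice(at, 1);     // cancel one matching item
      return false;
    });
  },
  writable: true,
  configurable: true
});

console.log([1, 1, 1, 1, 1, 2].diffWithDuplicates([1, 1])); // [1, 1, 1, 2]
```

Extending built-in prototypes affects every array on the page, so a plain function with the same body may be the safer choice in shared code.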

You might do as follows:

var original = [1, 1, 1, 1, 2],
    updated = [1, 2, 1, 1, 3],
    result = (...a) => {
        var [shorter, longer] = [a[0], a[1]].sort((a, b) => a.length - b.length),
            s = shorter.slice();
        return shorter.reduce((p, c) => {
            var fil = p.indexOf(c),
                fis = s.indexOf(c);
            // when c is found in both, remove one copy from each side
            fil !== -1 && (p.splice(fil, 1), s.splice(fis, 1));
            return p;
        }, longer.slice()) // start from a copy so the input array is not modified
        .concat(s);
    };
console.log(result(updated, original));

You can do it with the following steps, in O(n) time.

Let a and b be two arrays.

Step 1. Create a map hash_map with each value of array a as a key and the number of occurrences of that value as its value.

Step 2. Push into result all the elements of array b that are not in a, using hash_map.

Step 3. Push into result all the elements of array a that are not in b, using hash_map.

Here is the complete code:

function diff(a, b) {
    // Step 1 starts here
    // A Map keeps keys in their original type; plain object keys would become strings.
    var hash_map = a.reduce(function(map, key) {
        map.set(key, (map.get(key) || 0) + 1);
        return map;
    }, new Map());
    // Step 1 ends here

    // Step 2 starts here
    var result = b.filter(function(val) {
        if (hash_map.get(val)) {
            hash_map.set(val, hash_map.get(val) - 1);
            return false;
        }
        return true;
    });
    // Step 2 ends here

    // Step 3 starts here
    hash_map.forEach(function(count, key) {
        while (count > 0) {
            result.push(key);
            count = count - 1;
        }
    });
    // Step 3 ends here
    return result;
}

console.log(diff([1], [1, 1, 2]));
console.log(diff([1, 1, 1], [1, 1, 1, 1, 1, 2]));
console.log(diff([1, 1, 3, 4], [1, 2, 3]));
console.log(diff([1, 1, 1, 1, 1, 2], [1, 2, 1, 1, 3]));
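When counts are keyed by array values, it matters whether the container is a plain object or a Map: object property names are always strings, so numbers read back through Object.keys come out as strings, while a Map keeps each key in its original type. A small sketch of the difference, with hypothetical variable names:

```javascript
// Counting with a plain object: keys are coerced to strings.
var objCounts = {};
[1, 1, 4].forEach(function (v) {
  objCounts[v] = (objCounts[v] || 0) + 1;
});
console.log(Object.keys(objCounts)); // ["1", "4"]

// Counting with a Map: keys keep their original type.
var mapCounts = new Map();
[1, 1, 4].forEach(function (v) {
  mapCounts.set(v, (mapCounts.get(v) || 0) + 1);
});
console.log(Array.from(mapCounts.keys())); // [1, 4]
```

This is relevant whenever leftover keys are pushed back into a result array, since the caller usually expects the original values rather than their string forms.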
