
Merging n sorted arrays in JavaScript

I have n (n between 1 and 100) sorted number arrays, each with m elements (m around 1000 in my case). I want to merge them into a single sorted array.

I can think of two possibilities for doing this:

1. Use a two-array merging algorithm (like the merge() function below, from http://www.nczonline.net/blog/2012/10/02/computer-science-and-javascript-merge-sort/ ) and apply it iteratively (merge the 1st and 2nd arrays, then merge that result with the 3rd, etc.)

function merge(left, right) {
  var result = [],
      il = 0,
      ir = 0;
  while (il < left.length && ir < right.length) {
    if (left[il] < right[ir]) {
      result.push(left[il++]);
    } else {
      result.push(right[ir++]);
    }
  }
  return result.concat(left.slice(il)).concat(right.slice(ir));
}

2. Generalize the merge() function to n arrays simultaneously. At each iteration, I would pick the minimum of the n first values not yet processed and append it to the result.
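For illustration, the second option could be sketched roughly as follows. The names mergeByScan and heads are made up for this sketch and do not come from the post; it assumes every input array is already sorted in ascending order. Each output element costs one scan over all n array heads.

```javascript
// Sketch of option 2: merge all arrays at once, picking the smallest
// unprocessed head with a linear scan over the arrays.
// mergeByScan and heads are illustrative names, not from the original post.
function mergeByScan(arrays) {
  const heads = arrays.map(() => 0); // index of the next unread element per array
  const result = [];
  for (;;) {
    let best = -1; // index of the array whose head is currently the smallest
    for (let j = 0; j < arrays.length; j++) {
      const hasNext = heads[j] < arrays[j].length;
      if (hasNext && (best === -1 || arrays[j][heads[j]] < arrays[best][heads[best]])) {
        best = j;
      }
    }
    if (best === -1) {
      return result; // every array is exhausted
    }
    result.push(arrays[best][heads[best]++]);
  }
}

console.log(mergeByScan([[1, 8], [4, 14], [2, 5]]));
```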

Are these two algorithms equivalent in terms of complexity? I have the feeling that both are in O(m*n). Am I right?

Are there any performance considerations for choosing one algorithm over the other? I have the feeling that 1 is simpler than 2.

Merge the n arrays using a priority queue (based on a binary heap, for example).
The overall element count is m*n and each queue operation costs O(log n), so the algorithm's complexity is O(m * n * log(n)). Algorithm sketch:

Add the indices 1..n to the priority queue, using the first element of each
array as the sorting key
(you may also store pairs of (first element, array index)).
At every step:
  J = pop_minimum
  add the current head of the Jth array to the result
  move the head of the Jth array to the right
  if the Jth array is not exhausted, insert J into the queue (with the new sorting key)
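JavaScript has no built-in priority queue, so the steps above need a small heap. The following is a minimal sketch, not the answerer's code: mergeWithHeap and its inner push/pop helpers are hypothetical names, and the heap stores [value, arrayIndex, elementIndex] entries ordered by value.

```javascript
// Sketch of a k-way merge driven by a hand-written binary min-heap.
// Heap entries are [value, arrayIndex, elementIndex], ordered by value.
function mergeWithHeap(arrays) {
  const heap = [];
  const less = (a, b) => a[0] < b[0];

  function push(entry) {
    heap.push(entry);
    let i = heap.length - 1;
    // Sift the new entry up until its parent is not larger.
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (!less(heap[i], heap[parent])) break;
      [heap[i], heap[parent]] = [heap[parent], heap[i]];
      i = parent;
    }
  }

  function pop() {
    const top = heap[0];
    const last = heap.pop();
    if (heap.length > 0) {
      heap[0] = last;
      let i = 0;
      // Sift the moved entry down until both children are not smaller.
      for (;;) {
        const l = 2 * i + 1;
        const r = l + 1;
        let smallest = i;
        if (l < heap.length && less(heap[l], heap[smallest])) smallest = l;
        if (r < heap.length && less(heap[r], heap[smallest])) smallest = r;
        if (smallest === i) break;
        [heap[i], heap[smallest]] = [heap[smallest], heap[i]];
        i = smallest;
      }
    }
    return top;
  }

  // Seed the queue with the first element of every non-empty array.
  arrays.forEach((arr, j) => {
    if (arr.length > 0) push([arr[0], j, 0]);
  });

  const result = [];
  while (heap.length > 0) {
    const [value, j, k] = pop();
    result.push(value);
    // If array j is not exhausted, re-insert it keyed by its next element.
    if (k + 1 < arrays[j].length) push([arrays[j][k + 1], j, k + 1]);
  }
  return result;
}

console.log(mergeWithHeap([[1, 8], [4, 14], [2, 5], [3, 7]]));
```

The heap never holds more than n entries, which is where the log(n) factor per output element comes from.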

The 1st algorithm's complexity is

2*m + 3*m + 4*m + ... + n*m = m * (n*(n+1)/2 - 1) = O(n^2 * m)
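As a quick sanity check, the sum 2*m + 3*m + ... + n*m can be computed directly and compared with the closed form m * (n*(n+1)/2 - 1). The function names below are made up for this sketch; the cost model simply assumes that merging the k-th array into the accumulated result writes k*m elements.

```javascript
// Counts the elements written when n arrays of m elements each are merged
// one after another (1st with 2nd, that result with the 3rd, and so on).
// iterativeMergeCost and closedForm are illustrative names.
function iterativeMergeCost(n, m) {
  let cost = 0;
  for (let k = 2; k <= n; k++) {
    cost += k * m; // merging the k-th array produces a result of k*m elements
  }
  return cost;
}

// Closed form of the same sum: m * (2 + 3 + ... + n).
function closedForm(n, m) {
  return m * (n * (n + 1) / 2 - 1);
}

console.log(iterativeMergeCost(100, 1000), closedForm(100, 1000)); // 5049000 5049000
```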

That's an old question, but for the sake of posterity:

In the question's notation (n arrays of m elements each, so n*m elements in total), both algorithms are O(n^2 * m): each of the n*m output elements costs work proportional to the number of arrays. In algorithm 1, you have to re-merge the accumulated result for every additional array. In algorithm 2, you do just one big merge, but picking out the minimum from the heads of n arrays with a linear scan is still O(n) per element.

What I did instead was implement a modified version of merge sort to get O(m * n * log(n)).

The code is on GitHub at https://github.com/jairemix/merge-sorted if anyone needs it.

Here's how it works

The idea is to modify algorithm 1 and merge the arrays pairwise instead of one after another.

So in the first iteration you would merge array1 with array2, array3 with array4, etc.

Then in the second iteration, you would merge array1+array2 with array3+array4, array5+array6 with array7+array8, etc.

For example:

// starting with:
[1, 8], [4, 14], [2, 5], [3, 7], [0, 6], [10, 12], [9, 15], [11, 13]

  // after iteration 1:
  [1, 4, 8, 14],  [2, 3, 5, 7],   [0, 6, 10, 12],   [9, 11, 13, 15]

    // after iteration 2
    [1, 2, 3, 4, 5, 7, 8, 14],      [0, 6, 9, 10, 11, 12, 13, 15]

        // after iteration 3
        [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

In JS:

function mergeSortedArrays(arrays) {

  // while there are still unmerged arrays
  while (arrays.length > 1) {
    const result = [];

    // merge arrays in pairs
    for (let i = 0; i < arrays.length; i += 2) {
      const a1 = arrays[i];
      const a2 = arrays[i + 1];

      // a2 can be undefined if arrays.length is odd, so let's do a check
      const mergedPair = a2 ? merge2SortedArrays(a1, a2) : a1;
      result.push(mergedPair);
    }

    arrays = result;
  }

  // handle the case where no arrays were passed in
  return arrays.length === 1 ? arrays[0] : [];

}
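mergeSortedArrays calls merge2SortedArrays, which is not shown in the post. A minimal sketch of such a helper, assuming it behaves like the two-array merge() from the question (the actual implementation in the linked repository may differ):

```javascript
// Hypothetical stand-in for the merge2SortedArrays helper used above.
// Merges two ascending arrays into a new ascending array in O(a1.length + a2.length).
function merge2SortedArrays(a1, a2) {
  const result = [];
  let i = 0;
  let j = 0;
  // Take the smaller head until one of the inputs runs out.
  while (i < a1.length && j < a2.length) {
    result.push(a1[i] <= a2[j] ? a1[i++] : a2[j++]);
  }
  // Copy whatever is left of either input.
  while (i < a1.length) result.push(a1[i++]);
  while (j < a2.length) result.push(a2[j++]);
  return result;
}

console.log(merge2SortedArrays([1, 8], [4, 14]));
```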

Notice the similarity to merge sort. In fact, with n arrays of m elements each, merge sort is just the special case m = 1: you start further back, with n presorted arrays of 1 item each. Hence the O(n log n) complexity of merge sort.
