简体   繁体   中英

What is the relation between merges and number of items in a in k-way merge

The question is: In a k-way merge, how many merge operation will we perform. For example: 2-way merge:2 nodes 1 merge; 3 nodes 2 merge; 4 nodes 3 merge. So we get M(n)=n-1.

What the the M(n) when k is arbitrary?

2-way merges are most efficient when merging equal-sized blocks, so the most efficient k -way merge based on 2-way merges is to first merge block 1 with block 2, block 3 with block 4, and so on, then merge the first two resulting blocks, and so on. This is basically how mergesort works, and results in O( kn log k ) time, assuming each of the k blocks contains n items. But it's only perfectly efficient if all blocks have exactly n items, and k is a power of 2, so...

Instead of performing k separate merge passes, you can use a single pass that uses a heap containing the first item of each block (ie k items in total):

  1. Read the lowest item from the heap (O(log k ) time)
  2. Write it out
  3. Remove it from the heap
  4. If the block that that item came from is not yet exhausted, place the next item from it into the heap (O(log k ) time again).
  5. Repeat until the heap is empty.

If there are a total of kn items, this always takes O( kn log k ) time regardless of how they are distributed amongst blocks, and regardless of whether k is a power of 2. Your heap needs to contain (item, block_index) pairs so that you can identify which block each item comes from.

OK, to answer the original question as stated:

To merge k blocks using a sequence of 2-way merges always requires exactly k - 1 merges, since regardless of what pair of blocks you choose to merge at any point in time, merging them reduces the total number of blocks by 1.

As I said in my original answer, which pairs of blocks you choose to merge does impact the overall time complexity -- it's better to merge similar-sized blocks -- but it doesn't affect the number of 2-way merge operations.

yes, The heap way may be more effective. But what's the answer to the orginal question? I found there may be no answer about that since it is maybe not a full k-way tree, so 4-way could regress to 3-way, 2-way.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM