
How can I choose between merge-sort and insertion sort?

I need to implement the fastest sorting algorithm to sort a linked list which is created using stdin.

I know merge-sort's time complexity is O(n log n) and insertion-sort's is O(n^2) (n being the number of elements in the linked list).

But the list is built from standard input. So is it still more efficient to run merge-sort on the unsorted list once it is complete, or is it more efficient to build the list with insertion-sort, that is, to keep the list sorted as each element is read?

This is the structure:

#define SIZE 50

struct node {
   int num;
   char name[SIZE];
   struct node* next;
};

These are the sorting criteria:

1. Sorted alphabetically by "name".
2. When the name is the same it's sorted by "num" (from higher to lower).
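
The two criteria above can be folded into a single comparison function that any of the sorting approaches can share. This is only a sketch: the name node_cmp is made up for illustration, and it assumes that "alphabetically" means plain strcmp() byte order (so it is case sensitive and ignores locale).

```c
#include <string.h>

#define SIZE 50

struct node {
   int num;
   char name[SIZE];
   struct node* next;
};

/* Hypothetical helper: returns <0 if a sorts before b, >0 if a sorts
   after b, and 0 if both keys are equal. */
int node_cmp(const struct node* a, const struct node* b)
{
   int c = strcmp(a->name, b->name);   /* primary key: name, ascending */
   if (c != 0)
      return c;
   /* secondary key: num, descending; compare instead of subtracting
      so that large values cannot overflow */
   return (a->num < b->num) - (a->num > b->num);
}
```

With this, a node with the same name but a higher num compares as "smaller", so it ends up earlier in the sorted list.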

Indeed Insertion Sort is an online algorithm, while Merge-sort is an offline algorithm. However:

Not every offline algorithm has an efficient online counterpart.

And that's the case here too.

You should prefer Insertion Sort over Merge-sort when the data are already partially sorted, since the former is an Adaptive algorithm.

Note: Insertion Sort is also preferred for small inputs.

You can improve the cost of sorting on insertion by using a more complex storage than a list or vector, such as a heap (see heap sort), or an ordered set or map based on a balanced binary tree, both of which have O(log n) per-element insertion cost. [A hash map has close to zero insertion cost while you don't collide on the primary page, but it does not keep its elements in key order, so on its own it does not give you a sorted result. With collisions it is effectively a squashed tree if you can generate complete secondary/tertiary/etc. hashes. If your secondary system is linear, you had better hope your primary hash is good.]

Traditionally, heapsort is favoured, as it uses vector storage but has O(log n) insertion, so O(n log n) overall. Of course you pay log n again to extract each element in sorted order, so overall it is 2 x O(n log n), which is still classed as O(n log n).
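
To make the two log n costs concrete, here is a rough sketch of a binary min-heap kept in a plain array, with one function for insertion and one for extraction. The names struct heap, heap_push, heap_pop and node_cmp are invented for this example, the heap stores pointers to the question's nodes rather than the nodes themselves, and the caller is assumed to provide an array with room for every element.

```c
#include <stddef.h>
#include <string.h>

#define SIZE 50

struct node {
   int num;
   char name[SIZE];
   struct node* next;
};

/* name ascending, then num descending (strcmp order is an assumption) */
static int node_cmp(const struct node* a, const struct node* b)
{
   int c = strcmp(a->name, b->name);
   return c != 0 ? c : (a->num < b->num) - (a->num > b->num);
}

/* Min-heap of node pointers; item[] is storage supplied by the caller. */
struct heap {
   struct node** item;
   size_t count;
};

/* O(log n): move parents down until n's slot is found. */
void heap_push(struct heap* h, struct node* n)
{
   size_t i = h->count++;
   while (i > 0) {
      size_t parent = (i - 1) / 2;
      if (node_cmp(h->item[parent], n) <= 0)
         break;
      h->item[i] = h->item[parent];
      i = parent;
   }
   h->item[i] = n;
}

/* O(log n): remove the smallest node, or return NULL if the heap is empty. */
struct node* heap_pop(struct heap* h)
{
   struct node* top;
   struct node* last;
   size_t i = 0;

   if (h->count == 0)
      return NULL;
   top = h->item[0];
   last = h->item[--h->count];
   for (;;) {
      size_t child = 2 * i + 1;
      if (child >= h->count)
         break;
      /* pick the smaller of the two children */
      if (child + 1 < h->count &&
          node_cmp(h->item[child + 1], h->item[child]) < 0)
         child++;
      if (node_cmp(last, h->item[child]) <= 0)
         break;
      h->item[i] = h->item[child];
      i = child;
   }
   h->item[i] = last;
   return top;
}
```

Calling heap_push() once per input record and then heap_pop() until it returns NULL yields the nodes in sorted order, which is where the "pay log n twice per element" figure comes from.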

For completeness: sorted linked lists suffer because the search for the insert point is linear. Sorted vectors suffer because, though it is easy to find the insert point, making room involves copying (on average) half the vector content each time.
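
The linear search is easy to see in code. Below is a minimal sketch of inserting one freshly read node into an already sorted list; sorted_insert and node_cmp are illustrative names, not something from the question, and the comparison again assumes strcmp() order for names.

```c
#include <string.h>

#define SIZE 50

struct node {
   int num;
   char name[SIZE];
   struct node* next;
};

/* name ascending, then num descending (strcmp order is an assumption) */
static int node_cmp(const struct node* a, const struct node* b)
{
   int c = strcmp(a->name, b->name);
   return c != 0 ? c : (a->num < b->num) - (a->num > b->num);
}

/* Links n into the sorted list at *head. The loop walks past every node
   that sorts before (or equal to) n, so each call is O(n) in the worst
   case and building the whole list this way is O(n^2). */
void sorted_insert(struct node** head, struct node* n)
{
   struct node** link = head;
   while (*link != NULL && node_cmp(*link, n) <= 0)
      link = &(*link)->next;
   n->next = *link;
   *link = n;
}
```

Using a pointer to the link rather than a pointer to the node avoids a special case for inserting at the head.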

With a max of 100 entries, it's not going to matter much. Either insertion sort or merge sort would take less than 20 microseconds to sort a list of 100 nodes. Even a million nodes take only about 0.3 seconds to sort in Java using bottom-up merge sort for a linked list, and about two thirds of that in C/C++, on my system (Intel 3770K, Windows 7 Pro 64 bit).
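
For reference, one common way to write a bottom-up merge sort for a linked list looks roughly like the sketch below. It is not the code behind the timings above; the names merge_sort, merge, node_cmp and NBINS are chosen for this example only. Slot i of a small array holds a sorted run of 2^i nodes (the last slot may grow beyond that), and each incoming node is merged upward like a carry in binary addition.

```c
#include <string.h>

#define SIZE 50
#define NBINS 32   /* illustrative choice; the last bin absorbs any overflow */

struct node {
   int num;
   char name[SIZE];
   struct node* next;
};

/* name ascending, then num descending (strcmp order is an assumption) */
static int node_cmp(const struct node* a, const struct node* b)
{
   int c = strcmp(a->name, b->name);
   return c != 0 ? c : (a->num < b->num) - (a->num > b->num);
}

/* Merges two sorted lists by relinking nodes; ties are taken from a. */
static struct node* merge(struct node* a, struct node* b)
{
   struct node* head = NULL;
   struct node** tail = &head;
   while (a != NULL && b != NULL) {
      if (node_cmp(a, b) <= 0) { *tail = a; a = a->next; }
      else                     { *tail = b; b = b->next; }
      tail = &(*tail)->next;
   }
   *tail = (a != NULL) ? a : b;
   return head;
}

/* Sorts the list and returns the new head; no recursion, no list splitting. */
struct node* merge_sort(struct node* list)
{
   struct node* bin[NBINS] = {NULL};
   struct node* result = NULL;
   int i;

   while (list != NULL) {
      struct node* n = list;      /* detach the next node as a run of one */
      list = list->next;
      n->next = NULL;
      for (i = 0; i < NBINS && bin[i] != NULL; i++) {
         n = merge(bin[i], n);    /* carry into the next larger bin */
         bin[i] = NULL;
      }
      if (i == NBINS)
         i--;
      bin[i] = n;
   }
   for (i = 0; i < NBINS; i++)    /* merge the leftover runs, smallest first */
      result = merge(bin[i], result);
   return result;
}
```

A caller would read the whole input into an unsorted list first and then do head = merge_sort(head); once.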

For large lists, if there is enough memory, it's fastest to copy the linked list to an array, merge sort the array, and create a new linked list. The array sort is much faster because the elements are moved into cache friendly groups, while a linked list sort changes links instead of moving nodes, which isn't cache friendly if the nodes are randomly scattered (worst case a cache miss on every node accessed).
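
A simplified sketch of the copy-to-array idea follows, with two deliberate deviations that should be kept in mind. First, it uses the standard library's qsort() as a stand-in for an array merge sort; qsort() is not guaranteed to be a merge sort or to be stable, which does not matter here because the two keys fully determine the order. Second, instead of allocating a new list it writes the sorted records back into the existing nodes, which keeps ownership simple but leaves the nodes wherever they already were in memory. The names sort_via_array and node_cmp_qsort are made up for this example.

```c
#include <stdlib.h>
#include <string.h>

#define SIZE 50

struct node {
   int num;
   char name[SIZE];
   struct node* next;
};

/* qsort-style comparator: name ascending, then num descending
   (strcmp order for names is an assumption) */
static int node_cmp_qsort(const void* pa, const void* pb)
{
   const struct node* a = pa;
   const struct node* b = pb;
   int c = strcmp(a->name, b->name);
   return c != 0 ? c : (a->num < b->num) - (a->num > b->num);
}

/* Sorts the list's contents through a temporary contiguous array.
   Returns 0 on success, -1 if the array cannot be allocated. */
int sort_via_array(struct node* head)
{
   size_t n = 0, i = 0;
   struct node* p;
   struct node* arr;

   for (p = head; p != NULL; p = p->next)
      n++;
   if (n < 2)
      return 0;                    /* nothing to sort */

   arr = malloc(n * sizeof *arr);
   if (arr == NULL)
      return -1;

   for (p = head; p != NULL; p = p->next)
      arr[i++] = *p;               /* copy each record into the array */

   qsort(arr, n, sizeof *arr, node_cmp_qsort);

   i = 0;
   for (p = head; p != NULL; p = p->next) {
      p->num = arr[i].num;         /* write records back in sorted order */
      memcpy(p->name, arr[i].name, SIZE);
      i++;
   }
   free(arr);
   return 0;
}
```

The temporary array costs n * sizeof(struct node) bytes, which is the "if there is enough memory" condition mentioned above.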
