How can I efficiently merge several sorted subarrays into one array, without having to create a new array of the same size?
I have an array:
const int LENGTH = 12;
int ARRAY[LENGTH] = {5,8,6,3,48,65,1,8,9,20,21,57};
For example, bubble sort in four threads:
int ARRAY[LENGTH] = {5,8,6,3,48,65,1,8,9,20,57,21};
thread_1 -> BubbleSort(0, LENGTH/4, ARRAY);
thread_2 -> BubbleSort(LENGTH/4, LENGTH/2, ARRAY);
thread_3 -> BubbleSort(LENGTH/2, (3*LENGTH)/4, ARRAY);
thread_4 -> BubbleSort((3*LENGTH)/4, LENGTH, ARRAY);
I get:
ARRAY[] = {3,5,6, 1,8,8, 9,45,65, 20,21,57};
What is the best way to merge these into one array?
{3,5,6, 1,8,8, 9,45,65, 20,21,57} -> {1,3,5, 6,8,8, 9,20,21, 45,57,65}
You can use std::inplace_merge, like this (C++11):
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <iterator>

// Compile-time array length: the returned char array has exactly N elements.
template <typename T, size_t N>
char (&ArraySizeHelper(T (&array)[N]))[N];
#define ARRAY_LEN(arr) (sizeof(ArraySizeHelper(arr)))

int main()
{
    int ARRAY[] = {3,5,6, 1,8,8, 9,45,65, 20,21,57};
    const size_t SORTED_CHUNK_SIZE = 3;
    // Merge the next sorted chunk into the already merged prefix [beg, mid).
    // Assumes the array length is a multiple of SORTED_CHUNK_SIZE.
    for (size_t i = SORTED_CHUNK_SIZE; i < ARRAY_LEN(ARRAY); i += SORTED_CHUNK_SIZE) {
        auto beg = std::begin(ARRAY);
        auto mid = beg + i;
        auto end = mid + SORTED_CHUNK_SIZE;
        std::inplace_merge(beg, mid, end);
    }
    std::copy(std::begin(ARRAY),
              std::end(ARRAY),
              std::ostream_iterator<int>(std::cout, ","));
    std::cout << "\n";
}
Update:
If you use threads, you can use the following strategy. Say you have 4 threads: the 4 threads sort 4 chunks of the array, then 2 of those threads use std::inplace_merge to merge the 4 chunks into 2 chunks, and finally 1 thread merges those 2 chunks into 1.
See also: http://en.cppreference.com/w/cpp/experimental/parallelism/existing#inplace_merge
and the implementation at https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/parallel/quicksort.h
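The thread strategy from the update can be sketched roughly as follows. This is only an illustration under some assumptions: parallel_chunk_sort is a hypothetical helper name, std::sort stands in for the bubble sort from the question, and one std::thread is started per chunk and per merge rather than reusing the same threads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper (not from the answer above): sorts `data` in place by
// sorting `num_chunks` chunks in parallel, then merging neighbours pairwise.
void parallel_chunk_sort(std::vector<int>& data, std::size_t num_chunks)
{
    assert(num_chunks > 0);
    int* base = data.data();
    const std::size_t n = data.size();

    // bounds[k] is the start index of chunk k; bounds[num_chunks] == n.
    std::vector<std::size_t> bounds(num_chunks + 1);
    for (std::size_t k = 0; k <= num_chunks; ++k)
        bounds[k] = k * n / num_chunks;

    // Phase 1: each thread sorts its own disjoint chunk.
    {
        std::vector<std::thread> workers;
        for (std::size_t k = 0; k < num_chunks; ++k)
            workers.emplace_back([base, &bounds, k] {
                std::sort(base + bounds[k], base + bounds[k + 1]);
            });
        for (auto& w : workers) w.join();
    }

    // Phase 2: merge neighbouring sorted runs; every round halves the
    // number of runs (4 -> 2 -> 1 for four chunks).
    for (std::size_t width = 1; width < num_chunks; width *= 2) {
        std::vector<std::thread> workers;
        for (std::size_t k = 0; k + width < num_chunks; k += 2 * width) {
            const std::size_t lo  = bounds[k];
            const std::size_t mid = bounds[k + width];
            const std::size_t hi  = bounds[std::min(k + 2 * width, num_chunks)];
            workers.emplace_back([base, lo, mid, hi] {
                std::inplace_merge(base + lo, base + mid, base + hi);
            });
        }
        for (auto& w : workers) w.join();
    }
}
```

Calling parallel_chunk_sort(v, 4) on a std::vector<int> holding the 12 values from the question sorts 4 chunks of 3 elements and then merges them in two rounds. Note that std::inplace_merge is allowed to use a temporary buffer internally when memory is available, so "in place" describes the interface rather than guaranteeing zero extra memory.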
If you must sort an array as fast as possible, using all available hardware threads, then you need a quicksort that can use multiple threads.
Many algorithms in the Standard Template Library can be composed from a handful of building blocks, and quicksort is one such composition. When you call std::sort it's quite possible that you are calling something that looks like:
template <class ForwardIt>
void sort(ForwardIt first, ForwardIt last)
{
    if (first == last) return;
    auto pivot = *std::next(first, std::distance(first,last)/2);
    ForwardIt middle1 = std::partition(first, last,
        [&pivot](const auto& em){ return em < pivot; });
    ForwardIt middle2 = std::partition(middle1, last,
        [&pivot](const auto& em){ return !(pivot < em); });
    sort(first, middle1);
    sort(middle2, last);
}
(credit to en.cppreference)
The main function here is std::partition. If you divide the range into blocks of equal size, one per thread, and partition each block, you end up with a partially partitioned range: within each block, all the elements for which the predicate returns true come before those for which it returns false. Crucially, std::partition also returns an iterator to the first element of the second group, helpfully called middle1 and middle2 in the above example.
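To make the return value of std::partition concrete, here is a minimal sketch. The helper name partition_below and the choice of int elements are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: partitions `values` around `pivot` and returns the
// index of the first element that is not less than `pivot`.
std::size_t partition_below(std::vector<int>& values, int pivot)
{
    auto is_below = [pivot](int x) { return x < pivot; };
    auto middle = std::partition(values.begin(), values.end(), is_below);

    // Everything in [begin, middle) satisfies the predicate,
    // nothing in [middle, end) does. The order inside each group is unspecified.
    assert(std::all_of(values.begin(), middle, is_below));
    assert(std::none_of(middle, values.end(), is_below));

    return static_cast<std::size_t>(middle - values.begin());
}
```

For the array from the question and a pivot of 9, six elements (5, 8, 6, 3, 1, 8) are below the pivot, so the returned index is 6.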
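By storing these returned iterators in an array, you can then gather the per-block results into a single partitioned range. Here r holds the stored iterators, b is the block size, and n is the number of blocks:
// Work from the back: move the "false" part of block i behind everything after it.
for (auto i=n-2; i>0; --i)
{
    auto begin = r[i];                           // start of the "false" part of block i
    auto mid = std::next(first, b * (i + 1));    // end of block i
    auto end = last;
    swap_block<Iter>()(begin, mid, end);
}
// Block 0 is handled separately.
swap_block<Iter>()(r[0], std::next(first, b), last);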
where swap_block looks like:
template<typename Iter>
struct swap_block
{
    void operator()(Iter first, Iter mid, Iter last)
    {
        std::rotate(first, mid, last);
    }
};
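Putting the steps above together, a rough sketch of a block-wise partition might look like this. It is not the author's parallel_partition: block_partition is a hypothetical name, the element type is fixed to int, one std::thread is started per block, and the gather step rotates only up to the current split point instead of up to last, which is a variation chosen for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical sketch: partitions `data` around `pivot` using `n` blocks and
// returns the index of the first element that is not less than `pivot`.
std::size_t block_partition(std::vector<int>& data, std::size_t n, int pivot)
{
    assert(n > 0);
    int* first = data.data();

    // start[i] is the first index of block i; start[n] == data.size().
    std::vector<std::size_t> start(n + 1);
    for (std::size_t i = 0; i <= n; ++i)
        start[i] = i * data.size() / n;

    // Step 1: partition every block in its own thread and store the
    // iterator (here: pointer) that std::partition returns.
    std::vector<int*> r(n);
    {
        std::vector<std::thread> workers;
        for (std::size_t i = 0; i < n; ++i)
            workers.emplace_back([&r, &start, first, i, pivot] {
                r[i] = std::partition(first + start[i], first + start[i + 1],
                                      [pivot](int x) { return x < pivot; });
            });
        for (auto& w : workers) w.join();
    }

    // Step 2: gather from back to front. Before each rotate the layout is
    // [true_i | false_i | true_tail | false_tail]; swapping false_i with
    // true_tail yields one partitioned range starting at block i.
    int* split = r[n - 1];
    for (std::size_t i = n - 1; i-- > 0; ) {
        int* block_end = first + start[i + 1];
        split = std::rotate(r[i], block_end, split);  // new start of the "false" part
    }
    return static_cast<std::size_t>(split - first);
}
```

The gather step runs on a single thread here; only the per-block partitioning is parallel. Rotating up to the split point rather than to the end of the range moves fewer elements, at the cost of tracking the split explicitly.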
Using std::rotate is very inefficient for large blocks, or for those where mid is towards the end of the range. In those cases it would be better to use std::reverse (and remember, kids: if you want a stable swap_block you would need three std::reverse calls!).
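The three-reversal idea mentioned above can be sketched like this. The name rotate_by_reversal is hypothetical, and whether this actually beats std::rotate depends on the standard library implementation and the data, so it is worth measuring before adopting it.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper: swaps the blocks [first, mid) and [mid, last) while
// keeping the order inside each block. Requires bidirectional iterators.
template <typename Iter>
Iter rotate_by_reversal(Iter first, Iter mid, Iter last)
{
    std::reverse(first, mid);    // reverse the left block
    std::reverse(mid, last);     // reverse the right block
    std::reverse(first, last);   // reverse everything: blocks swap, inner order is restored
    // New position of the element that was originally at `first`.
    return std::next(first, std::distance(mid, last));
}
```

With only the last two steps collapsed into a single reversal of the whole range, the blocks would still change places but each block would come out in reverse order, which is why the stable version needs all three calls.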
TL;DR: Learn to use the STL algorithm library effectively.
There is a version of parallel_partition that I have written here.