
Minimum of maximums for k-size nonconsecutive subsequence of array

Suppose I have an array, arr = [2, 3, 5, 9] and k = 2. I am supposed to find subsequences of length k such that no two elements in each subsequence are adjacent. Then find the maximums of those sequences. Finally, find the minimum of the maximums. For example, for arr, the valid subsequences are [2,5], [3,9], [2,9] with maximums 5, 9, and 9 respectively. The expected output would be the minimum of the maximums, which is 5.

I can't think of any approach to this problem other than brute force: enumerate all possible subsequences with nested for-loops, take the maximum of each, and then take the minimum of those maximums. According to the requirements, there is a better way, but I don't know what it could be. Greedy? DP?
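For reference, a minimal sketch of the brute-force idea, usable as an oracle on small inputs. It enumerates position subsets as bitmasks instead of nested loops; the function name and the cap of 20 elements are assumptions made for this illustration only.

```c
#include <assert.h>
#include <limits.h>

/* Brute force for small inputs (n <= 20 is an arbitrary cap for this sketch).
 * Returns the minimum of maximums over all subsequences of length k with no
 * two chosen positions adjacent, or INT_MAX if no such subsequence exists. */
int brute_force_min_of_maxes(const int *arr, int n, int k) {
    assert(n >= 0 && n <= 20 && k >= 1);
    int best = INT_MAX;
    for (unsigned mask = 0; mask < (1u << n); ++mask) {
        if (mask & (mask >> 1)) continue;  /* two adjacent positions chosen */
        int count = 0;
        int max = INT_MIN;
        for (int i = 0; i < n; ++i) {
            if (mask & (1u << i)) {
                ++count;
                if (arr[i] > max) max = arr[i];
            }
        }
        if (count == k && max < best) best = max;
    }
    return best;
}
```

Called with arr = [2, 3, 5, 9] and k = 2, it inspects [2,5], [2,9] and [3,9] and returns 5. The running time is exponential in the array length, so this only serves to cross-check faster approaches.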

I thought about how to solve this problem for some time and eventually came up with a solution. I wasn't sure about it, though, so I decided to publish it as a separate question. Here it is: Minimum of maximums of non-consecutive subsequences of size k.

I decided to wait a while for answers or comments that would either assure me that this solution is right or give me tips on how to improve it. Now that a week has passed and I personally don't see ways to improve it, I am publishing it here.

This solution might not be the most efficient one, but I hope it is correct, at least to the best of my ability to verify it.

While solving this problem, I made two assumptions that are not stated in the question. I hope they make the problem easier to solve. They are:

  1. The elements of the input sequence are unique.

  2. For the input sequence S and the number k, 2 <= k <= (length(S) + 1) / 2. I find this assumption necessary in itself, as it rejects the cases in which no valid subsequence exists at all, and so there is no minimum of maximums.
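The upper bound in the second assumption comes from the densest possible choice of positions: 1, 3, 5, and so on, which needs 2 * k - 1 elements for k picks. A small sketch of that check follows; the helper names are hypothetical and the division is integer division.

```c
#include <assert.h>

/* Largest k for which a non-contiguous subsequence of length k fits into
 * n elements: positions 1, 3, 5, ... are the densest possible choice. */
int max_feasible_k(int n) {
    assert(n >= 0);
    return (n + 1) / 2;  /* integer division rounds down */
}

/* Mirrors the assumed range 2 <= k <= (n + 1) / 2,
 * which is equivalent to k >= 2 and n >= 2 * k - 1. */
int is_valid_k(int n, int k) {
    return k >= 2 && k <= max_feasible_k(n);
}
```

For an array of 4 elements this accepts k = 2 and rejects k = 3, which matches the guard `length(S) < 2 * k - 1` used in the pseudocode.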

I plan to try to remove the first assumption. But I won't do it if it is going to make the algorithm considerably more difficult to prove the correctness of, or to implement.

Pseudocode, version 1

find_k_length_sequence_maxes_min (S, k)
    if k < 2 or length(S) < 2 * k - 1
        return NO_SUCH_MINIMUM

    sorted = copy(S)
    sort_ascending(sorted)

    for t from k to length(S)
        current_length = 1

        index = find_index(S, sorted[t])

        last_index = index

        for u descending from index to 1
            if u < last_index - 1 and S[u] <= sorted[t]
                current_length += 1
                last_index = u

            if current_length >= k
                return sorted[t]

        last_index = index

        for u ascending from index to length(S)
            if u > last_index + 1 and S[u] <= sorted[t]
                current_length += 1
                last_index = u

            if current_length >= k
                return sorted[t]

Pseudocode, version 2

(This is the same algorithm as in version 1, only written using more natural language.)

(1) Let S be a sequence of integers such that all of its elements are unique.

(2) Let a "non-contiguous subsequence of S " mean such a subsequence of S that any two elements of it are non-adjacent in S .

(3) Let k be an integer such that 2 <= k <= (length(S) + 1) / 2 .

(4) Find the minimum of maximums of all the non-contiguous subsequences of S of length k .

(4.1) Find the minimal element of S such that it is the maximum of a non-contiguous subsequence of S of size k .

(4.1.1) Let sorted be a permutation of S such that its elements are sorted in ascending order.

(4.1.2) For every element e of sorted , check whether it is a maximum of a non-contiguous subsequence of S of length k . If it is, return it.

(4.1.2.1) Let x and y be integers such that 1 <= x <= index(e) and index(e) <= y <= length(S).

(4.1.2.2) Let all(x, y) be the set of all the non-contiguous subsequences of S between S[x] (including) and S[y] (including) such that e is the maximum of each of them.

(4.1.2.3) Check whether the length of the longest sequence of all(1, index(e)) is greater than or equal to k . If it is, return e .

(4.1.2.4) Check whether the sum of the length of the longest subsequence of all(1, index(e)) and the length of the longest subsequence of all(index(e), length(S)), minus 1 (because e is counted in both), is greater than or equal to k. If it is, return e.
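Steps (4.1.2.3) and (4.1.2.4) can be sketched as a single helper that pins e at its position and greedily extends to the left and to the right, counting e only once. This is a sketch under the uniqueness assumption, with 0-based indices; the function name is hypothetical.

```c
#include <assert.h>

/* Length of the longest non-contiguous subsequence that contains arr[idx]
 * and has arr[idx] as a maximum. Each scan takes the nearest element that
 * is not adjacent to the last taken one and is not greater than arr[idx]. */
int longest_with_max_at(const int *arr, int n, int idx) {
    assert(idx >= 0 && idx < n);
    int e = arr[idx];
    int length = 1;  /* e itself */

    int last = idx;
    for (int u = idx - 2; u >= 0; --u) {      /* scan toward the start */
        if (u < last - 1 && arr[u] <= e) {
            ++length;
            last = u;
        }
    }

    last = idx;
    for (int u = idx + 2; u < n; ++u) {       /* scan toward the end */
        if (u > last + 1 && arr[u] <= e) {
            ++length;
            last = u;
        }
    }
    return length;
}
```

With the array { 1, 7, 2, 3, 9, 11, 8, 14 }, pinning 8 (position 6) gives a length of 3 and pinning 9 (position 4) gives 4, so e is returned as soon as this value reaches k.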

Proof of correctness

I don't remember ever having written a proof of correctness for a program, so I'm fairly certain the quality of this one can be improved. Take it with a grain of salt. If you can, I encourage you to check for yourself whether the algorithm is correct.

(1) Glossary:

  • by "observation" I mean a statement not derived from any observation or conclusion, not demanding a proof,

  • by "conclusion" I mean a statement derived from at least one observation or conclusion, not demanding a proof,

  • by "theorem" I mean a statement not derived from any observation or conclusion, demanding a proof.

(2) Let S be a sequence of integers such that all of its elements are unique.

(3) Let a "non-contiguous subsequence of S " mean such a subsequence of S that any two elements of it are non-adjacent in S .

(4) Let k be an integer such that 2 <= k <= (length(S) + 1) / 2 .

(5) Let minmax(k) be an element of S such that it is the minimum of maximums of all the non-contiguous subsequences of S of length k .

(6) (Theorem) minmax(k) is a minimal element of S such that it is a maximum of a non-contiguous subsequence of S of length k .

(7) In other words, there is no element in S less than minmax(k) that is a maximum of a non-contiguous subsequence of S of length k .

(8) (Proof of (6)) (Observation) Since minmax(k) is the minimum of maximums of all the non-contiguous subsequences of S of length k, there is no non-contiguous subsequence of S of length k such that its maximum is less than minmax(k).

(9) (Proof of (6)) (Conclusion) From (8), any element of S less than minmax(k) cannot be a maximum of any non-contiguous subsequence of S of length k.

(10) (Proof of (6)) QED

(11) Let x and y be integers such that 1 <= x <= index(minmax(k)) and index(minmax(k)) <= y <= length(S) .

(12) Let all(x, y) be the set of all the non-contiguous subsequences of S between S[x] (including) and S[y] (including) such that minmax(k) is the maximum of each of them.

(13) (Observation) minmax(k) is the maximum of the longest sequence of all(1, length(S)) .

(14) This observation may seem too trivial to note. But, apparently it was easier for me to write the algorithm, and prove it, with the longest subsequence in mind, instead of a subsequence of length k . Therefore I think this observation is worth noting.

(15) (Theorem) One can produce the longest sequence of all(1, index(minmax(k))) by:

  • starting from minmax(k) ,

  • moving to S[1] ,

  • taking always the next element that is both less than or equal to minmax(k) , and non-adjacent to the last taken one.

(16) (Proof of (15)) Let a "possible element" of S mean an element that is both less than or equal to minmax(k) , and non-adjacent to the last taken one.

(16a) (Proof of (15)) Let C be the subsequence produced in (15).

(17) (Proof of (15)) (Observation)

  • Before the first taken element, there are exactly 0 possible elements,

  • between any two taken elements (excluding them), there are exactly 0 or 1 possible elements,

  • after the last taken element, there are exactly 0 or 1 possible elements.

(18) (Proof of (15)) Suppose, for the sake of contradiction, that D is a sequence of all(1, index(minmax(k))) such that length(D) > length(C).

(19) (Proof of (15)) Then at least one of the following conditions is fulfilled:

  • before the first taken element, there are fewer than 0 possible elements in D,

  • between two taken elements (excluding them) such that there is 1 possible element between them in C, there are 0 possible elements in D,

  • after the last taken element, there is fewer than 1 possible element in D.

(20) (Proof of (15)) (Observation)

  • There cannot be fewer than 0 possible elements before the first taken element,

  • if there is fewer than 1 possible element between two taken elements (excluding them) in D, where in C there is 1, it means that we have taken either an element greater than minmax(k), or an element adjacent to the last taken one, which contradicts (12),

  • if there is fewer than 1 possible element after the last taken element in D, where in C there is 1, it means that we have taken either an element greater than minmax(k), or an element adjacent to the last taken one, which contradicts (12).

(21) (Proof of (15)) QED

(22) (Observation) (15) applies also to all(index(minmax(k)), length(S)) .

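(23) (Observation) The length of the longest sequence of all(1, length(S)) equals the length of the longest sequence of all(1, index(minmax(k))) plus the length of the longest sequence of all(index(minmax(k)), length(S)), minus 1, because minmax(k) itself belongs to both.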
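To sanity-check (15) and (23) on small inputs, one option is an exhaustive oracle that computes the longest sequence of all(x, y) directly. The sketch below is an assumption-laden illustration: the function name is hypothetical, indices are 0-based (unlike the 1-based proof), and the cap of 20 elements is arbitrary. Note that the pinned element lies in both the left and the right range, so when the two partial lengths are added it is counted twice.

```c
#include <assert.h>

/* Exhaustive oracle: length of the longest non-contiguous subsequence that
 * lies within positions lo..hi (inclusive), contains arr[idx], and has no
 * element greater than arr[idx]. Intended for small n only. */
int exhaustive_longest(const int *arr, int n, int idx, int lo, int hi) {
    assert(n >= 1 && n <= 20);
    assert(0 <= lo && lo <= idx && idx <= hi && hi < n);
    int best = 0;
    for (unsigned mask = 0; mask < (1u << n); ++mask) {
        if (!(mask & (1u << idx))) continue;  /* must contain arr[idx] */
        if (mask & (mask >> 1)) continue;     /* adjacent positions chosen */
        int length = 0;
        int ok = 1;
        for (int i = 0; i < n && ok; ++i) {
            if (!(mask & (1u << i))) continue;
            if (i < lo || i > hi || arr[i] > arr[idx]) ok = 0;
            else ++length;
        }
        if (ok && length > best) best = length;
    }
    return best;
}
```

For { 1, 7, 2, 3, 9, 11, 8, 14 } with 9 pinned at position 4, the left range yields 3, the right range yields 2, and the whole array yields 4, that is 3 + 2 - 1.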

Implementation

All the tests pass if none of the assert calls aborts the program.

#include <limits.h> // For INT_MAX
#include <assert.h> // For assert
#include <string.h> // For memcpy
#include <stdlib.h> // For qsort

// Comparison function for qsort: orders ints in ascending order.
int compar (const void * first, const void * second) {
    int a = * (const int *)first;
    int b = * (const int *)second;
    if (a < b) return -1;
    else if (a == b) return 0;
    else return 1;
}

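// Writes to * result_min the minimum of maximums of all the non-contiguous
// subsequences of array of length k. Leaves * result_min untouched if no such
// subsequence exists or if memory cannot be allocated.
void find_k_size_sequence_maxes_min (int array_length, int array[], int k, int * result_min) {
    if (k < 2 || array_length < 2 * k - 1) return;

    // Heap allocation: a variable-length array of this size could overflow the stack.
    int * sorted = malloc(sizeof (int) * array_length);
    if (sorted == NULL) return;
    memcpy(sorted, array, sizeof (int) * array_length);
    qsort(sorted, array_length, sizeof (int), compar);

    // The result cannot be smaller than the k-th smallest element, so start there.
    for (int t = k - 1; t < array_length; ++t) {
        // Find the first position of the candidate maximum in the input.
        int index = 0;
        while (array[index] != sorted[t]) ++index;

        // The candidate itself is the first element taken.
        int size = 1;

        // Greedily take non-adjacent elements to the left of the candidate.
        int last_index = index;
        for (int u = index - 2; u >= 0 && size < k; --u) {
            if (u < last_index - 1 && array[u] <= sorted[t]) {
                ++size;
                last_index = u;
            }
        }

        // Greedily take non-adjacent elements to the right of the candidate.
        last_index = index;
        for (int u = index + 2; u < array_length && size < k; ++u) {
            if (u > last_index + 1 && array[u] <= sorted[t]) {
                ++size;
                last_index = u;
            }
        }

        if (size >= k) {
            * result_min = sorted[t];
            break;
        }
    }

    free(sorted);
}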

int main (void) {
    // Test case 1
    int array1[] = { 6, 3, 5, 8, 1, 0, 9, 7, 4, 2, };
    int array1_length = (int)(sizeof array1 / sizeof array1[0]);
    int k = 2;
    int min = INT_MAX;
    find_k_size_sequence_maxes_min(array1_length, array1, k, & min);
    assert(min == 2);

    // Test case 2
    int array2[] = { 1, 7, 2, 3, 9, 11, 8, 14, };
    int array2_length = (int)(sizeof array2 / sizeof array2[0]);
    k = 2;
    min = INT_MAX;
    find_k_size_sequence_maxes_min(array2_length, array2, k, & min);
    assert(min == 2);

    // Test case 3
    k = 3;
    min = INT_MAX;
    find_k_size_sequence_maxes_min(array2_length, array2, k, & min);
    assert(min == 8);

    // Test case 4
    k = 4;
    min = INT_MAX;
    find_k_size_sequence_maxes_min(array2_length, array2, k, & min);
    assert(min == 9);

    // Test case 5
    int array3[] = { 3, 5, 4, 0, 8, 2, };
    int array3_length = (int)(sizeof array3 / sizeof array3[0]);
    k = 3;
    min = INT_MAX;
    find_k_size_sequence_maxes_min(array3_length, array3, k, & min);
    assert(min == 3);

    // Test case 6
    int array4[] = { 18, 21, 20, 6 };
    int array4_length = (int)(sizeof array4 / sizeof array4[0]);
    k = 2;
    min = INT_MAX;
    find_k_size_sequence_maxes_min(array4_length, array4, k, & min);
    assert(min == 18);

    // Test case 7
    enum { array5_length = 1000000 };
    static int array5[array5_length]; // Static storage: an array this large may not fit on the stack
    for (int m = array5_length - 1; m >= 0; --m) array5[m] = m;
    k = 100;
    min = INT_MAX;
    find_k_size_sequence_maxes_min(array5_length, array5, k, & min);
    assert(min == 198);
}

Please comment if you have any questions or tips, or see any mistakes or bugs.


Edit: As I wrote above, I tried to remove the first assumption. I think I succeeded; that is, I believe this assumption can be removed.

Only a few changes were required. Worth noting is that I now use every occurrence of the terms "minimum" and "maximum" with the indefinite article "a". By that I want to express that more than one element of S may have the minimum value, and more than one element of S may have the maximum value.

Pseudocode, version 1 without elements uniqueness

The line

index = find_index(S, sorted[t])

should be replaced with the line

index = find_first_index(S, sorted[t])

Pseudocode, version 2 without elements uniqueness

(This is the same algorithm as in version 1, only written using more natural language.)

(1) Let S be a sequence of integers.

(2) Let a "non-contiguous subsequence of S " mean such a subsequence of S that any two elements of it are non-adjacent in S .

(3) Let k be an integer such that 2 <= k <= (length(S) + 1) / 2 .

(4) Find a minimum of maximums of all the non-contiguous subsequences of S of length k .

(4.1) Find a minimal element of S such that it is a maximum of a non-contiguous subsequence of S of size k .

(4.1.1) Let sorted be a permutation of S such that its elements are sorted in ascending order.

(4.1.2) For every element e of sorted , check whether it is a maximum of a non-contiguous subsequence of S of length k . If it is, return it.

(4.1.2.1) Let x and y be integers such that 1 <= x <= index(e) and index(e) <= y <= length(S).

(4.1.2.2) Let all(x, y) be the set of all the non-contiguous subsequences of S between S[x] (including) and S[y] (including) such that e is a maximum of each of them.

(4.1.2.3) Check whether the length of the longest sequence of all(1, index(e)) is greater than or equal to k . If it is, return e .

(4.1.2.4) Check whether the sum of the length of the longest subsequence of all(1, index(e)) and the length of the longest subsequence of all(index(e), length(S)), minus 1 (because e is counted in both), is greater than or equal to k. If it is, return e.

Proof without elements uniqueness

Point (2) should now be:

(2) Let S be a sequence of integers.

Point (5) should now be:

(5) Let minmax(k) be an element of S such that it is a minimum of maximums of all the non-contiguous subsequences of S of length k .

Point (8) should now be:

(8) (Proof of (6)) (Observation) Since minmax(k) is a minimum of maximums of all the non-contiguous subsequences of S of length k, there is no non-contiguous subsequence of S of length k such that its maximum is less than minmax(k).

Point (12) should now be:

(12) Let all(x, y) be the set of all the non-contiguous subsequences of S between S[x] (including) and S[y] (including) such that minmax(k) is a maximum of each of them.

Implementation without elements uniqueness

The following test cases should be added:

    // Test case 8 (no uniqueness)
    int array6[] = { 18, 21, 21, 6 };
    int array6_length = (int)(sizeof array6 / sizeof array6[0]);
    k = 2;
    min = INT_MAX;
    find_k_size_sequence_maxes_min(array6_length, array6, k, & min);
    assert(min == 18);

    // Test case 9 (no uniqueness)
    int array7[] = { 18, 21, 18, 6 };
    int array7_length = (int)(sizeof array7 / sizeof array7[0]);
    k = 2;
    min = INT_MAX;
    find_k_size_sequence_maxes_min(array7_length, array7, k, & min);
    assert(min == 18);

    // Test case 10 (no uniqueness)
    int array8[] = { 18, 18, 20, 6 };
    int array8_length = (int)(sizeof array8 / sizeof array8[0]);
    k = 2;
    min = INT_MAX;
    find_k_size_sequence_maxes_min(array8_length, array8, k, & min);
    assert(min == 18);

    // Test case 11 (no uniqueness)
    int array9[] = { 18, 18, 21, 6 };
    int array9_length = (int)(sizeof array9 / sizeof array9[0]);
    k = 2;
    min = INT_MAX;
    find_k_size_sequence_maxes_min(array9_length, array9, k, & min);
    assert(min == 18);
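One more input with repeated values that may be worth testing is { 1, 5, 1, 9, 5 } with k = 3. The only valid choice there is positions 0, 2, 4, whose maximum is 5, yet pinning the first occurrence of 5 (position 1) appears to leave room for only two elements. A possible way around pinning any particular occurrence is sketched below. It is a different technique from the one in this answer: a binary search over the sorted values, combined with a left-to-right greedy count of non-adjacent elements not greater than a limit. All names are hypothetical, and INT_MAX is used here as a "no answer" marker.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

static int cmp_int(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Largest number of pairwise non-adjacent positions with values <= limit. */
static int count_non_adjacent_at_most(const int *arr, int n, int limit) {
    int count = 0;
    for (int i = 0; i < n; ++i) {
        if (arr[i] <= limit) {
            ++count;
            ++i;  /* position i is taken, so skip its right neighbour */
        }
    }
    return count;
}

/* Smallest value v in arr such that k non-adjacent elements <= v exist.
 * Returns INT_MAX if k is out of range or allocation fails. */
int threshold_min_of_maxes(const int *arr, int n, int k) {
    if (k < 1 || n < 2 * k - 1) return INT_MAX;
    int *sorted = malloc(sizeof *sorted * (size_t)n);
    if (sorted == NULL) return INT_MAX;
    memcpy(sorted, arr, sizeof *sorted * (size_t)n);
    qsort(sorted, (size_t)n, sizeof *sorted, cmp_int);

    /* The count grows with the limit, so binary search applies. */
    int lo = k - 1, hi = n - 1;
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (count_non_adjacent_at_most(arr, n, sorted[mid]) >= k) hi = mid;
        else lo = mid + 1;
    }
    int answer = sorted[lo];
    free(sorted);
    return answer;
}
```

Under these assumptions the sketch runs in O(n log n) time and does not depend on which occurrence of a repeated value is chosen.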


 