Find subsequence of size k such that the minimum distance between values is maximum

Suppose I have an arbitrary sequence (an ordered array) containing n positive floats. How do I find a subsequence of size k such that the minimum distance between all pairs of floats in the subsequence is maximized, i.e. the chosen values are as far apart as possible?

Note : A subsequence of a sequence is an ordered subset of the sequence's elements having the same sequential ordering as the original sequence.

CONSTRAINTS

  1. n>10^5
  2. n>k>2

Example:

Sequence a[] = {1.1, 2.34, 6.71, 7.01, 10.71} and k = 3: the subsequence is {1.1, 6.71, 10.71}, and the minimum distance is 4, between 10.71 and 6.71.

Wrong subsequences:

{1.1,7.01,10.71}, minimum distance is 3.7

{1.1,2.34,6.71}, minimum distance is 1.24

I came up with a solution:

1) sort array

2) select a[0], then find Y, the smallest element in the array that is >= a[0] + x, then the smallest element >= Y + x, and so on k-1 times; the kth element will be a[n-1]
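A minimal sketch of step 2 for one fixed x, assuming the array is already sorted in ascending order and reading "ceil" as "the smallest element that is >= the target". The name `greedy_pick` is hypothetical and only used for illustration.

```python
from bisect import bisect_left

def greedy_pick(sorted_vals, x, k):
    """Greedily pick up to k values from an ascending list so that
    each pick is at least x above the previous one."""
    if not sorted_vals:
        return []
    picked = [sorted_vals[0]]
    pos = 0
    while len(picked) < k:
        # Search strictly to the right of the last pick so that
        # the same element is never chosen twice (matters when x == 0).
        pos = bisect_left(sorted_vals, picked[-1] + x, pos + 1)
        if pos == len(sorted_vals):
            break  # no element is far enough away: x is too large
        picked.append(sorted_vals[pos])
    return picked

a = [1.1, 2.34, 6.71, 7.01, 10.71]
print(greedy_pick(a, 3.9, 3))
print(greedy_pick(a, 5.0, 3))
```

If fewer than k values come back, the chosen x is too large. That yes/no answer is what a binary search over x would need at each step.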

To find x:

Let dp[i][j] be the x for selecting j elements from the first i elements.

Finally we want dp[n][k], which is x.

But I am having trouble finding x and restoring the original indexes.

dp[i][j] = max( min( dp[m][j-1], A[i]-A[m] ) )

over m = 1 to i-1, for i = 2 to n, j = 2 to i

dp[i][1] = 0 for i = 1 to n

I want to correct the dynamic programming solution. I know x can be found by binary searching over x, but by sorting I lose the ordering of the sequence, and the approach is time consuming (O(n^2)). How do I overcome these problems?

If there is a solution involving a sort, you first want to map the array to an array of tuples, each containing the value and the original position of an element. Now when you sort the array you know the original positions as well.
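As a rough sketch of that tuple mapping (the helper names `sort_with_positions` and `restore_order` are made up for this example):

```python
def sort_with_positions(seq):
    """Return (value, original_position) pairs sorted by value."""
    return sorted((value, pos) for pos, value in enumerate(seq))

def restore_order(pairs):
    """Put chosen (value, position) pairs back into original sequence order."""
    return [value for value, pos in sorted(pairs, key=lambda p: p[1])]

a = [10.71, 1.1, 7.01, 2.34, 6.71]
pairs = sort_with_positions(a)
# Pick any pairs from the sorted view, e.g. the 1st, 3rd and 5th smallest.
chosen = [pairs[0], pairs[2], pairs[4]]
print(restore_order(chosen))
```

Whatever selection is made on the sorted view, sorting the chosen pairs by their stored position gives them back in the order they had in the input.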

However I don't believe that sorting actually helps you in the end.

The approach that I see working is: for each 0 <= i < n and each 1 < j <= min(k, i+1), store the minimum distance and the previous entry for the best subsequence of length j ending at i.

You then look for the best subsequence of length k, and decode it by following the previous entries backwards.

Using JSON notation (for clarity, and not because I think this is the right data structure), and your example, you could wind up with a data structure like this:

[
    {"index": 0, "value": 1.1},
    {"index": 1, "value": 2.34,
        "seq": {"2": {"dist": 1.24, "prev": 0}}},
    {"index": 2, "value": 6.71,
        "seq": {"2": {"dist": 5.61, "prev": 0},
                "3": {"dist": 1.24, "prev": 1}}},
    {"index": 3, "value": 7.01,
        "seq": {"2": {"dist": 5.91, "prev": 0},
                "3": {"dist": 1.24, "prev": 1}}},
    {"index": 4, "value": 10.71,
        "seq": {"2": {"dist": 9.61, "prev": 0},
                "3": {"dist": 4, "prev": 2}}}
]

And now we find that the biggest dist for length 3 is 4, at index 4. Walking backwards we want indexes 4, 2 and 0. Pull those out and reverse them to get the solution [1.1, 6.71, 10.71].
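The table-and-walk-back idea above could be sketched roughly as follows. This is one possible reading, not a definitive implementation: it assumes "distance" means the absolute difference between consecutive chosen elements (which matches the all-pairs minimum when the input is ascending, as in the example), and the function name `best_subsequence` is invented here. It runs in O(n^2 * k) time, so it would be too slow for n > 10^5 as written.

```python
import math

def best_subsequence(a, k):
    """Return (best_min_gap, subsequence) for a subsequence of length k."""
    n = len(a)
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= len(a)")
    # dist[i][j]: best achievable minimum gap for length j ending at index i.
    # prev[i][j]: index of the element chosen just before i in that subsequence.
    dist = [[-math.inf] * (k + 1) for _ in range(n)]
    prev = [[-1] * (k + 1) for _ in range(n)]
    for i in range(n):
        dist[i][1] = math.inf  # a single element has no gap yet
        for j in range(2, min(k, i + 1) + 1):
            # p must leave room for j-1 elements ending at p, so p >= j-2.
            for p in range(j - 2, i):
                cand = min(dist[p][j - 1], abs(a[i] - a[p]))
                if cand > dist[i][j]:
                    dist[i][j] = cand
                    prev[i][j] = p
    # Best end position for a subsequence of length k.
    end = max(range(k - 1, n), key=lambda i: dist[i][k])
    best = dist[end][k]
    # Walk the prev links backwards, then reverse.
    picked = []
    i, j = end, k
    while j >= 1:
        picked.append(i)
        i = prev[i][j]
        j -= 1
    picked.reverse()
    return best, [a[idx] for idx in picked]

best, seq = best_subsequence([1.1, 2.34, 6.71, 7.01, 10.71], 3)
print(seq)
```

For the example input the walk-back visits indexes 4, 2 and 0, the same path as in the data structure above.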
