
Algorithm for finding max value of functions of the form f(x) = a*min(b, x)?

I have an array of tuples (a, b) with a > 0 and b > 0.
Each tuple represents a function f such that f(x, a, b) = a * min(b, x).

Is there a known algorithm that, for a given x, finds which tuple gives the maximum value?
I don't want to evaluate every function to find the maximum, because I will query this array an arbitrary number of times for different values of x.

Example:

array = [ (1, 10), (2, 3) ]
x < 6 -> choose (2, 3)
x = 6 (intersection point) -> either (1, 10) or (2, 3) doesn't matter
x > 6 -> choose (1, 10)

The tuples can be sorted either by a or by b, but (if we visualize the functions as graphs) there can be many intersection points between them. I want to avoid any O(n^2) approach that compares each function with all the others to find the point x' (the intersection point) beyond which one should be chosen over the other.

Assuming the a's, b's and queried x's are always nonnegative, each query can be done in O(log(n)) time after an O(n*log(n)) preprocessing step:

The preprocessing step eliminates functions that are strictly dominated by others. For example, (5, 10) is larger than (1, 1) for every x. (So, if (5, 10) is in the array, we can remove (1, 1), because it will never be the maximum for any x.)

Here is the general condition: a function (c, d) is larger than (a, b) for every x > 0 if and only if c > a and c*d > a*b. (This is easy to prove: for small x the two values are c*x and a*x, and for large x they are c*d and a*b.)

Now, what we want to do is remove every function (a, b) for which there exists a (c, d) such that c > a and c*d > a*b. This can be done in O(n*log(n)) time:
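The pairwise test can be written as a small predicate. This is only a sketch of the condition stated above; the function name `strictly_dominates` is made up for illustration, and positive a, b, c, d are assumed.

```python
def strictly_dominates(cd, ab):
    """Return True if (c, d) gives a larger value than (a, b) for every x > 0.

    Assumes all four numbers are positive, as in the question.
    """
    c, d = cd
    a, b = ab
    # c > a wins while both functions are still rising (small x);
    # c*d > a*b wins once both have flattened out (large x).
    return c > a and c * d > a * b


print(strictly_dominates((5, 10), (1, 1)))   # True
print(strictly_dominates((1, 10), (2, 3)))   # False: (2, 3) is better for small x
```

Applying this test to every pair would be O(n^2), which is why the steps that follow use a sort and a single sweep instead.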

1 - Sort tuples lexicographically. What I mean by lexicographically is first compare their first coordinates, and if they are equal, then compare the second ones. For example, a sorted array might look like this:

(1, 5)
(1, 17)
(2, 9)
(4, 3)
(4, 4)

2 - Iterate over the sorted array in reverse order and keep track of the largest value of a*b encountered so far. Let's call this value M. Now, assume the element being processed in the loop is (a, b). If a*b < M, we remove this element, because some (c, d) processed earlier has both c >= a and c*d > a*b, and thus (a, b) is never better than (c, d). After this step, the example array becomes:

(2, 9)
(4, 4)

(4, 3) was deleted because it is dominated by (4, 4). (1, 17) and (1, 5) were deleted because they are dominated by (2, 9).
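A minimal Python sketch of steps 1 and 2, assuming the tuples are plain `(a, b)` pairs of positive numbers. The name `prune_dominated` is chosen here for illustration only.

```python
def prune_dominated(pairs):
    """Sort lexicographically, sweep in reverse, and drop pairs with a*b < M.

    Returns the surviving pairs in ascending lexicographic order.
    """
    kept = []
    best_product = float("-inf")  # M in the text: largest a*b seen so far
    for a, b in sorted(pairs, reverse=True):
        if a * b < best_product:
            continue  # an earlier (c, d) has c >= a and c*d > a*b
        best_product = a * b
        kept.append((a, b))
    kept.reverse()  # back to ascending order, as in the example listing
    return kept


print(prune_dominated([(1, 5), (1, 17), (2, 9), (4, 3), (4, 4)]))
# [(2, 9), (4, 4)]
```

The sort dominates the cost, so the whole step is O(n*log(n)).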

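Once we get rid of all the functions that are never the maximum for any x, the graphs of the remaining ones form an upper envelope in which each function takes over from the previous one at their intersection point (the original answer showed a graph here).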

As seen in the graph, every function is the maximum from the point where it intersects the one before it to the point where it intersects the one after it. For the example above, (4, 4) and (2, 9) intersect at x = 8 (where 4*4 = 2*x). So (4, 4) is the maximum until x = 8, and after that point (2, 9) is the maximum. We want to calculate the points where consecutive functions in the array intersect, so that for a given x we can binary-search on these points to find which function returns the maximum value.
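The intersection points and the binary search could look roughly like this. It is a sketch under the same assumptions (positive a and b), with hypothetical names `build_envelope` and `best_pair`. The surviving functions are stored in descending order of a, so that the function that is best for small x comes first, and the intersection points are stored as floats.

```python
from bisect import bisect_left


def build_envelope(pairs):
    """Prune dominated pairs and compute where consecutive survivors intersect."""
    funcs = []
    best_product = float("-inf")
    for a, b in sorted(pairs, reverse=True):  # descending a, then descending b
        if a * b >= best_product:
            best_product = a * b
            funcs.append((a, b))
    # funcs[i] has flattened out at a1*b1 when funcs[i + 1] is still rising as a2*x,
    # so they meet where a1*b1 == a2*x.
    breakpoints = [a1 * b1 / a2 for (a1, b1), (a2, _) in zip(funcs, funcs[1:])]
    return funcs, breakpoints


def best_pair(funcs, breakpoints, x):
    """Return a pair that maximizes a*min(b, x), in O(log n)."""
    return funcs[bisect_left(breakpoints, x)]


funcs, breakpoints = build_envelope([(1, 5), (1, 17), (2, 9), (4, 3), (4, 4)])
print(breakpoints)                        # [8.0]
print(best_pair(funcs, breakpoints, 5))   # (4, 4)
print(best_pair(funcs, breakpoints, 9))   # (2, 9)
```

At x equal to a breakpoint both neighbours give the same value, so it does not matter which one `bisect_left` selects.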

The key to efficiency is to avoid useless work. If you imagine a decision tree, pruning branches is a term often used for that.

For your case, the decision-making is based on choosing between two functions (or tuples of parameters). To choose between two functions, you determine the value of x at which they give the same result. One of them performs better for smaller values, the other for larger values. Also, don't forget that one function may always perform better than the other. In that case, the one performing worse can be removed completely (see above: avoiding useless work).

Using this approach, you can map from this switchover point to the function on the left. Finding the optimal function for an arbitrary value just requires finding the next higher switchover point.

BTW: Make sure you have unit tests in place. These things are fiddly, especially with floating point values and rounding errors, so you want to make sure that you can just run a growing suite of tests to make sure that one small bugfix didn't break things elsewhere.
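One way to set up such a test is to compare the fast query against a brute-force reference on many small random inputs. The sketch below assumes a breakpoint-based implementation like the one described in the earlier answer; every name in it (`brute_force_max`, `build_envelope`, `fast_max`, `run_random_checks`) is made up for illustration. It uses integer inputs so that the compared values are exact.

```python
import random
from bisect import bisect_left


def brute_force_max(pairs, x):
    """Reference answer: evaluate every function (O(n) per query)."""
    return max(a * min(b, x) for a, b in pairs)


def build_envelope(pairs):
    """Candidate under test: pruned functions plus their switchover points."""
    funcs, best_product = [], float("-inf")
    for a, b in sorted(pairs, reverse=True):
        if a * b >= best_product:
            best_product = a * b
            funcs.append((a, b))
    switchovers = [a1 * b1 / a2 for (a1, b1), (a2, _) in zip(funcs, funcs[1:])]
    return funcs, switchovers


def fast_max(envelope, x):
    funcs, switchovers = envelope
    a, b = funcs[bisect_left(switchovers, x)]
    return a * min(b, x)


def run_random_checks(trials=200, seed=0):
    """Compare fast_max with brute_force_max on small random integer inputs."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for _ in range(trials):
        n = rng.randint(1, 8)
        pairs = [(rng.randint(1, 10), rng.randint(1, 10)) for _ in range(n)]
        envelope = build_envelope(pairs)
        for x in range(0, 102):  # covers every possible switchover (at most 100)
            assert fast_max(envelope, x) == brute_force_max(pairs, x), (pairs, x)
    return trials


run_random_checks()
```

Small value ranges make ties and duplicate tuples frequent, which is where this kind of code tends to break.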

I think you should sort the array by b first and then by a. Then, for every x, use binary search to find the position that separates the tuples with b <= x (where min(b, x) is b) from the tuples with b > x (where min(b, x) is x). For the tuples with b > x the value is simply a*x, so you can pick one tuple from that part as a candidate and compute its value with the function, but for the tuples whose b is less than x you still have to traverse them. I'm not sure, but that's what I can think of.

After pre-processing the data, it's possible to calculate this maximum value in O(log(n)) time, where n is the number of tuples (a, b).

First, let's look at a slightly simpler question: you have a list of pairs (c, b), and you want to find the one with the largest value of c, subject to the condition that b<=x, and you want to do this many times for different values of x. For example, take the following list:

 c   b
------
11  16
 8  12
 2   6
 7   9
 6  13
 4   5

With this list, if you ask with x=10, the available values of c are 2, 7 and 4, and the maximum is 7.

Let's sort the list by b:

 c   b
------
 4   5
 2   6
 7   9
 8  12
 6  13
11  16

Of course, some rows in this list can never give an answer. For example, we can never use the c=2, b=6 row in an answer, because if 6<=x then 5<=x, so we can use the c=4 row to get a better answer. So we might as well get rid of rows like that, i.e. all rows for which the value of c is not the highest so far. That whittles the list down to this:

 c   b
------
 4   5
 7   9
 8  12
11  16

Given this list, with an index on b, it's easy to find the highest value of c. All you have to do is find the highest value of b in the list which is <=x, then return the corresponding value of c.
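As a rough sketch, the whittled-down list and the lookup can be built with a sort and the standard `bisect` module. The function names are invented for this example, and rows are assumed to be `(c, b)` tuples.

```python
from bisect import bisect_right


def build_prefix_max_index(rows):
    """Sort rows (c, b) by b and keep only rows whose c is the highest so far."""
    bs, cs = [], []
    # For equal b, visit the larger c first so only that row is kept.
    for c, b in sorted(rows, key=lambda r: (r[1], -r[0])):
        if not cs or c > cs[-1]:
            bs.append(b)
            cs.append(c)
    return bs, cs


def max_c_with_b_at_most(bs, cs, x):
    """Largest c among rows with b <= x, or None if there is no such row."""
    i = bisect_right(bs, x)  # number of kept rows with b <= x
    return cs[i - 1] if i else None


rows = [(11, 16), (8, 12), (2, 6), (7, 9), (6, 13), (4, 5)]
bs, cs = build_prefix_max_index(rows)
print(list(zip(cs, bs)))                 # [(4, 5), (7, 9), (8, 12), (11, 16)]
print(max_c_with_b_at_most(bs, cs, 10))  # 7
```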

Obviously, if you change the question so that you only want the values with b>=x (instead of b<=x), you can do exactly the same thing.

Right. So how does this help with the question you asked?

For a given value of x, you can split the question into 2 questions. If you can answer both of these questions then you can answer the overall question:

  1. Of the pairs (a, b) with b<=x , which one gives the highest value of f(x,a,b) = a*b ?
  2. Of the pairs (a, b) with b>=x , which one gives the highest value of f(x,a,b) = a*x ?

For (1), simply let c=a*b for each pair and then go through the whole indexing rigmarole outlined above.

For (2), let c=a and do the indexing thing above, but flipped round to do b>=x instead of b<=x; when you get your answer for a, don't forget to multiply it by x.
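Putting (1) and (2) together might look like the sketch below. It assumes a non-empty list of `(a, b)` tuples with positive entries; `build_indexes` and `best_pair` are illustrative names, and the indexes store the tuples themselves so that the query can return the winning tuple rather than only its value.

```python
from bisect import bisect_left, bisect_right


def build_indexes(pairs):
    """Build the two pruned, b-sorted lists used to answer questions (1) and (2)."""
    # (1) ascending b: keep a pair only if its a*b is the highest so far.
    low = []
    for a, b in sorted(pairs, key=lambda p: (p[1], -p[0])):
        if not low or a * b > low[-1][0] * low[-1][1]:
            low.append((a, b))
    # (2) descending b: keep a pair only if its a is the highest so far.
    high = []
    for a, b in sorted(pairs, key=lambda p: (-p[1], -p[0])):
        if not high or a > high[-1][0]:
            high.append((a, b))
    high.reverse()  # ascending b, so bisect can be used on it
    return low, [b for _, b in low], high, [b for _, b in high]


def best_pair(indexes, x):
    """Return a pair maximizing a*min(b, x) using two binary searches."""
    low, low_bs, high, high_bs = indexes
    candidates = []
    i = bisect_right(low_bs, x)
    if i:  # best a*b among pairs with b <= x
        candidates.append(low[i - 1])
    j = bisect_left(high_bs, x)
    if j < len(high):  # best a among pairs with b >= x
        candidates.append(high[j])
    return max(candidates, key=lambda p: p[0] * min(p[1], x))


indexes = build_indexes([(1, 10), (2, 3)])
print(best_pair(indexes, 4))  # (2, 3)
print(best_pair(indexes, 7))  # (1, 10)
```

Each query does two binary searches and compares at most two candidates, which matches the O(log(n)) bound claimed at the start of this answer.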
