find lowest index of a given value in a presorted array

Question

Hey I had this question in an interview and was wondering what was the best way to solve it. So say you are given an array that is already sorted and you want to find the lowest index of some value x.

Here is a python/pseudocode of what I came up with, I am just wondering if there is a better way to go about it?

def findLowestIndex(arr, x):
     index = binarySearch(0, len(arr), x)
     if index != -1:
         while index > 0:
           if arr[index] == arr[index-1]:
              index -= 1
           else:
              break
     return index

Thanks!

Answer 1

Your method takes linear time in the worst case, which is when the count of x s in the array is O( n ).

An O(lg n ) solution can be obtained by changing the binary search itself to find the first x in the array instead of just any one of them:

def binary_search(x, a):
    lo = 0
    hi = len(a)

    while lo < hi:
        mid = (lo + hi) // 2

        if a[mid] < x:
            lo = mid + 1
        elif a[mid] > x:
            hi = mid
        elif mid > 0 and a[mid-1] == x:
            hi = mid
        else:
            return mid

    return -1

Answer 2

import bisect
l = [1,2,3,4,4,4,4,4,5,5,5,5,5,6,6,7,7,7,8,8]
bisect.bisect_left(l, 4)

EDIT: I just miss a thing. the bisect will give you a insert point. so if x is not in the list you will still have a result index. So you need to check if x is in the list first:

if x in l:
    ....

but for interview question, they may want to see how you come up with the algorithm instead of using the library...

Answer 3

If the elements are integers - or enumerated, you can do it a bit faster:

Note that in binary search [the algorithm, not the python function], if an element doesn't exist - you can find the smallest element that is bigger then the index.

first search for x - and get the index, let it be i .
next, search for x-1 . If it is not in the list, the binary search can find you the first index if x .
If it is in the list, let the index found be j :
- Do a binary search on the sublist from j to i , and search for an element such that list[k] < list[k+1]

For not enumerated values , it can also be done by the same idea of decreasing ranges while list[k] < list[k+1] and list[k+1] == x but I find it simpler to understand first how it is done for integers, and then applying it for general solution.

Note that this solution is O(logn) , while the trivial solution you propose is O(n) , in list with a lot of dupes, because of the iterative step after the binary search.

Answer 4

A recursive version, if anyone's interested...

Rewrite from https://github.com/reneargento/algorithms-sedgewick-wayne/blob/master/src/chapter1/section4/Exercise10.java , a exercise from Algorithms 4th .

def binary_search(key, a, low, high):
    if low > high:
        return -1;
    middle = low + (high - low) / 2;
    if a[middle] < key:
        return binary_search(key, a, middle + 1, high)
    elif a[middle] > key:
        return binary_search(key, a, low, middle - 1)
    else:
        if (middle - 1 >= 0) and (a[middle - 1] != key):
            return middle
        else:
            index = binary_search(key, a, 0, middle - 1)
            if(index == -1):
                return middle;
            else:
                return index;

a = [1,2,3,3,3,3,4,5,6,7,7,8,9]

print(binary_search(7, a, 0, len(a)))

Doesn't recursive version always look more straightforward than non-recursive version? Why does this looks much harder..? Can anyone write a better recursive version :D?

Answer 5

If x is not in X such that f(x) = v then the answer is trivial: Binary search to find that out.

If there is one x such that f(x) = v then the answer is also trivial: Binary search to find that out.

The problem is only interesting if there are multiple x 's such that f(x) = v . If there are a constant number of x 's then algorithmically a binary search is optimal. Just binary search and check lower indices sequentially.

What if, though, there are a lot of these x 's? A sequential search like that is obviously not optimal. In fact, if there are c * |X| x 's then this runs in O(|X|) .

Instead what could be done is initialize lbound to 0 and binary search until you find the element, at i , where every time you go right, update lbound to the mid that was just used. Then binary search from [lbound, i - 1] . Do this until i == lbound or you don't find an element. If the former occurs, the desired index is 0 . If the latter occurs, the desired index is the previous i used. The worst case is the desired index is 0 .

The interesting thing is this still runs in log(|X|) time, I think.

Answer 6

deferred detection of equality approach in binary search gives the smallest index, reducing equality branches.

def binary_search(low, high, target, array):
    while low < high:
        mid = low + (high - low) / 2
        if a[mid] < target:
            low = mid + 1
        else:
            high = mid

    if (array[low] == target) return low
    else return -1

Answer 7

I'd bet good money that gddc's comment is the fastest answer for python. Otherwise, your general algorithm is correct, except for the fact that in some cases you can beat the O(log n) behavior of binary search. Specifically, in the case of integers the best worst-case behavior you can get is O(sqrt(log n)): https://stackoverflow.com/questions/4057258/faster-than-binary-search-for-ordered-list

Answer 8

修改二进制搜索以首次查找x的任何出现。

find lowest index of a given value in a presorted array

Question

8 answers

solution1
7 ACCPTED 2012-04-13 22:14:15

solution2
3 2012-04-13 22:08:22

solution3
1 2012-04-13 22:14:31

solution4
1 2018-12-03 13:48:41

solution5
0 2012-04-13 22:16:04

solution6
0 2014-01-02 15:14:34

solution7
-1 2012-04-13 22:13:14

solution8
-1 2012-04-14 09:34:10

find lowest index of a given value in a presorted array

Question

8 answers

solution1 7 ACCPTED 2012-04-13 22:14:15

solution2 3 2012-04-13 22:08:22

solution3 1 2012-04-13 22:14:31

solution4 1 2018-12-03 13:48:41

solution5 0 2012-04-13 22:16:04

solution6 0 2014-01-02 15:14:34

solution7 -1 2012-04-13 22:13:14

solution8 -1 2012-04-14 09:34:10

solution1
7 ACCPTED 2012-04-13 22:14:15

solution2
3 2012-04-13 22:08:22

solution3
1 2012-04-13 22:14:31

solution4
1 2018-12-03 13:48:41

solution5
0 2012-04-13 22:16:04

solution6
0 2014-01-02 15:14:34

solution7
-1 2012-04-13 22:13:14

solution8
-1 2012-04-14 09:34:10