简体   繁体   中英

O(n log m) vs O(n+m) - which is better?

Hi Which of this time complexity is better.

O(n log m) vs O(n + m)

Analyzing these 2 algorithms, which one is better to use practicall on a large scale?

eg) if n and m are exponentially large it becomes

O(2e) vs O(e*(something linear to e))

Sample Problem: Common elements in 2 arrays:

1) 2 pointer approach

2) Using Binary search

I think what you're asking is how best to find the common elements in two sorted arrays. You have the choice of two algorithms:

  1. The "two-pointer" approach, which is O(m + n).
  2. Using binary search to determine if the items in array N are in array M. This is O(n log m).

If the arrays are close to the same size, then the first algorithm, which is strictly linear, is preferable.

If one of the arrays is much larger than the other, then you might want to consider the binary search approach. You'll want to search the larger array for items that are in the smaller array. For example, if you have the arrays M and N, where M has 1,000,000 items and N has 100 items, you have a choice:

  • Look for M in N (ie do 1,000,000 searches on an array of 100)
  • Look for N in M (ie do 100 searches on an array of 1,000,000)

If you look for M in N, then the complexity is O(m log n). Here, m=1000000 , and log(n)=7 (approximately)

If you look for N in M, then the complexity is O(n log m). n=100 and log(m)=20 (approximately).

It's pretty clear in that case which you want to do.

In practice, the cutoff between whether you should use the O(m+n) or O(n log m) (where n is smaller than m) algorithm can only be determined empirically. It's not just a matter of figuring whether (m + n) < (n log m) , because binary search involves a bit of overhead that the two-pointer method doesn't. It's quite likely that the two pointer method would be faster even if (m + n) was double or triple (n log m) .

Neither is unambiguously better, because it depends on the relative values of n and m .

If you assume they're equal, you have O(n log n) vs O(n) , so the second one ( O(n + m) ) is faster. If, on the other hand, n is effectively constant while m grows quickly, then you're looking at O(log m) vs O(m) , so the first one is better.

Basically, the first is better for large m , and the second is better for cases where they're both is large. (If n dominates the equation then they're both just linear.)

In summary, it comes out to be the values of m+n versus (one of mlogn or nlogm) depending upon the lengths of the two arrays you are choosing. Though Big-O complexity analysis does not account for the "base" of logarithm, for problems like these where the values are determining better complexity, a point to note is the base of logarithm is "2" but not "10" where specific examples are taken to calculate the number of elements looked at or the number of comparisons done.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM