What is the complexity evaluation of O(m * n) + O((m + n) * log(m + n))

I have the following Python code, where a and b are lists (I know this isn't the best way of getting an intersection):

def get_intersection(a, b):
    a = set(a)
    b = set(b)
    c = sorted(list(a&b))
    return c

Let m = len(a) and n = len(b); there is no additional information about a and b. Then the time complexity of the given code is O(m) + O(n) + O(m * n) + O((m + n) * log(m + n)).

I can definitely drop O(m) and O(n), because they are dominated by O(m * n), but what should I do with O((m + n) * log(m + n))?

How do I compare O(m * n) and O((m + n) * log(m + n))? Should I keep O((m + n) * log(m + n)) in the final evaluation?

You can treat the total input size as n; it doesn't really matter which argument contributes what to that total. (The two extremes are when one or the other argument is empty; moving items from one argument to the other doesn't change the overall amount of work you'll be doing.)

As such, both set(a) and set(b) are O(n) operations.

a & b is also O(n); you don't need to compare every element of a to every element of b to compute the intersection, because sets are hash-based. You basically just make O(n) constant-time lookups. (I am ignoring the horrific corner case that makes set lookup linear. If your data has a hash function that doesn't map every item to the same value, you won't hit the worst case.)
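To make the hash-lookup argument concrete, here is a minimal sketch of how an intersection can be computed with one membership test per element. The name manual_intersection is hypothetical, and this illustrates the idea only; it is not CPython's actual implementation.

```python
def manual_intersection(x, y):
    """Intersect two sets using one hash lookup per element of the smaller set."""
    # Iterate over the smaller set so the number of lookups is min(len(x), len(y)).
    small, large = (x, y) if len(x) <= len(y) else (y, x)
    result = set()
    for item in small:
        if item in large:  # average-case O(1) membership test on a hash set
            result.add(item)
    return result


print(manual_intersection({1, 2, 3}, {2, 3, 4, 5}))
```

No element of one set is ever compared against every element of the other, which is why the O(m * n) term only appears in the degenerate case where lookups stop being constant-time.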

sorted(a&b) (no need to create a list first, but that's also just an O(n) operation) takes O(n lg n).

Because the preceding operations are performed in sequence, their costs add up, so the total complexity of get_intersection is O(n lg n), dominated by the sort.
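As a rough illustration of that step-by-step accounting, the function from the question can be annotated with the average-case cost of each line. The name get_intersection_annotated and the local variable names are hypothetical, and the costs assume a well-behaved hash function.

```python
def get_intersection_annotated(a, b):
    set_a = set(a)           # O(len(a)) on average: each element is hashed once
    set_b = set(b)           # O(len(b)) on average
    common = set_a & set_b   # about one O(1) lookup per element of the smaller set
    return sorted(common)    # O(k log k), where k = len(common) <= min(len(a), len(b))


print(get_intersection_annotated([3, 1, 2, 3], [2, 3, 4]))
```

Every cost above is bounded by the total input size n or by n lg n, and the sort is the only step that can reach n lg n.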

I don't think that you can simplify the expression.

Indeed, if you set m to a constant value, say 5, you have a complexity of

5n + (n+5)log(n+5) = O(n log(n))

and the first term is absorbed.

But if you set m = n, you get

n² + 2n log(2n) = O(n²)

and this time the second term is absorbed. Since neither term dominates the other for every choice of m and n, all you can write is

O(mn + (m+n) log(m+n)).

Or, with a change of variables such that s = m + n is the sum and p = m * n is the product,

O(p + s log(s)).
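A quick numeric check can make the two regimes visible. The sketch below evaluates both terms for a fixed m = 5 and for m = n; the helper name cost_terms and the sample sizes are arbitrary choices made for illustration, and the values are term sizes, not measured running times.

```python
import math


def cost_terms(m, n):
    """Return the two terms of O(m*n + (m+n) log(m+n)) as plain numbers."""
    product_term = m * n
    sum_term = (m + n) * math.log2(m + n)
    return product_term, sum_term


for m, n in [(5, 10**6), (10**6, 10**6)]:
    p, s = cost_terms(m, n)
    larger = "m*n" if p > s else "(m+n) log(m+n)"
    print(f"m={m}, n={n}: m*n={p:.3g}, (m+n)*log2(m+n)={s:.3g}, larger term: {larger}")
```

With m fixed at 5 the logarithmic term is the larger one, while with m = n the product term is, which matches the claim that neither term can be dropped in general.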
