Haskell length runtime: O(1) or O(n)?

I was working on a Haskell assignment and I was trying to think of ways to make my code faster. For example, my factors function below finds the number of divisors of some integer.

factors :: Int -> Int
factors x = length [n | n <- [1..x], mod x n == 0]

However, it occurred to me that I could make my code faster by avoiding the use of "length".

factors :: Int -> Int
factors x = go 1
  where
    go :: Int -> Int
    go i
      | i == x        = 1
      | mod x i == 0  = 1 + go (i + 1)
      | otherwise     = go (i + 1)

I was wondering if Haskell's length function is O(n) like strlen() in C or O(1) like String.length() in Java.

Also, is there a better or more efficient way of writing my code?

In my estimation, contrary to the accepted answer, you can in fact infer the complexity of length (and many other functions) just by looking at the definition of [a]:

Prelude> :info []
data [] a = [] | a : [a]    -- Defined in ‘GHC.Types’

Lists are inductively defined; you can see from that definition (which is almost just regular Haskell) that at the top level a list is either the constructor [] or the constructor (:). Clearly length must recurse n times on this structure, and so it has to be O(n).
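To make the argument concrete, here is a minimal sketch of a hand-written length that follows the two constructors directly. The name myLength is hypothetical, and this is not how GHC's library defines length; it only illustrates why one step per (:) cell is unavoidable.

```haskell
-- Hypothetical hand-rolled length: one equation per list constructor.
myLength :: [a] -> Int
myLength []       = 0                -- the empty list has no cells
myLength (_ : xs) = 1 + myLength xs  -- one step for each (:) cell
```

Each (:) cell is visited exactly once, so a list with n elements takes n recursive steps.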

It's very important to be able to reason at least intuitively in this way, in particular about lists, which are ubiquitous. E.g., quick: what's the complexity of (!!)?
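One way to check your answer is to write indexing out by hand. The sketch below uses a hypothetical name, indexAt, and returns Nothing for an out-of-range index (the real (!!) raises an error instead); it is an illustration, not the library definition.

```haskell
-- Hypothetical re-implementation of list indexing.
-- Unlike (!!), it returns Nothing instead of raising an error.
indexAt :: [a] -> Int -> Maybe a
indexAt _        k | k < 0 = Nothing              -- negative index
indexAt []       _         = Nothing              -- ran off the end
indexAt (y : _)  0         = Just y               -- found the cell
indexAt (_ : ys) k         = indexAt ys (k - 1)   -- skip one cell
```

Reaching position k means walking past k cells, so indexing is O(k), which is O(n) in the worst case.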

If you want to do a deep dive into formally reasoning about time complexity in the presence of laziness, then you'll need to pick up "Purely Functional Data Structures" by Okasaki.

Also, is there a better or more efficient way of writing my code?

Integer factorization is one of the most famous problems. Many algorithms have surely been proposed for it, though I am not expert enough to recommend one (CS.SE is around the corner and can help with that, if needed). None of these proposals runs in polynomial time (measured in the number of digits of the input), but that doesn't stop them from being faster than the trivial approach.

Even without looking at the literature, a few simple optimizations can be found.

  • The original code scans the whole list [1..x], but this is not needed. We could stop at sqrt x, since every divisor above sqrt x is paired with a cofactor below it.

  • Even more: after we find a divisor m, we could divide x by m (as many times as possible), and recurse with this new number. E.g. if x = 1000, after we try m = 2 we compute 1000 -> 500 -> 250 -> 125, and then look for the new divisors (larger than 2) in 125. Note how this made the number much smaller.

I will leave implementing these strategies in Haskell as an exercise :-P
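For readers who want to compare notes, here is one possible sketch of that exercise. The names countDivisors and divideOut are hypothetical, and the code assumes x >= 1 and ignores Int overflow in m * m. It also relies on a standard fact not stated above: if x is a product of prime powers p^e, the number of divisors is the product of the (e + 1) terms.

```haskell
-- Count the divisors of x by trial division up to sqrt of the remaining number.
-- Assumption: x >= 1 (anything else is rejected with an error).
countDivisors :: Int -> Int
countDivisors x
  | x < 1     = error "countDivisors: expects a positive integer"
  | otherwise = go x 2
  where
    go n m
      | n == 1    = 1                          -- nothing left to factor
      | m * m > n = 2                          -- remaining n is prime: exponent 1
      | e > 0     = (e + 1) * go n' (m + 1)    -- m divides n exactly e times
      | otherwise = go n (m + 1)
      where (e, n') = divideOut n m

-- Divide n by m as many times as possible.
-- Returns the exponent and what is left of n.
divideOut :: Int -> Int -> (Int, Int)
divideOut n m
  | n `mod` m == 0 = let (e, n') = divideOut (n `div` m) m in (e + 1, n')
  | otherwise      = (0, n)
```

For x = 1000 the search divides out 2 three times (leaving 125), then divides out 5 three times (leaving 1), and the result is (3 + 1) * (3 + 1) = 16 divisors.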

From a theoretical perspective, we cannot know whether length is θ(n). We know that it is O(n), but it is technically possible that a Haskell implementation computes it faster for known lists, since a Haskell compiler is free to implement lists in whatever way it wants.

Nevertheless, it does not matter, since even in that case generating the list in the first place will take θ(n).

Note that even if the compiler uses a more dedicated data structure, Haskell is lazy, so your list comprehension does not produce a complete list up front; it behaves more like a function that generates the list lazily, on demand.
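A small sketch of what that laziness means in practice, using a hypothetical helper firstDivisors: only as much of the comprehension is evaluated as the consumer asks for.

```haskell
-- Hypothetical helper: the first k divisors of x, in increasing order.
-- The comprehension is only evaluated as far as take demands.
firstDivisors :: Int -> Int -> [Int]
firstDivisors k x = take k [n | n <- [1 .. x], x `mod` n == 0]
```

For example, firstDivisors 3 1000000000000 stops as soon as it has found 1, 2 and 4, without walking the whole range. length, in contrast, has to consume the entire list, so it forces every element to be generated.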

Finally, if we evaluated the list comprehension eagerly, it would still require O(n) to generate the list in the first place. So even if obtaining the length were very fast, generating the list would take at least linear time. Regardless of how efficient length is, the algorithm will still scale linearly with the input.

Your own implementation is again O(n) (and, to be honest, is not very safe: for instance, go only stops when i == x, which it will not reach in practice for x < 1). Nevertheless, you can easily speed up counting the factors of a number to O(sqrt n):

factors :: Int -> Int
factors x = go 1
  where
    go :: Int -> Int
    go i | i2 > x = 0
         | i2 == x = 1
         | mod x i == 0 = 2 + go (i+1)
         | otherwise = go (i + 1)
        where i2 = i*i

Here we enumerate i from 1 to sqrt(x). Each time we find a factor a, we know that there is a cofactor b = x/a. As long as a is not equal to sqrt(x), we know that the two are different, so we count both. In case a is equal to sqrt(x), we know that a is equal to b, and thus we count it only once.
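As a quick sanity check, the square-root version can be compared against the straightforward list-comprehension version on a range of inputs. The names factorsNaive, factorsSqrt and agree are hypothetical, introduced only for this sketch; the two function bodies are the ones shown earlier in this thread.

```haskell
-- The O(n) version from the question.
factorsNaive :: Int -> Int
factorsNaive x = length [n | n <- [1 .. x], x `mod` n == 0]

-- The O(sqrt n) version from this answer.
factorsSqrt :: Int -> Int
factorsSqrt x = go 1
  where
    go :: Int -> Int
    go i | i2 > x         = 0             -- passed sqrt x: no more pairs
         | i2 == x        = 1             -- i is exactly sqrt x: count once
         | x `mod` i == 0 = 2 + go (i + 1) -- count i and its cofactor
         | otherwise      = go (i + 1)
      where i2 = i * i

-- True if both versions agree on every x from 1 to 500.
agree :: Bool
agree = all (\x -> factorsNaive x == factorsSqrt x) [1 .. 500]
```

Evaluating agree in GHCi gives a cheap way to catch off-by-one mistakes around perfect squares such as 36, where the divisor 6 must be counted once rather than twice.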

That being said, there are definitely faster ways to do it. It is a topic with a lot of research that has yielded more efficient algorithms. I'm not suggesting that the above is the fastest, but it is definitely a huge improvement in terms of time complexity.

Building the list with the list comprehension already takes O(n). Therefore using length, which should have complexity O(n) in the worst case, adds little overhead.
