Are Haskell List Comprehensions Inefficient?

I started doing Project Euler and got to problem number 9. Since I was using Project Euler to learn Haskell, I decided to use list comprehensions (as shown in Learn You A Haskell). I did that, and GHCi took a while to find the triplet, which I figured was normal because of the calculations involved. Then, at work yesterday (I don't work as a programmer professionally, yet), I was talking to a friend who knows VBA, and he wanted to try to find the answers in VBA. I thought it would be a fun challenge as well, so I churned out some basic for loops and if statements, but what got me was that it was much faster than the Haskell version.

My question is: are Haskell's list comprehensions incredibly inefficient? At first I thought it was just because I was in GHC's interactive mode, but then I realized VBA is interpreted too.

Please note, I didn't post my code because it is an answer to a Project Euler problem. If it will help answer my question (as in, I'm doing something wrong), then I will gladly post the code.

[edit] Here is my Haskell list comprehension:
[(a,b,c) | c <- [1..1000], b <- [1..c], a <- [1..b], a+b+c==1000, a^2+b^2==c^2]
I guess I could've lowered the range on c, but is that what is really slowing it down?

There are two things you could be doing with this problem that could make your code slow. One is how you are trying values for a, b and c. If you loop through all possible values for a, b, c from 1 to 1000, you'll be spending a long time. To give a hint, you can make use of a+b+c=1000 if you rearrange it for c. The other is that if you only use a list comprehension, it will process every possible value for a, b and c. The problem tells you that there is only one unique set of numbers that satisfies the problem, so if you change your answer from this:

[ a * b * c | .... ]

to:

head [ a * b * c | ... ]

then Haskell's lazy evaluation means that it will stop after finding the first answer. This is the Haskell equivalent of breaking out of your VBA loop when you find the first answer. When I used both these tips, I had an answer that completed very quickly (under a second) in GHCi.
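To make the laziness point concrete, here is a minimal sketch. It keeps the brute-force generators from the question but takes the target sum as a parameter, so it can be tried on a small case. The names `bruteForce` and `firstTriple` are made up for this example, and `listToMaybe` is used instead of `head` only so that an empty result does not crash.

```haskell
import Data.Maybe (listToMaybe)

-- All (a, b, c) with a <= b <= c, a + b + c == n and a^2 + b^2 == c^2,
-- using the same generator order as the comprehension in the question.
-- (The parameter n is an assumption of this sketch; the question fixes it at 1000.)
bruteForce :: Int -> [(Int, Int, Int)]
bruteForce n =
  [ (a, b, c)
  | c <- [1 .. n], b <- [1 .. c], a <- [1 .. b]
  , a + b + c == n
  , a^2 + b^2 == c^2
  ]

-- Because lists are lazy, only as much of the comprehension is evaluated
-- as is needed to produce the first match (or to prove there is none).
firstTriple :: Int -> Maybe (Int, Int, Int)
firstTriple = listToMaybe . bruteForce
```

In GHCi, `firstTriple 12` gives `Just (3,4,5)`: evaluation stops as soon as that triple turns up, instead of running through the remaining values of c.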

Addendum: I missed at first the condition a < b < c. You can also make use of this in your list comprehensions; it is valid to say things along the lines of:

[(a, b) | b <- [1..100], a <- [1..b-1]]

Consider this simplified version of your list comprehension:

[(a,b,c) | a <- [1..1000], b <- [1..1000], c <- [1..1000]]

This will give all possible combinations of a, b, and c. It's kind of like asking, "how many ways can three one-thousand-sided dice land?" The answer is 1000*1000*1000 = 1,000,000,000 different combinations. If it took 0.001 seconds to generate each combination, it would therefore take 1,000,000 seconds (~11.5 days) to finish all combinations. (OK, 0.001 seconds is actually pretty slow for a computer, but you get the idea.)

When you add predicates to your list comprehension, it still takes the same amount of time to compute; in fact, it takes longer since it needs to check the predicate for each of the 1 billion combinations it computes.

Now consider your comprehension. It looks like it should be much faster, right?

[(a,b,c) | c <- [1..1000], b <- [1..c], a <- [1..b], a+b+c==1000, a^2+b^2==c^2]

There are 1000 choices for c. How many are there for b and a? Well, the average choice for c is 500. For all choices of c, then, there are an average of 500 choices for b (since b can range from 1 to c). Likewise, for all choices of c and b, there are an average of 250 choices for a. That's very hand-wavy, but it should be in the right ballpark. So 1000 choices for c * 1000/2 choices for b * 1000/4 choices for a = 1 billion / 8 = 125 million, or on the order of 100 million. It's about 8x faster, but if you paid attention, you'll notice it's actually the same big-Oh complexity as the simplified version above. If we compared "simplified" vs "improved" versions of the same problem, but from [1..100000] instead of [1..1000], the "improved" would still only be about 8x faster than the "simplified".
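The estimate is easy to sanity-check by counting what the three generators produce. The sketch below (function names are made up for illustration) counts the combinations by enumeration and with the closed form n(n+1)(n+2)/6, which is the number of ways to pick a <= b <= c from 1..n. For n = 1000 that comes to 167,167,000, so the constant is nearer 1/6 than 1/8, but the count still grows as n^3, which is the point being made.

```haskell
-- Number of (a, b, c) combinations produced by the generators
--   c <- [1..n], b <- [1..c], a <- [1..b]
-- counted the slow way, by actually enumerating them.
countByEnumeration :: Int -> Int
countByEnumeration n =
  length [ () | c <- [1 .. n], b <- [1 .. c], _a <- [1 .. b] ]

-- The same count in closed form: n(n+1)(n+2)/6.
-- The product of three consecutive integers is always divisible by 6.
countClosedForm :: Int -> Int
countClosedForm n = n * (n + 1) * (n + 2) `div` 6
```

Enumerating n = 1000 takes a noticeable while in GHCi, so it is more comfortable to compare the two functions at something like n = 100 and use the closed form for the full size.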

Don't get me wrong, 8x is a wonderful constant-factor speedup. But unless you want to wait a couple of hours to get the solution, you'll need to get a better big-Oh.

As Neil noted, the way to reduce the complexity of this problem is, for a given b and c, to choose the a that satisfies a+b+c=1000. That way, you're not trying a bunch of values of a that will fail. This will drop the big-Oh complexity; you'll only be considering approximately 1000 * 500 * 1 = 500,000 combinations, instead of ~100,000,000.
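A rough sketch of that idea, again with the target sum as a parameter so it can be tried on small inputs (the name `fastTriples` and the parameter n are assumptions of this example, not part of the original problem statement):

```haskell
-- Triples (a, b, c) with a < b < c, a + b + c == n and a^2 + b^2 == c^2.
-- Only c and b are searched; a is computed, so the work is quadratic in n
-- rather than cubic.
fastTriples :: Int -> [(Int, Int, Int)]
fastTriples n =
  [ (a, b, c)
  | c <- [1 .. n]
  , b <- [1 .. c - 1]
  , let a = n - b - c        -- the only a that can satisfy a + b + c == n
  , a >= 1, a < b            -- keep a positive and below b
  , a * a + b * b == c * c
  ]
```

For example, `fastTriples 12` gives `[(3,4,5)]` and `fastTriples 30` gives `[(5,12,13)]`. Wrapping the call in `head` or `take 1` adds the early exit on top of the smaller search space.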

Once you get the solution to the problem, you can check out other people's Haskell solutions on the Project Euler site to get an idea of how they solved it. Incidentally, here is a link to the referenced problem: http://projecteuler.net/index.php?section=problems&id=9

In addition to what everyone else has said about generating fewer elements in the generators, you can also switch to using Int instead of Integer as the type of the numbers. The default is Integer, but your numbers are small enough to fit in an Int.
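One way to do that, sketched here on a two-variable comprehension like the one shown earlier, is to give the function a type signature so the literals are not defaulted to Integer. The name `pairsBelow` is made up for this example. The Haskell Report only guarantees Int a range of at least -2^29 to 2^29-1, which is still plenty for squares of numbers up to 1000.

```haskell
-- The signature fixes every number in the comprehension to Int.
-- Without a signature (e.g. when typed straight into GHCi), the
-- literals would default to the arbitrary-precision Integer type.
pairsBelow :: Int -> [(Int, Int)]
pairsBelow n = [ (a, b) | b <- [1 .. n], a <- [1 .. b - 1] ]
```

The same effect can be had inline by annotating one literal, as in `[1 .. 1000 :: Int]`, since the other variables are then inferred to be Int through the arithmetic that connects them.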

(Also, to nitpick, Haskell list comprehensions have no speed. Haskell is a language definition with very little operational semantics. A particular Haskell implementation might have slow list comprehensions, though.)
