
Verbose List Comprehension in Haskell

I'm using Haskell to find the integers from 1 to 10000 that have a special property. I do the following:

[ number | number <- [1..10000], (isSpecial number)]

However, every now and then I come up with special properties that are

  1. hard to satisfy

  2. slow to verify

As a result, the program hangs after producing the first few results.

I wonder how I can make the list comprehension verbose, so that I get regular updates on how far Haskell has progressed.

This is more or less what Robin Zigmond meant:

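import Control.Monad (filterM)

checkNumbers :: IO [Int]
checkNumbers = filterM check [1..10000]
    where
        check number = do
            -- putStrLn rather than print: print would show the string in quotes
            putStrLn $ "Checking number " <> show number
            pure $ isSpecial number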

This will print "Checking number x" before checking every number. Feel free to experiment with any other effects (or, in your words, "verbosity") within the check function.
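As a possible variation on the filterM idea, here is a minimal sketch that reports only every so often and announces each hit as soon as it is found. It writes the messages to stderr so that the final list on stdout stays clean. The names verboseFilter and step, and the placeholder isSpecial, are assumptions made up for illustration.

```haskell
import Control.Monad (filterM, when)
import System.IO (hPutStrLn, stderr)

-- Placeholder predicate (an assumption for illustration); substitute your own.
isSpecial :: Int -> Bool
isSpecial n = n `mod` 1000 == 7

-- Filter in IO, logging every multiple of `step` and every hit to stderr.
verboseFilter :: Int -> (Int -> Bool) -> [Int] -> IO [Int]
verboseFilter step p = filterM check
  where
    check n = do
      when (n `mod` step == 0) $
        hPutStrLn stderr ("Reached " <> show n)
      let ok = p n
      when ok $ hPutStrLn stderr ("Found " <> show n)
      pure ok

main :: IO ()
main = verboseFilter 2500 isSpecial [1 .. 10000] >>= print
```

A larger step gives quieter output; a step of 1 behaves like the version above.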

Here is a way that requires no IO, instead relying on laziness and on your guess, as the programmer, about which "side" of the condition happens more often. Just to have something to play with, here's a slightly slow function that checks if a number is a multiple of 10. The details of this function aren't important; feel free to skip it if anything doesn't make sense. I'm also going to turn on timing reporting; you'll see why later.

> isSpecial :: Int -> Bool; isSpecial n = last [1..10000000] `seq` (n `mod` 10 == 0)
> :set +s

(Add one 0 every five years.)

Now the idea will be this: instead of your list comprehension, we'll use partition to split the list into two chunks, the elements that match the predicate and the ones that don't. We'll print whichever of those has more elements, so we can keep an eye on progress; by the time it's fully printed, the other one will be fully evaluated and we can inspect it however we like.

> :m + Data.List
> (matches, nonMatches) = partition isSpecial [1..20]
(0.00 secs, 0 bytes)
> nonMatches
[1,2,3,4,5,6,7,8,9,11,12,13,14,15,16,17,18,19]
(12.40 secs, 14,400,099,848 bytes)

Obviously I can't portray this over StackOverflow, but when I did the above, the numbers in the nonMatches list slowly appeared on my terminal one by one, giving a pretty good indicator of where in the list it was currently working. And now, when you print matches, the full list is available more or less instantly, as you can see from the timing report (i.e. not another 12-second wait):

> matches
[10,20]
(0.01 secs, 64,112 bytes)

But beware!

  1. It's important that matches and nonMatches have types which are not typeclass polymorphic (i.e. don't have types that start with Num a => ... or some other constraint). In the above example, I achieved this by making isSpecial monomorphic, which forces matches and nonMatches to be monomorphic too. If your isSpecial is polymorphic, you should give a type signature for matches or nonMatches to prevent this problem.

  2. Doing it this way will cause the entire nonMatches and matches lists to be realized in memory. This could be expensive if the original list being partitioned is very long. (But up to, say, a couple hundred thousand Ints is not particularly long for modern computers.)
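The same idea can be sketched as a compiled program. Two things here are assumptions beyond the GHCi session above: the helper name splitSpecial is made up, and the predicate is deliberately polymorphic to show where the type signature from caveat 1 would go. Note also that a compiled program usually buffers stdout (GHCi does not), so the sketch turns buffering off in order to see the elements appear one at a time.

```haskell
import Data.List (partition)
import System.IO (BufferMode (NoBuffering), hSetBuffering, stdout)

-- Placeholder predicate, deliberately polymorphic (an assumption for illustration).
isSpecial :: Integral a => a -> Bool
isSpecial n = n `mod` 10 == 0

-- The signature pins both result lists to [Int], so they are ordinary shared
-- values rather than class-polymorphic expressions recomputed at each use.
splitSpecial :: [Int] -> ([Int], [Int])
splitSpecial = partition isSpecial

main :: IO ()
main = do
  hSetBuffering stdout NoBuffering  -- show each element as soon as it is known
  let (matches, nonMatches) = splitSpecial [1 .. 20]
  print nonMatches  -- the longer list doubles as the progress display
  print matches     -- the predicate has already been run on every element
```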

Debug.Trace

You can have a look at Debug.Trace. It allows printing messages to the console. But because Haskell is lazy, controlling when the printing happens is not so easy, and it is not recommended for production code:

Prelude> import Debug.Trace
Prelude Debug.Trace> [x | x <- [1..10], traceShow (x, odd x) $ odd x]
(1,True)
[1(2,False)
(3,True)
,3(4,False)
(5,True)
,5(6,False)
(7,True)
,7(8,False)
(9,True)
,9(10,False)
]
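If a message for every element is too noisy, one option is a guard that is always True but traces only now and then. The following is a sketch under that assumption; the helper progress and the placeholder isSpecial are made-up names, and as with any use of Debug.Trace the exact interleaving of the messages with normal output is not guaranteed.

```haskell
import Debug.Trace (trace)

-- Placeholder predicate (an assumption for illustration); substitute your own.
isSpecial :: Int -> Bool
isSpecial n = n `mod` 1000 == 7

-- Always True; emits a trace message (on stderr) when evaluated on a
-- multiple of `step`.
progress :: Int -> Int -> Bool
progress step x
  | x `mod` step == 0 = trace ("reached " ++ show x) True
  | otherwise         = True

-- The progress guard comes first, so it is evaluated for every candidate.
specials :: [Int]
specials = [x | x <- [1 .. 10000], progress 2500 x, isSpecial x]

main :: IO ()
main = print specials
```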

We would usually want to see both the tried and the discovered numbers as the calculation goes on.

What I usually do is break up the input list into chunks of n elements, filter each chunk as you would the whole list, and convert each chunk into a pair of its head element and the filtered chunk:

import Data.List.Split (chunksOf)  -- from the split package

chunked_result = [ (h, [x | x <- chunk, isSpecial x])
                   | chunk@(h:_) <- chunksOf n input]

Putting such a result list through concatMap snd gives the original non-"verbose" result.

Adjusting the n value changes how often progress is "reported" when the result list is simply printed. The printout shows both the tried and the discovered numbers, with some inconsequential "noise" around them.
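The chunking approach can be sketched as a small standalone program. The local chunksOf below is only a stand-in for the one in Data.List.Split, so that the sketch needs nothing beyond base; the function name chunkedResult, the chunk size 10, and the placeholder isSpecial are assumptions for illustration.

```haskell
-- Local stand-in for Data.List.Split.chunksOf (split package), defined here
-- only so the sketch has no dependencies beyond base.
chunksOf :: Int -> [a] -> [[a]]
chunksOf k xs
  | k <= 0 || null xs = []
  | otherwise         = let (c, rest) = splitAt k xs in c : chunksOf k rest

-- Placeholder predicate (an assumption for illustration).
isSpecial :: Int -> Bool
isSpecial x = x `mod` 7 == 0

-- Pair each chunk's first element with the matches found in that chunk.
chunkedResult :: Int -> [Int] -> [(Int, [Int])]
chunkedResult k input =
  [ (h, [x | x <- chunk, isSpecial x]) | chunk@(h:_) <- chunksOf k input ]

main :: IO ()
main = do
  let result = chunkedResult 10 [1 .. 40]
  mapM_ print result            -- one line per chunk, printed as it is finished
  print (concatMap snd result)  -- the plain, non-verbose answer
```

With these inputs the per-chunk lines are (1,[7]), (11,[14]), (21,[21,28]) and (31,[35]), followed by [7,14,21,28,35].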

Applying second concat . unzip to the chunked result list is somewhat similar to Daniel Wagner's partitioning idea (with caveats), (*) but with your chosen value of n, not just 1.

If there is an algorithmic slowdown innate to your specific problem, apply orders-of-growth run time estimation analysis.
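One common reading of that advice is the empirical order of growth: time the computation at two input sizes and estimate the exponent b in t ~ c * n^b as log(t2/t1) / log(n2/n1). The following is only a sketch of that arithmetic; the function name empiricalOrder is made up, and the timings in main are hypothetical numbers, not measurements.

```haskell
-- Estimate b in  t ~ c * n^b  from two (size, seconds) measurements.
empiricalOrder :: (Double, Double) -> (Double, Double) -> Double
empiricalOrder (n1, t1) (n2, t2) = logBase (n2 / n1) (t2 / t1)

main :: IO ()
main =
  -- Hypothetical timings: doubling the range quadrupled the run time,
  -- which suggests roughly quadratic growth (an exponent of about 2).
  print (empiricalOrder (1000, 0.5) (2000, 2.0))
```

An estimate like this helps decide whether waiting for the full range of 1 to 10000 is realistic at all.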


(*) To make it compatible we need to stick a seq in the middle somewhere, like

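-- length s forces every isSpecial test in the chunk
-- (last s would raise an exception for a chunk with no matches)
chunked_result = [ (length s `seq` last chunk, s)
                   | chunk <- chunksOf n input
                   , let s = [x | x <- chunk, isSpecial x] ]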

or something.
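Put together, the footnote's variant might look like the sketch below. It uses length s rather than last s to force the filtering, so that a chunk with no matches does not raise an exception. The local chunksOf is a stand-in for the one in Data.List.Split, and chunkedResult, the chunk size 10, and the placeholder isSpecial are assumptions for illustration.

```haskell
import Data.Bifunctor (second)

-- Local stand-in for Data.List.Split.chunksOf (split package).
chunksOf :: Int -> [a] -> [[a]]
chunksOf k xs
  | k <= 0 || null xs = []
  | otherwise         = let (c, rest) = splitAt k xs in c : chunksOf k rest

-- Placeholder predicate (an assumption for illustration).
isSpecial :: Int -> Bool
isSpecial x = x `mod` 7 == 0

-- The first component is the last number of the chunk, available only after
-- the whole chunk has been filtered (length s forces every predicate test).
chunkedResult :: Int -> [Int] -> [(Int, [Int])]
chunkedResult k input =
  [ (length s `seq` last chunk, s)
  | chunk <- chunksOf k input
  , let s = [x | x <- chunk, isSpecial x] ]

main :: IO ()
main = do
  let (checked, found) = second concat (unzip (chunkedResult 10 [1 .. 40]))
  print checked  -- progress markers: [10,20,30,40]
  print found    -- matches, already computed: [7,14,21,28,35]
```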

