hGetContents being too lazy
I have the following snippet of code, which I pass to withFile:
text <- hGetContents hand
let code = parseCode text
return code
Here hand is a valid file handle, opened with ReadMode, and parseCode is my own function that reads the input and returns a Maybe. As it is, the function fails and returns Nothing. If, instead, I write:
text <- hGetContents hand
putStrLn text
let code = parseCode text
return code
I get a Just, as I should.
If I do openFile and hClose myself, I have the same problem. Why is this happening? How can I cleanly solve it?
Thanks
hGetContents isn't too lazy, it just needs to be composed with other things appropriately to get the desired effect. Maybe the situation would be clearer if it were renamed exposeContentsToEvaluationAsNeededForTheRestOfTheAction or just listen.
withFile opens the file, does something (or nothing, as you please -- exactly what you require of it in any case), and closes the file.
It will hardly suffice to bring out all the mysteries of 'lazy IO', but consider now this difference in bracketing:
good file operation = withFile file ReadMode (hGetContents >=> operation >=> print)
bad file operation = (withFile file ReadMode hGetContents) >>= operation >>= print
-- *Main> good "lazyio.hs" (return . length)
-- 503
-- *Main> bad "lazyio.hs" (return . length)
-- 0
Crudely put, bad opens and closes the file before it does anything; good does everything in between opening and closing the file. Your first action was akin to bad. withFile should govern all of the action you want done that depends on the handle.
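A minimal sketch of what this means for the code in the question: do the parsing, and whatever consumes its result, inside the withFile callback. The real parseCode is not shown in the question, so the one below is a hypothetical stand-in, and processFile is a name invented here for illustration.

```haskell
import System.IO (IOMode (ReadMode), hGetContents, withFile)
import Text.Read (readMaybe)

-- Hypothetical stand-in for the asker's parser: succeeds only if the
-- whole input is a single integer (surrounding whitespace is allowed).
parseCode :: String -> Maybe Int
parseCode = readMaybe

-- Everything that depends on the handle happens inside the callback,
-- so the text is consumed by print before withFile closes the file.
processFile :: FilePath -> IO ()
processFile path = withFile path ReadMode $ \hand -> do
  text <- hGetContents hand
  print (parseCode text)
```

Returning parseCode text out of the callback instead of printing it would bring back the original problem, because nothing would have forced the text before the handle is closed.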
You don't need a strictness enforcer if you are working with String, small files, etc., just an idea of how the composition works. Again, in bad all I 'do' before closing the file is exposeContentsToEvaluationAsNeededForTheRestOfTheAction. In good I compose exposeContentsToEvaluationAsNeededForTheRestOfTheAction with the rest of the action I have in mind, then close the file.
The familiar length + seq trick mentioned by Patrick, or length + evaluate, is worth knowing; your second action with putStrLn text was a variant. But reorganization is better, unless lazy IO is wrong for your case.
$ time ./bad
bad: Prelude.last: empty list
-- no, lots of Chars there
real 0m0.087s
$ time ./good
'\n' -- right
()
real 0m15.977s
$ time ./seqing
Killed -- hopeless, attempting to represent the file contents
real 1m54.065s -- in memory as a linked list, before finding out the last char
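For reference, the length + evaluate trick mentioned above might look like the following when the forcing is done inside withFile. This is only a sketch; readFileStrict is a name made up here, not a library function.

```haskell
import Control.Exception (evaluate)
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Read a whole file as a String, forcing every character before
-- withFile closes the handle. Only sensible for small files, since the
-- entire contents end up in memory as a linked list of Char.
readFileStrict :: FilePath -> IO String
readFileStrict path = withFile path ReadMode $ \hand -> do
  text <- hGetContents hand
  _ <- evaluate (length text)  -- walks the list, pulling in all input
  return text
```

As the seqing timing shows, this approach is hopeless for very large inputs, which is why restructuring the action is usually the better fix.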
It goes without saying that ByteString and Text are worth knowing about, but reorganization with evaluation in mind is better, since even with them the Lazy variants are often what you need, and they then involve grasping the same distinctions between forms of composition. If you are dealing with one of the (immense) class of cases where this sort of IO is inappropriate, take a look at enumerator, conduit and co., all wonderful.
hGetContents uses lazy IO; it only reads from the file as you force more of the string, and it only closes the file handle when you evaluate the entire string it returns. The problem is that you're enclosing it in withFile; instead, just use openFile and hGetContents directly (or, more simply, readFile). The file will still get closed once you fully evaluate the string. Something like this should do the trick, to ensure that the file is fully read and closed immediately by forcing the entire string beforehand:
import Control.Exception (evaluate)

readCode :: FilePath -> IO Code
readCode fileName = do
    text <- readFile fileName
    -- Force the whole string so the file is fully read and closed here.
    _ <- evaluate (length text)
    return (parseCode text)
Unintuitive situations like this are one of the reasons people tend to avoid lazy IO these days, but unfortunately you can't change the definition of hGetContents. A strict IO version of hGetContents is available in the strict package, but it's probably not worth depending on the package just for that one function.
If you want to avoid the overhead that comes from traversing the string twice here, then you should probably look into using a more efficient type than String anyway; the Text type has strict IO equivalents for much of the String-based IO functionality, as does ByteString (if you're dealing with binary data, rather than Unicode text).
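As a rough illustration of the strict Text route, assuming the text package is available (it ships with GHC), something like the following reads the file in one go. countLines is just an illustrative name chosen here.

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- Data.Text.IO.readFile reads the entire file strictly, so no forcing
-- trick is needed before the contents are used.
countLines :: FilePath -> IO Int
countLines path = do
  contents <- TIO.readFile path
  return (length (T.lines contents))
```

The same shape works with Data.ByteString.readFile when the data is binary rather than text.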
You can force the contents of text to be evaluated using
length text `seq` return code
as the last line.