
Understanding pure functions in Haskell with IO

Given a Haskell value (edit per Rein Heinrich's comment) f:

f :: IO Int
f = ... -- ignoring its implementation

Quoting "Type-Driven Development with Idris":

The key property of a pure function is that the same inputs always produce the same result. This property is known as referential transparency.

Is f, and indeed every IO ... function in Haskell, pure? It seems to me that they are not, since lookInDatabase :: IO DBThing won't always return the same value:

  • at t=0, the DB might be down
  • at t=1, the DB might be up, and return MyDbThing would result

In short, is f (and IO ... functions in general) pure? If yes, then please correct my understanding, given my attempt to disprove the purity of f with my t=... examples.

IO is really a separate language, conceptually. It's the language of the Haskell RTS (runtime system). It's implemented in Haskell as a (relatively simple) embedded DSL whose "scripts" have the type IO a.

So Haskell functions that return values of type IO a are actually not the functions that are being executed at runtime; what gets executed is the IO a value itself. So these functions actually are pure, but their return values represent non-pure computations.
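To make the "script" idea concrete, here is a toy sketch. The Script type and the names echo, run, and simulate are invented for illustration; GHC's actual IO type is not implemented this way. The point is only that a script is an ordinary value, and executing it is a separate step.

```haskell
-- A toy "IO-like" language: a value of type Script a describes steps to take.
data Script a
  = Done a
  | ReadLine (String -> Script a)
  | WriteLine String (Script a)

-- Pure: echo is just a data structure describing "read a line, then write it".
echo :: Script ()
echo = ReadLine (\s -> WriteLine s (Done ()))

-- The stand-in for the runtime system: the only place where effects happen.
run :: Script a -> IO a
run (Done a)           = pure a
run (ReadLine k)       = getLine >>= run . k
run (WriteLine s next) = putStrLn s >> run next

-- A pure interpreter: feed canned input lines, collect the output lines.
-- When the input runs out, ReadLine is given the empty string.
simulate :: [String] -> Script a -> ([String], a)
simulate _   (Done a)           = ([], a)
simulate ins (ReadLine k)       = case ins of
  (i : rest) -> simulate rest (k i)
  []         -> simulate [] (k "")
simulate ins (WriteLine s next) =
  let (outs, a) = simulate ins next in (s : outs, a)

main :: IO ()
main = print (simulate ["hello"] echo)  -- prints (["hello"],())
```

Swapping simulate for run in main would perform the same script against real standard input and output.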

From a language design point of view, IO is a really elegant hack to keep the non-pure ugliness completely isolated while at the same time integrating it tightly into its pure surroundings, without resorting to special casing. In other words, the design does not solve the problems caused by impure IO, but it does a great job of at least not affecting the pure parts of your code.


The next step would be to look into FRP (functional reactive programming): with FRP you can make the layer that contains IO even thinner and move even more of the non-pure logic into pure logic.

You might also want to read John Backus' writings on functional programming, the limitations of the von Neumann architecture, and so on. Conal Elliott is also a name to search for if you're interested in the relationship between purity and IO.


PS: It's also worth noting that while IO relies heavily on monads to work around an aspect of lazy evaluation, and monads are a very nice way of structuring embedded DSLs (of which IO is just a single example), monads are much more general than IO. So try not to think about IO and monads in the same context too much: they are two separate things, and each could exist without the other.

First of all, you're right in noticing that I/O actions are not pure. They can't be. But purity of all functions is one of Haskell's promises, so what's happening?

Whether you like it or not, a function that evaluates to (often loosely said: "returns") an IO Something when applied to some arguments will always evaluate to the same IO Something when applied to the same arguments. The IO monad allows you to "hide" actions inside the container that the monad acts like. When you have an IO String, that value does not contain a String / [Char], but rather a sort of promise that you'll get that String somehow in the future. Thus, IO contains the information about what to do when the impure I/O action needs to be performed.
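A minimal sketch of this point, using a made-up greet function (not from the question): applying it only builds action values, and nothing is printed until an action becomes part of main.

```haskell
-- greet is a pure function: the same name always yields the same action.
greet :: String -> IO ()
greet name = putStrLn ("Hello, " ++ name)

-- Building a list of actions performs no output; these are just values.
actions :: [IO ()]
actions = map greet ["Ada", "Alan", "Grace"]

main :: IO ()
main = do
  -- Only actions sequenced into main are executed, in the order given here.
  sequence_ (reverse actions)
  -- Output:
  -- Hello, Grace
  -- Hello, Alan
  -- Hello, Ada
```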

After all, the only way for an IO action to be performed is for it to be named main, or to be a dependency of main. Because of the flexibility of monads, you can "concatenate" IO actions. A program like this (note: this code is not a good idea):

main = do
    input <- getLine
    putStrLn input

is syntactic sugar for:

main =
    getLine >>= (\input -> putStrLn input)

That states that main is the I/O action that reads a string from standard input and prints it to standard output, followed by a newline character. Did you see the magic? IO is just a wrapper representing what to do, in an impure context, to produce some given output; it is not the result of that operation, because that would require the Haskell language to admit impure code.

Think of it as a sort of recipe. If you have a recipe (read: IO monad) for a cake (read: Something in IO Something), you know how to make the cake, but you can't make the cake yourself (because you could ruin that masterpiece). Instead, the master chef (read: the most basic parts of the Haskell runtime system, responsible for running main) does the dirty work for you (read: doing impure/illegal stuff), and, best of all, he won't make any mistakes (read: breaking code purity)... unless the oven breaks, of course (read: System.IO.Error), but he knows how to clean that up (read: code will always remain pure).

This is one of the reasons that IO is an opaque type. Its implementation is somewhat controversial (until you read GHC's source code), and is better off left as implementation-defined.

Just be happy, because you've been illuminated by purity. A lot of programmers don't even know of Haskell's existence!

I hope this has shed some light on this for you!

Haskell is pulling a trick here. IO both is and isn't pure, depending on how you look at it.

On the "IO is pure" side, you've fallen into the very common error of thinking of a function returning an IO DBThing as if it were returning a DBThing. When someone claims that a function with type Stuff -> IO DBThing is pure, they are not saying that you can feed it the same Stuff and always get the same DBThing; as you correctly note, that is impossible, and also not very useful! What they're saying is that given a particular Stuff you'll always get back the same IO DBThing.

You actually can't get a DBThing out of an IO DBThing at all, so Haskell doesn't ever have to worry about the database containing different values (or being unavailable) at different times. All you can do with an IO DBThing is combine it with something else that needs a DBThing and produces some other kind of IO thing; the result of such a combination is an IO thing.
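As a rough sketch of what "combining" looks like in practice: the lookInDatabase below is a hypothetical stand-in that reads an environment variable (DB_RECORD is an invented name) instead of querying a real database, and describe is an invented pure function. The ordinary tools are fmap and (>>=), and both give back another IO value.

```haskell
import Data.Maybe (fromMaybe)
import System.Environment (lookupEnv)

newtype DBThing = DBThing String

-- Hypothetical stand-in for a real query: it consults the process
-- environment, which, like a database, can differ from one run to the next.
lookInDatabase :: IO DBThing
lookInDatabase = DBThing . fromMaybe "<no record>" <$> lookupEnv "DB_RECORD"

-- A pure function on the plain value.
describe :: DBThing -> String
describe (DBThing s) = "record: " ++ s

-- Combining with a pure function yields another IO value, not a String.
report :: IO String
report = fmap describe lookInDatabase

-- Combining with a function that itself produces an action.
printReport :: IO ()
printReport = lookInDatabase >>= \thing -> putStrLn (describe thing)

main :: IO ()
main = printReport
```

Both report and printReport are fixed values; what differs between runs is only what the runtime finds when it executes them.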

What Haskell is doing here is building up a correspondence between manipulation of pure Haskell values and changes that would happen out in the world outside the program. There are things you can do with some ordinary pure values that don't make any sense with impure operations like altering the state of a database. So using the correspondence between IO values and the outside world, Haskell simply doesn't provide you with any operations on IO values that would correspond to things that don't make sense in the real world.

There are several ways to explain how you're "purely" manipulating the real world. One is to say that IO is just like a state monad, only the state being threaded through is the entire world outside your program (so your Stuff -> IO DBThing function really has an extra hidden argument that receives the world, and actually returns a DBThing along with another world; it's always called with different worlds, and that's why it can return different DBThing values even when called with the same Stuff). Another explanation is that an IO DBThing value is itself an imperative program; your Haskell program is a totally pure function doing no IO, which returns an impure program that does IO, and the Haskell runtime system (impurely) executes the program it returns.
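The first metaphor can be sketched as a toy model. Everything here (World, WorldIO, andThen, and this version of lookInDatabase) is made up for illustration and is not an API that GHC exposes; it only shows how a hidden world argument lets a pure function give different answers at t=0 and t=1.

```haskell
-- A made-up, tiny "outside world".
data World = World { clock :: Int, dbUp :: Bool } deriving (Eq, Show)

-- An action is modelled as a pure function from a world to a result
-- and the next world.
newtype WorldIO a = WorldIO { runWorldIO :: World -> (a, World) }

-- Sequencing threads the world from one step to the next.
andThen :: WorldIO a -> (a -> WorldIO b) -> WorldIO b
andThen (WorldIO m) k = WorldIO $ \w ->
  let (a, w') = m w in runWorldIO (k a) w'

-- Hypothetical lookup: the result depends only on the world it is handed.
lookInDatabase :: WorldIO (Maybe String)
lookInDatabase = WorldIO $ \w ->
  (if dbUp w then Just "MyDbThing" else Nothing, w { clock = clock w + 1 })

lookTwice :: WorldIO (Maybe String, Maybe String)
lookTwice =
  lookInDatabase `andThen` \a ->
  lookInDatabase `andThen` \b ->
  WorldIO (\w -> ((a, b), w))

main :: IO ()
main = do
  print (fst (runWorldIO lookInDatabase (World 0 False)))    -- Nothing
  print (fst (runWorldIO lookInDatabase (World 1 True)))     -- Just "MyDbThing"
  print (clock (snd (runWorldIO lookTwice (World 0 True))))  -- 2
```

Same lookInDatabase value, different worlds, different results: the function itself never changes.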

But really these are both simply metaphors. The point is that the IO value simply has a very limited interface which doesn't allow you to do anything that doesn't make sense as a real world action.

Note that the concept of monad hasn't actually come into this. Haskell's IO system really doesn't depend on monads; Monad is just a convenient interface which is sufficiently limited that if you're only using the generic monad interface you also can't break the IO limitations (even if you don't know your monad is actually IO). Since the Monad interface is also interesting enough to write a lot of useful programs, the fact that IO forms a monad allows a lot of code that's useful on other types to be generically reused on IO.

Does this mean you actually get to write pure IO code? Not really. This is the "of course IO isn't pure" side of the coin. When you're using the fancy "combining IO functions together" approach, you still have to think about your program executing steps one after the other (or in parallel), affecting and being affected by outside conditions and systems; in short, exactly the same kind of reasoning you have to use to write IO code in an imperative language (only with a nicer type system than most of them). Making IO pure doesn't really help you banish impurity from the way you have to think about your code.

So what's the point? Well, for one, it gets us a compiler-enforced demarcation between code that can do IO and code that can't. If there's no IO tag on the type then impure IO isn't involved. That would be useful in any language just on its own. And the compiler knows this too; Haskell compilers can apply optimizations to non-IO code that would be invalid in most other languages, because there it's often impossible to know that a given section of code doesn't have side effects (unless you can see the full implementation of everything the code calls, transitively).

Also, because IO is pure, code analysis tools (including your brain) don't have to treat IO-code specially. If you can pick out a code transformation that would be valid on pure code with the same structure as the IO code, you can do it on the IO code. Compilers make use of this. Many transformations are ruled out by the structure that IO code must use (in order to stay within the bounds of things that have a sensible correspondence to things in the outside world) but they would also be ruled out by any pure code that used the same structure; the careful construction of the IO interface makes "execution order dependency" look like ordinary "data dependency", so you can just use the rules of data dependency to determine the rules of using IO.

Short answer: Yes, that f is referentially transparent.

Whenever you look at it, it equals the same value.
But that doesn't mean it will always bind the same value.
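A small sketch of that distinction, using a hypothetical counter-backed action in place of the question's f (the nextTicket helper is invented here): the name f always denotes the same action, yet each run of it can bind a different Int.

```haskell
import Data.IORef (IORef, newIORef, atomicModifyIORef')

-- Hypothetical helper: an action that increments a counter and returns
-- the new count.
nextTicket :: IORef Int -> IO Int
nextTicket ref = atomicModifyIORef' ref (\n -> (n + 1, n + 1))

main :: IO ()
main = do
  ref <- newIORef 0
  let f = nextTicket ref   -- f :: IO Int, one fixed value
  a <- f                   -- running f binds 1
  b <- f                   -- running the same f again binds 2
  print (a, b)             -- prints (1,2)
  -- Replacing f by its definition changes nothing about the program's meaning:
  c <- nextTicket ref
  print c                  -- prints 3
```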

In short, is f (and IO ... functions in general) pure?

So what you're really asking is:

Are IO definitions in Haskell pure?

You're really not going to like it.

Deep Thought.

We don't know.

From section 6.1.7 (page 75) of the Haskell 2010 report:

The IO type serves as a tag for operations (actions) that interact with the outside world. The IO type is abstract: no constructors are visible to the user. IO is an instance of the Monad and Functor classes.

the crucial point being:

The IO type is abstract

There is no "standard definition" of the IO type, so there's no way to determine if it's pure, let alone an expression of that type. We can't even provide a simple proof that IO is monadic (i.e. that it satisfies the monad laws), as return and (>>=) for IO cannot be defined in standard Haskell 2010.

To get some idea of how this affects the determination of various IO-related properties, see:

So when you next hear or read about Haskell being "referentially transparent" or "purely functional", you now know that (at least for I/O) these are just conjectures: with no actual standard definition, there's no way to prove or disprove them.

(If you're now wondering how Haskell got into this state, I provide some more details here.)
