Haskell: Why does pattern matching work for types without being instances of equality?

I was wondering how pattern matching in Haskell works. I am aware of this thread, but could not quite understand the answers therein.

  • The answers state that types are matched by Boolean expressions, but how is this possible?
  • The other thing I took away is that pattern matching is more general than (==), and that Eq is implemented by means of pattern matching.

Can you tell me why match and match3 work even if I omit deriving (Eq) in the following code snippet? (It is clear why match2 fails.)

data MyType = TypeA | TypeB
            deriving (Eq)

match :: MyType -> String
match TypeA = "this is type A"
match TypeB = "this is type B"

match2 :: MyType -> String
match2 a | a == TypeA = "this is type A matched by equality"
         | a == TypeB = "this is type B matched by equality"
         | otherwise = "this is neither type A nor type B"

match3 :: MyType -> String
match3 a = case a of TypeA -> "this is type A matched by case expression"
                     TypeB -> "this is type B matched by case expression"

main :: IO ()
main = do
    (print . match) TypeA
    (print . match) TypeB
    (print . match2) TypeA
    (print . match2) TypeB
    (print . match3) TypeA
    (print . match3) TypeB

I just want to point out that data types and pattern matching are (to a first approximation) merely useful but redundant language features that can be implemented purely using lambda calculus. If you understand how to implement them in lambda calculus, you can understand why pattern matching does not need Eq.

Implementing data types in lambda calculus is known as "Church-encoding" them (after Alonzo Church, who demonstrated how to do this). Church-encoded functions are also said to be written in "continuation-passing style".

It's called "continuation-passing style" because instead of providing the value, you provide a function that will process the value. So for example, instead of giving a user an Int, I could instead give them a value of the following type:

type IndirectInt = forall x . (Int -> x) -> x

The above type is "isomorphic" to Int. "Isomorphic" is just a fancy way of saying that we can convert any IndirectInt into an Int:

fw :: IndirectInt -> Int
fw indirect = indirect id

... and we can convert any Int into an IndirectInt:

bw :: Int -> IndirectInt
bw int = \f -> f int

... such that:

fw . bw = id -- Exercise: Prove this
bw . fw = id -- Exercise: Prove this
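
The first of these equations can be checked by unfolding the definitions. The following is only a small sketch: roundTrip is a name made up for this illustration, and the RankNTypes extension is assumed to be available (as it is in GHC).

```haskell
{-# LANGUAGE RankNTypes #-}

-- An Int hidden behind a continuation: the only way to get at it
-- is to hand over a function that consumes an Int.
type IndirectInt = forall x. (Int -> x) -> x

-- Recover the Int by passing the identity function as the continuation.
fw :: IndirectInt -> Int
fw indirect = indirect id

-- Wrap an Int so that it feeds itself to whatever continuation arrives.
bw :: Int -> IndirectInt
bw int = \f -> f int

-- Hypothetical helper: one direction of the isomorphism, applied directly.
-- Unfolding: fw (bw n) = (bw n) id = (\f -> f n) id = id n = n
roundTrip :: Int -> Int
roundTrip n = fw (bw n)
```

The sketch applies fw to bw n directly rather than writing fw . bw, because the point-free form would, as far as I know, require instantiating the type of (.) at a polymorphic type, which GHC does not accept by default.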

Using continuation-passing style, we can convert any data type into a lambda-calculus term. Let's start with a simple type like:

data Either a b = Left a | Right b

In continuation-passing style, this would become:

type IndirectEither a b = forall x . (Either a b -> x) -> x

But Alonzo Church was smart and noticed that for any type with multiple constructors, we can just provide a separate function for each constructor. So in the case of the above type, instead of providing a function of type (Either a b -> x), we can instead provide two separate functions, one for the a and one for the b, and that would be just as good:

type IndirectEither a b = forall x . (a -> x) -> (b -> x) -> x
-- Exercise: Prove that this definition is isomorphic to the previous one
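
One way to convince yourself that the two-continuation encoding loses nothing is to write the conversions to and from the ordinary Either. This is a sketch only: toIndirect and fromIndirect are names invented here, the Prelude's Either is used instead of redefining it, and RankNTypes is assumed.

```haskell
{-# LANGUAGE RankNTypes #-}

-- One continuation per constructor of Either.
type IndirectEither a b = forall x. (a -> x) -> (b -> x) -> x

-- Hypothetical helper: each constructor picks "its" continuation.
toIndirect :: Either a b -> IndirectEither a b
toIndirect (Left a)  = \onLeft _  -> onLeft a
toIndirect (Right b) = \_ onRight -> onRight b

-- Hypothetical helper: rebuild the Either by passing the
-- constructors themselves as the two continuations.
fromIndirect :: IndirectEither a b -> Either a b
fromIndirect indirect = indirect Left Right
```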

What about a type like Bool, where the constructors have no arguments? Well, Bool is isomorphic to Either () () (Exercise: Prove this), so we can just encode it as:

type IndirectBool = forall x . (() -> x) -> (() -> x) -> x

And () -> x is just isomorphic to x (Exercise: Prove this), so we can further rewrite it as:

type IndirectBool = forall x . x -> x -> x

There are only two functions that can have the above type:

true :: a -> a -> a
true a _ = a

false :: a -> a -> a
false _ a = a

Because of the isomorphism, we can guarantee that all Church encodings will have as many implementations as there were possible values of the original data type. So it's no coincidence that there are exactly two functions that inhabit IndirectBool, just like there are exactly two constructors for Bool.

But how do we pattern-match on our IndirectBool? For example, with an ordinary Bool, we could just write:

expression1 :: a
expression2 :: a

case someBool of
    True  -> expression1
    False -> expression2

Well, our IndirectBool already comes with the tools to deconstruct itself. We would just write:

myIndirectBool expression1 expression2

Notice that if myIndirectBool is true, it will pick the first expression, and if it is false it will pick the second expression, just as if we had somehow pattern-matched on its value.
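
To see the two styles side by side, here is a minimal sketch. The names describe, describeIndirect, toBool and fromBool are made up for this illustration, and RankNTypes is assumed.

```haskell
{-# LANGUAGE RankNTypes #-}

type IndirectBool = forall x. x -> x -> x

true :: IndirectBool
true a _ = a

false :: IndirectBool
false _ a = a

-- Ordinary pattern matching on the built-in Bool ...
describe :: Bool -> String
describe b = case b of
  True  -> "yes"
  False -> "no"

-- ... and the same selection performed by the encoded value itself:
-- the two "branches" are simply passed as arguments.
describeIndirect :: IndirectBool -> String
describeIndirect b = b "yes" "no"

-- Hypothetical conversions witnessing the isomorphism with Bool.
toBool :: IndirectBool -> Bool
toBool b = b True False

fromBool :: Bool -> IndirectBool
fromBool True  = true
fromBool False = false
```

Nothing in describeIndirect compares anything with (==); the value selects a branch purely by position.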

Let's try to do the same thing with an IndirectEither. Using an ordinary Either, we'd write:

f :: a -> c
g :: b -> c

case someEither of
    Left  a -> f a
    Right b -> g b

With the IndirectEither, we'd just write:

someIndirectEither f g

In short, when we write types in continuation-passing style, the continuations are like the branches of a case expression, so all we are doing is passing each branch as an argument to the function.

This is the reason you don't need any sense of Eq to pattern-match on a type. In lambda calculus, the type decides what it is "equal" to, simply by defining which argument it selects out of the ones provided to it.

So if I'm a true, I prove I am "equal" to true by always selecting my first argument. If I'm a false, I prove I am "equal" to false by always selecting my second argument. In short, constructor "equality" boils down to "positional equality", which is always defined in a lambda calculus, and if we can distinguish one parameter as the "first" and another as the "second", that's all we need to have the ability to "compare" constructors.

First of all, match and match3 are actually exactly the same (ignoring the different strings): pattern matching in function definitions is desugared to a case expression.
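
As a small sketch of that desugaring, the version below drops the deriving clause entirely, so the type has no Eq instance, and both definitions still compile. matchDesugared is a name invented for this illustration.

```haskell
-- No `deriving (Eq)` here: MyType has no Eq instance at all.
data MyType = TypeA | TypeB

-- Clause-style pattern matching ...
match :: MyType -> String
match TypeA = "this is type A"
match TypeB = "this is type B"

-- ... behaves like a single clause whose body is a case expression.
matchDesugared :: MyType -> String
matchDesugared a = case a of
  TypeA -> "this is type A"
  TypeB -> "this is type B"
```

Adding a guard such as a == TypeA to either function would be rejected by the compiler here, because (==) needs an Eq instance that this type does not have.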

Next, pattern matching works on constructors rather than arbitrary values. Pattern matching is built into the language and does not depend on boolean expressions; it just acts on the constructors directly. This is most evident with more complex patterns whose constructors carry fields:

f :: MyType -> Int
f (A a) = a + 1
f (B a b) = a + b

How would you rewrite these patterns into boolean expressions? You can't (without knowing anything else about MyType).

Instead, the pattern matching just goes by constructor. Each pattern has to be headed by a constructor; you can't write a pattern like f (a b c) in Haskell. Then, when the function gets a value, it can look at the value's constructor and jump to the appropriate case immediately. This is the way it has to work for more complicated patterns (like A a), and it is also the way it works for the simple patterns you used.
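
To make the snippet with A and B runnable, a type declaration is needed. The one below is purely an assumption for illustration (the answer does not say what the type looks like), and it is named Payload here to avoid clashing with the question's MyType.

```haskell
-- Hypothetical declaration: A carries one Int, B carries two.
-- Note that there is no Eq instance.
data Payload = A Int | B Int Int

-- Each clause is selected by the constructor alone; the variables
-- a and b are then bound to the constructor's fields.
f :: Payload -> Int
f (A a)   = a + 1
f (B a b) = a + b
```

For example, f (A 1) takes the first clause and yields 2, while f (B 2 3) takes the second and yields 5.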

Since pattern matching just works by going to the appropriate constructor, it does not depend on Eq at all. Not only do you not have to have an Eq instance to pattern match, but having one also does not change how pattern matching behaves. For example, try this:

data MyType = TypeA | TypeB | TypeC

instance Eq MyType where 
  TypeA == TypeA = True
  TypeB == TypeC = True
  TypeC == TypeB = True
  _ == _         = False

match :: MyType -> String
match TypeA = "this is type A"
match TypeB = "this is type B"
match TypeC = "this is type C"

match2 :: MyType -> String
match2 a | a == TypeA = "this is type A matched by equality"
         -- TypeB == TypeC is True under the instance above, so TypeB lands here
         | a == TypeC = "this is type B matched by equality"
         -- TypeC == TypeB is True, so TypeC lands here
         | a == TypeB = "this is type C matched by equality"
         | otherwise = "this is neither type A nor type B"

Now you've defined equality such that TypeB is equal to TypeC but not to itself. (In real life, you should ensure that equality behaves reasonably and follows the reflexive property, but this is a toy example.) Now, if you use pattern matching, TypeB still matches TypeB and TypeC matches TypeC. But if you use your guard expressions, TypeB actually matches TypeC and TypeC matches TypeB. TypeA is unchanged between the two.
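
The difference can be boiled down to two small predicates. This is a sketch using the same deliberately odd Eq instance; isBByPattern and isBByEquality are names made up here.

```haskell
data MyType = TypeA | TypeB | TypeC

-- The same unreasonable instance as above: B and C are "equal" to
-- each other, but neither is equal to itself.
instance Eq MyType where
  TypeA == TypeA = True
  TypeB == TypeC = True
  TypeC == TypeB = True
  _     == _     = False

-- Decides by constructor; the Eq instance is never consulted.
isBByPattern :: MyType -> Bool
isBByPattern TypeB = True
isBByPattern _     = False

-- Decides by calling (==), so it inherits the odd instance.
isBByEquality :: MyType -> Bool
isBByEquality a = a == TypeB
```

With these definitions, isBByPattern TypeB is True while isBByEquality TypeB is False, and for TypeC the two answers are swapped.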

Moreover, note how the Eq instance was defined using pattern matching. When you use a deriving clause, it gets defined in a similar way with code generated at compile time. So pattern matching is more fundamental than Eq: it is part of the language, whereas Eq is just part of the standard library.

In summary: pattern matching is built into the language and works by comparing the constructor and then recursively matching on the rest of the value. Equality is usually implemented in terms of pattern matching and compares the whole value rather than just the constructor.
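
As a rough illustration of equality built on top of pattern matching, here is a hand-written instance for a made-up type Shape. It is meant to resemble what a deriving clause would produce, not to reproduce the compiler's actual output.

```haskell
-- Hypothetical type with fields, declared without `deriving (Eq)`.
data Shape = Circle Double | Rect Double Double

-- Pattern matching first checks that both sides use the same
-- constructor; only then are the fields compared with their own (==).
instance Eq Shape where
  Circle r1  == Circle r2  = r1 == r2
  Rect w1 h1 == Rect w2 h2 = w1 == w2 && h1 == h2
  _          == _          = False
```

The instance needs pattern matching to exist, but pattern matching on Shape never needs the instance.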

The thing you are missing is that constructors in Haskell can have arguments. The type tags "themselves" are comparable by equality (at least internally, behind the scenes), but the full values are only comparable if the constituent arguments are also comparable.

So if you have a type like

data Maybe a = Nothing | Just a

then even though you can test whether a type tag is "Nothing" or "Just" (i.e., pattern match on the Maybe value), in general you can't compare the full Maybe unless the value of type "a" that is being held by the Just also happens to be comparable.

--note that your first and third examples are
--just syntactic sugar for each other...
matchMaybe mb = case mb of
    Nothing -> "Got a Nothing"
    Just _  -> "Got a Just but ignored its value"

It should now also be clear why it's not possible to write a variation of match2 for Maybes. What would you use to test for equality in the Just case?

matchMaybe_2 mb | mb == Nothing = "Got a Nothing"
                | mb == Just {- ??? -} = "This case is impossible to write like this"
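
A sketch of how the two approaches differ in their types: the pattern-matching version works for any payload, while an equality-based version only type-checks under an Eq constraint and still cannot single out "some Just" by comparison alone. matchMaybeEq is a name invented here.

```haskell
-- Works for every a, including types with no Eq instance
-- (functions, for example).
matchMaybe :: Maybe a -> String
matchMaybe mb = case mb of
  Nothing -> "Got a Nothing"
  Just _  -> "Got a Just but ignored its value"

-- Hypothetical equality-based variant: it needs `Eq a`, and the Just
-- case can only be reached as "everything that is not Nothing".
matchMaybeEq :: Eq a => Maybe a -> String
matchMaybeEq mb
  | mb == Nothing = "Got a Nothing"
  | otherwise     = "Got a Just"
```

For instance, matchMaybe (Just ((+ 1) :: Int -> Int)) is accepted, whereas passing the same argument to matchMaybeEq is a type error because functions have no Eq instance.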

The way I think of it, pattern matching is basically bitwise equality. It's based on constructors, not some abstract notion of value.

Keep in mind, however, that you should think of, say, Int as

data Int = ... | -2 :: Int | -1 :: Int | 0 :: Int | 1 :: Int | 2 :: Int | ...

So in a way, each integer is its own constructor.

That's why you can match against an Int such as 2.
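
One caveat to this mental model: the Haskell Report defines matching against a numeric literal in terms of (==), so literal patterns are the one place where an Eq instance does get consulted. The sketch below makes that visible with a made-up type Weird whose equality is deliberately broken; nothing here comes from the original answers.

```haskell
-- Hypothetical wrapper type, used only to observe which operation
-- a literal pattern relies on.
newtype Weird = Weird Int

-- Deliberately broken equality: everything is "equal" to everything.
instance Eq Weird where
  _ == _ = True

-- Num instance so that integer literals can denote Weird values.
instance Num Weird where
  Weird a + Weird b = Weird (a + b)
  Weird a * Weird b = Weird (a * b)
  abs (Weird a)     = Weird (abs a)
  signum (Weird a)  = Weird (signum a)
  negate (Weird a)  = Weird (negate a)
  fromInteger       = Weird . fromInteger

-- The literal pattern 2 is tested with (==) against fromInteger 2,
-- not by inspecting a constructor.
isTwo :: Weird -> Bool
isTwo 2 = True
isTwo _ = False
```

Because the instance says everything is equal, isTwo (Weird 5) evaluates to True. Constructor patterns such as TypeA or Just _ are unaffected by this.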

Eq goes a bit further: it allows you to treat things as equal that may not be bitwise identical.

For example, you might have a binary tree that stores sorted elements. Say the following:

  A       A
 / \     / \
B   C   B   D
     \   \
      D   C

These two trees may be considered equal by Eq, because they contain the same elements, but you wouldn't be able to check for equality here using pattern matching.
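
A minimal sketch of such a tree, assuming equality is defined as "same elements in the same order". The Tree type, elems, sameShape and the two sample trees are all invented for this illustration.

```haskell
-- Hypothetical binary tree with values stored at the nodes.
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- In-order traversal: the elements from left to right.
elems :: Tree a -> [a]
elems Leaf         = []
elems (Node l x r) = elems l ++ [x] ++ elems r

-- Eq looks only at the contents, not at the shape.
instance Eq a => Eq (Tree a) where
  t1 == t2 = elems t1 == elems t2

-- Purely structural comparison, written with pattern matching alone.
sameShape :: Tree a -> Tree b -> Bool
sameShape Leaf Leaf                     = True
sameShape (Node l1 _ r1) (Node l2 _ r2) = sameShape l1 l2 && sameShape r1 r2
sameShape _ _                           = False

-- Two differently shaped trees holding 1, 2, 3.
chain, balanced :: Tree Int
chain    = Node Leaf 1 (Node Leaf 2 (Node Leaf 3 Leaf))
balanced = Node (Node Leaf 1 Leaf) 2 (Node Leaf 3 Leaf)
```

Here chain == balanced holds, while sameShape chain balanced does not.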

But in the case of numbers, bitwise equality is basically the same as logical equality (except perhaps positive and negative floating-point 0.0), so here Eq and pattern matching are pretty much equivalent.

For an analogy to C++, think of Eq as operator== and pattern matching as memcmp. You can compare a lot of types for equality simply using memcmp, but some you can't, if they can have different representations for "equal" values.
