
Haskell pattern matching - what is it?

What is pattern matching in Haskell, and how is it related to guarded equations?

I've tried looking for a simple explanation, but I haven't found one.

EDIT: Someone tagged this as homework. I don't go to school anymore; I'm just learning Haskell and trying to understand this concept, purely out of interest.

In a nutshell, patterns are like defining piecewise functions in math. You can specify different function bodies for different arguments using patterns. When you call a function, the appropriate body is chosen by comparing the actual arguments with the various argument patterns. Read A Gentle Introduction to Haskell for more information.

Compare:

fib(n) = 1                      if n = 0
         1                      if n = 1
         fib(n-1) + fib(n-2)    if n ≥ 2

with the equivalent Haskell:

fib 0 = 1
fib 1 = 1
fib n | n >= 2 
      = fib (n-1) + fib (n-2)

Note that the "n ≥ 2" in the piecewise function becomes a guard in the Haskell version, but the other two conditions are simply patterns. Patterns are conditions that test values and structure, such as x:xs, (x, y, z), or Just x. In a piecewise definition, conditions based on = or ∈ relations (basically, the conditions that say something "is" something else) become patterns. Guards allow for more general conditions. We could rewrite fib to use guards:

fib n | n == 0 = 1
      | n == 1 = 1
      | n >= 2 = fib (n-1) + fib (n-2)
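The structural patterns mentioned above (x:xs, (x, y, z), and Just x) can be sketched roughly as follows. The function names are hypothetical and chosen only for illustration:

```haskell
-- Hypothetical helper names, used only to illustrate each pattern.

-- x:xs matches a non-empty list, binding its head and tail.
firstOr :: a -> [a] -> a
firstOr def []    = def
firstOr _   (x:_) = x

-- (x, y, z) matches a 3-tuple and binds its three components.
sumTriple :: (Int, Int, Int) -> Int
sumTriple (x, y, z) = x + y + z

-- Just x matches a Maybe that holds a value; Nothing matches the empty case.
orZero :: Maybe Int -> Int
orZero (Just x) = x
orZero Nothing  = 0
```

Each equation both tests the shape of the argument and names the pieces it finds, which is something a guard alone cannot do.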

There are other good answers, so I'm going to give you a very technical answer. Pattern matching is the elimination construct for algebraic data types:

  • "Elimination construct" means "how to consume or use a value".

  • "Algebraic data type", in addition to first-class functions, is the big idea in a statically typed functional language like Clean, F#, Haskell, or ML.

The idea of algebraic data types is that you define a type of thing, and you say all the ways you can make that thing. As an example, let's define "Sequence of String" as an algebraic data type, with three ways to make it:

data StringSeq = Empty                    -- the empty sequence
               | Cat StringSeq StringSeq  -- two sequences in succession
               | Single String            -- a sequence holding a single element

Now, there are all sorts of things wrong with this definition, but as an example it's interesting because it provides constant-time concatenation of sequences of arbitrary length. (There are other ways to achieve this.) The declaration introduces Empty, Cat, and Single, which are all the ways there are of making sequences. (That makes each one an introduction construct: a way to make things.)

  • You can make an empty sequence without any other values.
  • To make a sequence with Cat, you need two other sequences.
  • To make a sequence with Single, you need an element (in this case a string).
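As a small sketch of the three introduction constructs in use, the following builds a few sequences. The value names and the append helper are hypothetical, and the deriving clause is an addition made only so the values can be printed and compared:

```haskell
data StringSeq = Empty                    -- the empty sequence
               | Cat StringSeq StringSeq  -- two sequences in succession
               | Single String            -- a sequence holding a single element
  deriving (Show, Eq)  -- added for this sketch only

-- Single needs one string.
hello :: StringSeq
hello = Single "hello"

-- Cat needs two other sequences; Empty needs nothing at all.
greeting :: StringSeq
greeting = Cat hello (Cat (Single "world") Empty)

-- Concatenation just wraps its two arguments in Cat and never walks
-- them, which is why it takes constant time.
append :: StringSeq -> StringSeq -> StringSeq
append = Cat
```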

Here comes the punch line: the elimination construct, pattern matching, gives you a way to scrutinize a sequence and ask it the question "what constructor were you made with?" Because you have to be prepared for any answer, you provide at least one alternative for each constructor. Here's a length function:

slen :: StringSeq -> Int
slen s = case s of Empty -> 0
                   Cat s s' -> slen s + slen s'
                   Single _ -> 1

At the core of the language, all pattern matching is built on this case construct. However, because algebraic data types and pattern matching are so important to the idioms of the language, there's special "syntactic sugar" for doing pattern matching in the declaration form of a function definition:

slen Empty = 0
slen (Cat s s') = slen s + slen s'
slen (Single _) = 1

With this syntactic sugar, computation by pattern matching looks a lot like definition by equations. (The Haskell committee did this on purpose.) And as you can see in the other answers, it is possible to specialize either an equation or an alternative in a case expression by slapping a guard on it. I can't think of a plausible guard for the sequence example, and there are plenty of examples in the other answers, so I'll leave it there.
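For what it's worth, here is one admittedly contrived way a guard could be attached to an alternative in a case expression for the sequence type. The function nonEmptyCount is a hypothetical variant of slen, not something from the answer above:

```haskell
data StringSeq = Empty
               | Cat StringSeq StringSeq
               | Single String

-- Hypothetical variant of slen that counts only the non-empty strings.
nonEmptyCount :: StringSeq -> Int
nonEmptyCount s = case s of
  Empty   -> 0
  Cat l r -> nonEmptyCount l + nonEmptyCount r
  Single str
    | null str  -> 0   -- guard: the element is the empty string
    | otherwise -> 1   -- guard: any other string counts once
```

The pattern Single str decides which constructor was used, and the guards then refine that alternative with an ordinary Boolean test.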

Pattern matching is, at least in Haskell, deeply tied to the concept of algebraic data types. When you declare a data type like this:

data SomeData = Foo Int Int
              | Bar String
              | Baz

...it defines Foo, Bar, and Baz as constructors (not to be confused with "constructors" in OOP) that construct a SomeData value out of other values.

Pattern matching is nothing more than doing this in reverse: a pattern "deconstructs" a SomeData value into its constituent pieces (in fact, I believe that pattern matching is the only way to extract values in Haskell).

When there are multiple constructors for a type, you write multiple versions of a function, one for each pattern, with the correct one being selected depending on which constructor was used (assuming you've written patterns to match all possible constructions, which is generally good practice).
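A minimal sketch of such a function for SomeData might look like this, with one equation per constructor. The name describe and its output strings are hypothetical:

```haskell
data SomeData = Foo Int Int
              | Bar String
              | Baz

-- describe is a hypothetical function with one equation per constructor.
describe :: SomeData -> String
describe (Foo x y) = "Foo holding " ++ show x ++ " and " ++ show y
describe (Bar s)   = "Bar holding " ++ s
describe Baz       = "Baz holding nothing"
```

For example, describe (Foo 1 2) selects the first equation and binds x to 1 and y to 2. If one of the equations were missing, GHC could report the gap at compile time when warnings such as -Wall are enabled.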

In a functional language, pattern matching involves checking an argument against different forms. A simple example involves recursively defined operations on lists. I will use OCaml to explain pattern matching since it's my functional language of choice, but the concepts are the same in F# and Haskell, AFAIK.

Here is the definition of a function to compute the length of a list lst. In OCaml, an 'a list is defined recursively as the empty list [], or the structure h::t, where h is an element of type 'a ('a being any type we want, such as an integer or even another list), t is a list (hence the recursive definition), and :: is the cons operator, which creates a new list out of an element and a list.

So the function would look like this:

let rec len lst =
  match lst with
    [] -> 0
  | h :: t -> 1 + len t

rec is a modifier that tells OCaml that a function will call itself recursively. Don't worry about that part. The match expression is what we're focusing on. OCaml will check lst against the two patterns (the empty list, or h :: t) and return a different value based on that. Since we know every list will match one of these patterns, we can rest assured that our function will return safely.

Note that even though these two patterns will take care of all lists, you aren't limited to them. A pattern like h1 :: h2 :: t (matching all lists of length 2 or more) is also valid.
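Since the question is about Haskell, here is a rough Haskell rendering of the same ideas, given as a sketch rather than a literal translation. The name len is kept from the OCaml version, and hasTwoOrMore is a hypothetical name:

```haskell
-- Haskell counterpart of the OCaml len function.
len :: [a] -> Int
len []    = 0
len (_:t) = 1 + len t

-- Counterpart of the pattern h1 :: h2 :: t, which matches lists with
-- at least two elements. hasTwoOrMore is a hypothetical name.
hasTwoOrMore :: [a] -> Bool
hasTwoOrMore (_:_:_) = True
hasTwoOrMore _       = False
```

The wildcard _ stands in for the pieces whose values are not needed.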

Of course, the use of patterns isn't restricted to recursively defined data structures, or recursive functions. Here is a (contrived) function to tell you whether a number is 1 or 2:

let is_one_or_two num =
  match num with
    1 -> true
  | 2 -> true
  | _ -> false

In this case, the forms of our pattern are the numbers themselves. _ is a special catch-all used as a default case, in case none of the above patterns match.
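The same contrived function could be written in Haskell along these lines (isOneOrTwo is simply the OCaml name adapted to Haskell naming conventions):

```haskell
-- Literal patterns match exact values; _ is the catch-all default.
isOneOrTwo :: Int -> Bool
isOneOrTwo 1 = True
isOneOrTwo 2 = True
isOneOrTwo _ = False
```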

Pattern matching is one of those painful operations that is hard to get one's head around if you come from a procedural programming background. I find it hard to get into because the same syntax used to create a data structure can be used for matching.

In F# you can use the cons operator :: to add an element to the beginning of a list like so:

let a = 1 :: [2;3]
//val a : int list = [1; 2; 3]

Similarly, you can use the same operator to split the list up like so:

let a = [1;2;3];;
match a with
    | [a;b] -> printfn "List contains 2 elements" // matches a list with exactly 2 elements
    | a::tail -> printfn "Head is %d" a // matches any other non-empty list
    | [] -> printfn "List is empty" // matches an empty list
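For comparison, a Haskell sketch of the same match could look like the following. The name describeList is hypothetical, and it returns the message instead of printing it:

```haskell
-- Equations are tried from top to bottom, so the two-element pattern
-- is checked before the more general (x : _) pattern.
describeList :: [Int] -> String
describeList [_, _]  = "List contains 2 elements"
describeList (x : _) = "Head is " ++ show x
describeList []      = "List is empty"
```

With the list [1, 2, 3] the first pattern fails, because the list has three elements, and the second one succeeds with x bound to 1.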


 