Are regular Haskell algebraic data types equivalent to context-free grammars? What about GADTs?

The syntax for algebraic data types is very similar to the syntax of Backus–Naur Form, which is used to describe context-free grammars. That got me thinking: if we think of the Haskell type checker as a parser for a language that is represented as an algebraic data type (with nullary type constructors representing the terminal symbols, for example), is the set of all languages accepted the same as the set of context-free languages? Also, with this interpretation, what set of formal languages can GADTs accept?

First of all, data types do not always describe a set of strings (i.e., a language). That is, while a list type does, a tree type does not. One might counter that we could "flatten" the trees into lists and think of that as their language. Yet, what about data types like

data F = F Int (Int -> Int)

or, worse,

data R = R (R -> Int)

?

Polynomial types (types without -> inside) roughly describe trees, which can be flattened (visited in order), so let's use those as an example.

As you have observed, writing a CFG as a (polynomial) type is easy, since you can exploit recursion:

data A = A1 Int A | A2 Int B
data B = B1 Int B Char | B2

Above, A expresses { Int^m Char^n | m > n }: A emits at least one Int before handing over to B, and B emits Int and Char in matched pairs.
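To make the flattening concrete, here is a minimal sketch. The two data declarations are the ones from the answer; the `Token` type, the functions `flattenA` and `flattenB`, and the value `example` are illustrative names that do not appear in the original.

```haskell
-- Tokens record only which kind of field was visited (hypothetical helper type).
data Token = TInt | TChar deriving (Eq, Show)

data A = A1 Int A | A2 Int B
data B = B1 Int B Char | B2

-- Visit the fields left to right, emitting one token per Int or Char field.
flattenA :: A -> [Token]
flattenA (A1 _ a) = TInt : flattenA a
flattenA (A2 _ b) = TInt : flattenB b

flattenB :: B -> [Token]
flattenB (B1 _ b _) = TInt : flattenB b ++ [TChar]
flattenB B2         = []

-- A sample term with three Int fields and one Char field (m = 3, n = 1).
example :: A
example = A1 0 (A2 1 (B1 2 B2 'x'))
```

Evaluating `flattenA example` in GHCi should give three `TInt` tokens followed by one `TChar`, an instance of Int^m Char^n with m > n.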

GADTs go far beyond context-free languages.

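{-# LANGUAGE GADTs #-}

-- Type-level Peano naturals
data Z
data S n

-- Lists whose length n is tracked in the type
data ListN a n where
  L1 :: ListN a Z
  L2 :: a -> ListN a n -> ListN a (S n)

-- Symbol types
data A
data B
data C

-- All three lists must share the same length n
data ABC where
  ABC :: ListN A n -> ListN B n -> ListN C n -> ABC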

Above, ABC expresses the (flattened) language A^n B^n C^n, which is not context-free.
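As a rough illustration of how the shared index n forces equal lengths, the sketch below repeats the declarations and adds a flattening function. It deviates from the answer in one respect, which is an assumption made only so that sample values can be written: each symbol type gets a single constructor instead of being empty. The names `lengthN`, `flattenABC` and `sample` are hypothetical.

```haskell
{-# LANGUAGE GADTs #-}

data Z
data S n

data ListN a n where
  L1 :: ListN a Z
  L2 :: a -> ListN a n -> ListN a (S n)

-- Assumption: one constructor per symbol type, so values can be built.
data A = A
data B = B
data C = C

data ABC where
  ABC :: ListN A n -> ListN B n -> ListN C n -> ABC

-- Length of a length-indexed list, ignoring the elements.
lengthN :: ListN a n -> Int
lengthN L1       = 0
lengthN (L2 _ r) = 1 + lengthN r

-- Flatten to a string over the alphabet {a, b, c}.
flattenABC :: ABC -> String
flattenABC (ABC xs ys zs) =
  replicate (lengthN xs) 'a'
    ++ replicate (lengthN ys) 'b'
    ++ replicate (lengthN zs) 'c'

sample :: ABC
sample = ABC (L2 A (L2 A L1)) (L2 B (L2 B L1)) (L2 C (L2 C L1))
```

Here `flattenABC sample` yields "aabbcc". A term with mismatched lengths, such as `ABC (L2 A L1) L1 L1`, should be rejected by the type checker because `S Z` and `Z` do not unify.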

You are pretty much unrestricted with GADTs, since it's easy to encode arithmetic with them. That is, you can build a type Plus a b c which is non-empty iff c = a + b with Peano naturals. You can also build a type Halt n m which is non-empty iff the Turing machine n halts on input m. So, you can build a language

{ A^n B^m proof | n halts on m, and proof proves it }

which is recursive (and, roughly, not in any simpler class).
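The Plus type mentioned above could be written roughly as follows. This is one possible encoding, not necessarily the one the answer had in mind: the constructor names `PlusZ` and `PlusS`, the helper `steps`, and the value `twoPlusOne` are assumptions, and "non-empty" is meant up to `undefined`, which inhabits every Haskell type.

```haskell
{-# LANGUAGE GADTs #-}

data Z
data S n

-- A value of type Plus a b c is a derivation that a + b = c on Peano naturals.
data Plus a b c where
  PlusZ :: Plus Z b b                          -- 0 + b = b
  PlusS :: Plus a b c -> Plus (S a) b (S c)    -- a + b = c implies (a+1) + b = c+1

-- A derivation of 2 + 1 = 3.
twoPlusOne :: Plus (S (S Z)) (S Z) (S (S (S Z)))
twoPlusOne = PlusS (PlusS PlusZ)

-- Number of PlusS steps in a derivation, i.e. the first summand as an Int.
steps :: Plus a b c -> Int
steps PlusZ     = 0
steps (PlusS p) = 1 + steps p
```

Changing the last index of `twoPlusOne` to anything other than three should make the definition fail to type check, which is the sense in which the type checker verifies the arithmetic.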

At the moment, I do not know whether you can describe recursively enumerable (computably enumerable) languages in GADTs. Even in the halting problem example, I have to include the "proof" term inside the GADT to make it work.

Intuitively, if you have a string of length n and you want to check it against a GADT, you can build all the GADT terms of depth n, flatten them, and then compare them to the string. This should prove that such a language is always recursive. However, existential types make this tree-building approach quite tricky, so I do not have a definite answer right now.
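For the simple polynomial case, the enumerate-and-compare idea can be sketched as below. This is only an illustration under simplifying assumptions: it uses the earlier A/B grammar with the Int and Char payloads dropped, so that there are finitely many terms per depth, and it does not address GADTs or existential types. All names (`Tok`, `termsA`, `termsB`, `accepts`) are hypothetical.

```haskell
data Tok = I | C deriving (Eq, Show)

-- The earlier grammar with field payloads dropped (a simplification).
data A = A1 A | A2 B
data B = B1 B | B2

flattenA :: A -> [Tok]
flattenA (A1 a) = I : flattenA a
flattenA (A2 b) = I : flattenB b

flattenB :: B -> [Tok]
flattenB (B1 b) = I : flattenB b ++ [C]
flattenB B2     = []

-- All terms whose constructor nesting depth is at most d.
termsA :: Int -> [A]
termsA d
  | d <= 0    = []
  | otherwise = map A1 (termsA (d - 1)) ++ map A2 (termsB (d - 1))

termsB :: Int -> [B]
termsB d
  | d <= 0    = []
  | otherwise = B2 : map B1 (termsB (d - 1))

-- Every constructor except B2 emits at least one token, so any term that
-- flattens to s has depth at most length s + 1. Enumerate and compare.
accepts :: [Tok] -> Bool
accepts s = any ((== s) . flattenA) (termsA (length s + 1))
```

With these definitions, `accepts [I, I, C]` holds while `accepts [I, C]` does not, matching the condition m > n. The depth bound is specific to this grammar; a general procedure would need a different argument for why the search terminates.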
