
What are row types? Are they algebraic data types?

I often hear that F# lacks support for OCaml's row types, which makes OCaml more powerful than F#.

What are they? Are they algebraic data types, such as sum types (discriminated unions) or product types (tuples, records)? And is it possible to write row types in other dialects, such as F#?

First of all, we need to fix the terminology. There is no such thing as a "row type", at least in type theory and especially in the type system of OCaml. There exists "row polymorphism", and we will discuss it below [0].

Row polymorphism is a form of polymorphism. OCaml provides two kinds of polymorphism, parametric and row, and lacks the other two, ad hoc and inclusion (aka subtyping) [1].

First of all, what is polymorphism? In the context of type systems, polymorphism allows a single term to have several types. The problem here is that the word "type" itself is heavily overloaded in the computer science and programming language community. So, to minimize confusion, let's reintroduce it here to be on the same page [2]. The type of a term usually denotes some approximation of the term's semantics, where the semantics could be as simple as a set of values equipped with a set of operations, or something more complex, like effects, annotations, and arbitrary theories. In general, semantics denotes the set of all possible behaviors of a term.

A type system denotes a set of rules that allows some language constructs and disallows others based on their types, i.e., it verifies that compositions of terms behave correctly. For example, if there is a function application construct in a language, the type system will allow an application only to those arguments whose types match the types of the parameters. And that's where polymorphism comes into play. In monomorphic type systems, this match can only be one-to-one, i.e., literal. Polymorphic type systems provide mechanisms to specify some regular expression that will match a family of types. So, different kinds of polymorphism are simply different kinds of regular expressions that you may use to denote a family of types.
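As a small illustration of the literal versus family matching described above, here is a minimal sketch. The names succ_int and twice are made up for this example and are not part of the original answer.

```ocaml
(* Monomorphic: the parameter type matches exactly one type, int. *)
let succ_int (x : int) : int = x + 1

(* Polymorphic: 'a matches a whole family of types, so the same term
   can be applied at int, string, and so on. *)
let twice (f : 'a -> 'a) (x : 'a) : 'a = f (f x)

let two = twice succ_int 0
let shout = twice String.uppercase_ascii "ab"
```

Applying succ_int to a string is rejected by the type checker, while twice accepts any function whose argument and result types agree.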

Now let's look at different kinds of polymorphism from this perspective. For example, parametric polymorphism is like a dot in regular expressions. E.g., 'a list is . list - that means we match literally with list, and the parameter of the list type could be any type. Row polymorphism is a star operator, e.g., <quacks : unit; ..> is the same as <quacks : unit; .*>. It means that it matches any type that quacks and does whatever else [3]. Speaking of nominal subtyping, in this case we have nominal classes (aka character classes in regexp), and we specify a family of types with the name of their base class. E.g., duck is like [:duck:]*, and any value that is properly registered as a member of the class matches this type (via class inheritance and the new operator) [4]. Finally, ad-hoc polymorphism is in fact also nominal and maps to character classes in regular expressions. The main difference here is that the notion of type in ad-hoc polymorphism is applied not to a value, but rather to a name. So a name, like a function name or the + operator, may have multiple definitions (implementations) that should be statically registered using some language mechanism (e.g., overloading an operator, implementing a method, etc.). So, ad-hoc polymorphism is just a special case of nominal subtyping.
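To make the dot and star analogy concrete, here is a minimal OCaml sketch. The names count, call_quacks, quack_log, duck, and robot are invented for illustration only.

```ocaml
(* Parametric polymorphism, the "dot": 'a stands for any element type. *)
let count (xs : 'a list) : int = List.length xs

(* Row polymorphism, the "star": the ".." stands for any further methods. *)
let call_quacks (d : < quacks : unit; .. >) : unit = d#quacks

(* A log so that the effect of each call can be observed. *)
let quack_log : string list ref = ref []

(* Two unrelated objects: both quack, each does something else as well. *)
let duck = object
  method quacks = quack_log := "duck" :: !quack_log
  method walks = ()
end

let robot = object
  method quacks = quack_log := "robot" :: !quack_log
  method charge = 100
end

let () =
  call_quacks duck;
  call_quacks robot
```

Both calls type-check because each object's type is an instance of < quacks : unit; .. >; the extra methods walks and charge are absorbed by the "..".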

Now that the terminology is clear, we can discuss what row polymorphism gives us. Row polymorphism is a feature of structural type systems (also known as duck typing in dynamically typed languages), as contrasted with nominal type systems, which provide subtyping polymorphism. In general, as we discussed above, it allows us to specify a type as "anything that quacks" as opposed to "anything that implements the IDuck interface". So yes, you can, of course, do the same with nominal typing by defining the duck interface and explicitly registering all implementations as instances of this interface using some inherit or implements mechanism. But the main problem here is that your hierarchy is sealed, i.e., you need to change your code to register an implementation in a newly created interface. That breaks the open/closed principle and hampers code reuse. Another problem with nominal subtyping is that unless your hierarchy forms a lattice (i.e., for any two classes there is always a least upper bound), you can't implement type inference on it [5].
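The "anything that quacks" idea can be sketched as follows. The classes mallard and rubber_duck and the function make_noise are hypothetical names chosen for this example; the point is that make_noise is written after the classes and neither class has to be edited or registered anywhere.

```ocaml
(* Two classes that share no base class and no declared interface. *)
class mallard = object
  method quack = "Quack"
  method fly = "flap"
end

class rubber_duck = object
  method quack = "Squeak"
  method squeeze = "pfff"
end

(* Written later, without touching either class.
   The inferred type is < quack : string; .. > -> string. *)
let make_noise d = d#quack ^ "!"

let noises = [ make_noise (new mallard); make_noise (new rubber_duck) ]
```

With nominal typing, the equivalent would require going back to both classes to declare that they implement a new interface.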

Further Reading

  • Objective ML: An effective object-oriented extension to ML - a comprehensive description of the topic.

  • François Pottier and Didier Rémy. The Essence of ML Type Inference. In Benjamin C. Pierce, editor, Advanced Topics in Types and Programming Languages, MIT Press, 2005. - See section 10.8 for a very thorough and detailed explanation of rows.

  • Simple Type Inference for Structural Polymorphism - for a detailed explanation of the interaction between structural and row polymorphism in the presence of type inference.


0) As was pointed out in the comments by @nekketsuuu, I was using the terminology a little bit loosely, as my intention was to give an easy-to-understand, high-level idea without going deep into details. I've revised the post since then to make it a little bit stricter.

1) Although OCaml provides classes with inheritance and a notion of subtype, it is still not subtyping polymorphism according to the common definition, as it's not nominal. This should become clearer from the rest of the answer.

2) I'm just fixing the terminology; I'm not claiming that my definition is right. Many people think that a type denotes the representation of a value, and historically this is correct.

3) Perhaps a better regexp would be <.*; quacks : unit; .*> but I think you get the idea.

4) Thus OCaml doesn't have subtyping polymorphism, although it has a notion of subtype. When you specify a type, it will not match a subtype; it will only match literally, and you need to use an explicit upcasting operator to make a value of type T applicable in a context where super(T) is expected. So although there is subtyping in OCaml, it is not about polymorphism.

5) And although the lattice requirement doesn't look impossible, in real life it is hard to impose this restriction on hierarchies, and if it is imposed, the precision of the type inference will always be bounded by the precision of the type hierarchy. So in practice it doesn't work, cf. Scala.

(skip this note on a first read) Though in OCaml there exist row variables, which are used to embed row polymorphism into the OCaml type inference, which has only parametric polymorphism.

‡) Often the word "typing" is used interchangeably with "type system" to refer to a particular set of rules in the overall type system. For example, sometimes we say "OCaml has row typing" to denote the fact that the OCaml type system provides rules for "row polymorphism".
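The point of note 4, that a closed object type matches only literally and that a subtype needs an explicit upcast, can be sketched like this. The classes animal and dog and the function describe are assumptions made up for the example.

```ocaml
class animal = object
  method name = "animal"
end

(* dog has every method of animal plus one more, so its type
   is a subtype of animal's type. *)
class dog = object
  method name = "dog"
  method bark = "Woof"
end

(* The closed type animal = < name : string > has no "..",
   so it matches only literally. *)
let describe (a : animal) = "I am a " ^ a#name

let d = new dog

(* describe d   would be rejected: d has the extra method bark. *)

(* The explicit coercion operator :> performs the upcast. *)
let s = describe (d :> animal)
```

If describe were instead given the open type < name : string; .. > -> string, the call describe d would be accepted with no coercion; that is row polymorphism rather than subtyping.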

Row types are weird. And very powerful.

Row types are used to implement objects and polymorphic variants in OCaml.

But first, here's what we cannot do without row types:

type t1 = { a : int; b : string; }
type t2 = { a : int; c : bool; }

let print_a x = print_int x.a

let ab = { a = 42; b = "foo"; }
let ac = { a = 123; c = false; }

let () =
 print_a ab;
 print_a ac

This code will of course refuse to compile, because print_a must have a unique type: its argument is either a t1 or a t2, but not both. However, in some cases we may want exactly that behavior. That's what row types are for: they give you a more "flexible" type.

In OCaml, there are two main uses of row types: objects and polymorphic variants. In terms of algebra, objects give you a "row product" and polymorphic variants a "row sum".
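A minimal sketch of both uses follows. The names double_a and describe, and the tags `Int, `Bool, and `Str, are chosen only for illustration.

```ocaml
(* "Row product": an open object type. The inferred type is
   < a : int; .. > -> int, i.e. at least a method a of type int. *)
let double_a o = 2 * o#a

(* "Row sum": an open polymorphic variant type. Because of the
   wildcard case, the inferred type is
   [> `Bool of bool | `Int of int ] -> string,
   i.e. at least these two tags, possibly more. *)
let describe = function
  | `Int n -> string_of_int n
  | `Bool b -> string_of_bool b
  | _ -> "?"

let four = double_a (object method a = 2 method b = "extra" end)
let unknown = describe (`Str "not listed above")
```

In both cases the type mentions only the fields or tags the function actually uses and leaves the rest of the row open.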

What's to note about row types is that you can end up with some subtyping to declare, and with very counterintuitive typing and semantics (notably in the case of classes).

You can check this paper for more details.

I'll complete PatJ's excellent answer with his example, written using classes.

Given the classes below:

class t1 = object
  method a = 42
  method b = "Hello world"
end

class t2 = object
  method a = 1337
  method b = false
end

And the objects below:

let o1 = new t1
let o2 = new t2

You can write the following:

let print_a t = print_int t#a;;
val print_a : < a : int; .. > -> unit = <fun>

print_a o1;;

42
- : unit = ()

print_a o2;;

1337
- : unit = ()

You can see the row type in print_a's signature. The < a : int; .. > is a type that literally means "any object that has at least a method a with signature int".
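As a compilable variation on the session above, here is a sketch that returns the value instead of printing it. The function get_a and the bindings total, objs, and all_a are hypothetical additions; the classes repeat the ones defined earlier so that the block is self-contained.

```ocaml
class t1 = object
  method a = 42
  method b = "Hello world"
end

class t2 = object
  method a = 1337
  method b = false
end

(* The inferred type is < a : 'b; .. > -> 'b; the ".." is instantiated
   separately at each call site. *)
let get_a t = t#a

let total = get_a (new t1) + get_a (new t2)

(* A list needs a single element type, so here the differing b methods
   are hidden by coercing both objects to the closed type < a : int >. *)
let objs = [ (new t1 :> < a : int >); (new t2 :> < a : int >) ]

let all_a = List.map get_a objs
```

The row variable lets one function accept both objects, but it does not make the two object types equal, which is why the list needs the explicit coercions.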
