Accepting only one variant of a sum type as an OCaml function parameter

Question

I have a large sum type originating in existing code. Let's say it looks like this:

type some_type =
  | Variant1 of int
  | Variant2 of int * string

Although both Variant1 and Variant2 are used elsewhere, I have a specific function that only operates on Variant2:

let print_the_string x =
  match x with
  | Variant2 (_, s) -> print_string s
  | _ -> failwith "this will never happen"

Since this helper function is only called from one other place, it is easy to show that it will always be called with a Variant2 input, never with a Variant1 input.

Let's say the call looks like this:

let () =
  print_the_string (Variant2(1, "hello\n"))

If Variant1 and Variant2 were separate types, I would expect OCaml to infer the type Variant2 -> unit for print_the_string. However, since they are both variants of the same sum type, OCaml infers the signature some_type -> unit.

When I encounter a program that throws an exception with a message like "this will never happen," I usually assume the original programmer did something wrong.

The current solution works, but it means that a mistake in the program would be caught at runtime, not as a compiler error, which would be preferable.

Ideally, I'd like to be able to annotate the function like this:

let print_the_string (x : some_type.Variant2) =

But, of course, that's not allowed.

Question: Is there a way to cause a compiler error in any case where Variant1 is passed to print_the_string?

A related question was asked here, but nlucarioni's and Thomas's answers simply address cleaner ways to handle incorrect calls. My goal is to have the program fail more obviously, not less.

Update: I'm accepting gallais's solution as, after playing with it, it seems like the cleanest way to implement something like this. Unfortunately, without a very messy wrapper, I don't believe any of the solutions work in the case where I cannot modify the original definition of some_type.

Answer 1

There is not enough information in your post to decide whether what follows could be useful for you. This approach is based on propagating an invariant and will play nicely if your code respects that invariant. Basically, if you do not have functions of type some_type -> some_type which turn values using Variant2 as their head constructor into ones constructed using Variant1, then you should be fine with this approach. Otherwise it gets pretty annoying pretty quickly.

Here we are going to encode the invariant "is built using Variant2" into the type by using phantom types and defining some_type as a GADT. We start by declaring types whose sole purpose is to play the role of tags.

type variant2
type variantNot2

Now, we can use these types to record which constructor was used to produce a value of some_type. This is the GADT syntax in OCaml; it differs only slightly from the ADT one, in that we can declare the return type of each constructor, and different constructors can have different return types.

type _ some_type =
  | Variant1 : int          -> variantNot2 some_type
  | Variant2 : int * string -> variant2    some_type

One could also throw in a couple of extra constructors as long as their signature records the fact that they are not Variant2. I won't deal with them henceforth, but you can try to extend the definitions given below so that they work well with these extra constructors. You can even add a print_the_second_int which will only take Variant3 and Variant4 as inputs, to check that you get the idea behind this.

  | Variant3 : int * int    -> variantNot2 some_type
  | Variant4 : float * int  -> variantNot2 some_type
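One possible way to approach that exercise, as a sketch only. The tags hasSecondInt and other and the helper second_int are hypothetical names, not part of the answer. Note the assumption: Variant3 and Variant4 get their own tag here instead of variantNot2, because a single index cannot record both "not Variant2" and "carries a second int".

```ocaml
(* Hypothetical tags: one per group of constructors we want to tell apart. *)
type variant2
type hasSecondInt
type other

type _ some_type =
  | Variant1 : int          -> other        some_type
  | Variant2 : int * string -> variant2     some_type
  | Variant3 : int * int    -> hasSecondInt some_type
  | Variant4 : float * int  -> hasSecondInt some_type

(* Only Variant3 and Variant4 can produce a [hasSecondInt some_type],
   so these two cases cover every possible input. *)
let second_int (x : hasSecondInt some_type) : int =
  match x with
  | Variant3 (_, n) -> n
  | Variant4 (_, n) -> n

let print_the_second_int x = print_int (second_int x)
```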

Now, the type of print_the_string can be extremely precise: we are only interested in elements of some_type which have been built using the constructor Variant2. In other words, the input of print_the_string should have type variant2 some_type. And the compiler can check statically that Variant2 is the only constructor possible for values of that type.

let print_the_string (x : variant2 some_type) : unit =
  match x with Variant2 (_, s) -> print_string s
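Putting the pieces so far into one file gives roughly the following. This is a sketch assembled from the definitions above; the helper the_string is a hypothetical addition, split out only so the result can be inspected without capturing output.

```ocaml
(* Phantom tags and GADT, as declared in the answer. *)
type variant2
type variantNot2

type _ some_type =
  | Variant1 : int          -> variantNot2 some_type
  | Variant2 : int * string -> variant2    some_type

(* Hypothetical helper: returns the string instead of printing it. *)
let the_string (x : variant2 some_type) : string =
  match x with Variant2 (_, s) -> s

let print_the_string (x : variant2 some_type) : unit =
  print_string (the_string x)

(* The call from the question type-checks unchanged. *)
let () = print_the_string (Variant2 (1, "hello\n"))

(* Uncommenting the next line should be rejected at compile time, because
   the argument has type [variantNot2 some_type]:
   let () = print_the_string (Variant1 1) *)
```

The single-case match needs no catch-all branch: the tag types are distinct types declared in the same file, so the compiler can rule out Variant1 for a variant2 some_type.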

OK. But what if we have a value of type 'a some_type, because it was handed over to us by a client, or we built it by tossing a coin, and so on? Well, there's no magic there: if you want to use print_the_string, you need to make sure that this value has been built using the Variant2 constructor. You can either try to cast the value to a variant2 some_type (but this may fail, hence the use of the option type):

let fromVariant2 : type a. a some_type -> (variant2 some_type) option = function
  | Variant2 _ as x -> Some x
  | Variant1 _      -> None
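As a usage sketch for the cast: values with different tags cannot share a list directly, so the example below hides the tag behind an existential wrapper. The wrapper type any and the function strings_of are hypothetical names introduced for illustration, and List.filter_map assumes OCaml 4.08 or later.

```ocaml
type variant2
type variantNot2

type _ some_type =
  | Variant1 : int          -> variantNot2 some_type
  | Variant2 : int * string -> variant2    some_type

let fromVariant2 : type a. a some_type -> variant2 some_type option = function
  | Variant2 _ as x -> Some x
  | Variant1 _      -> None

(* Hypothetical wrapper that forgets the tag, so that values built with
   different constructors can be stored together. *)
type any = Any : _ some_type -> any

(* Keep only the strings carried by Variant2 values. *)
let strings_of (values : any list) : string list =
  List.filter_map
    (fun (Any v) ->
       match fromVariant2 v with
       | Some (Variant2 (_, s)) -> Some s
       | None -> None)
    values
```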

Or (even better!) decide in which realm the value lives:

type ('a, 'b) either = | Left  of 'a | Right of 'b

let em : type a. a some_type -> (variant2 some_type, variantNot2 some_type) either =
   fun x -> match x with
   | Variant1 _ -> Right x
   | Variant2 _ -> Left x
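A short sketch of how a caller might consume em. The function describe is a hypothetical example, not part of the answer; the point is that each branch receives a precisely tagged value, so one constructor pattern per branch is enough.

```ocaml
type variant2
type variantNot2

type _ some_type =
  | Variant1 : int          -> variantNot2 some_type
  | Variant2 : int * string -> variant2    some_type

type ('a, 'b) either = Left of 'a | Right of 'b

let em : type a. a some_type -> (variant2 some_type, variantNot2 some_type) either =
  fun x -> match x with
  | Variant1 _ -> Right x
  | Variant2 _ -> Left x

(* Hypothetical consumer: Left can only hold a Variant2 and Right can only
   hold a Variant1, so no catch-all case is needed. *)
let describe x =
  match em x with
  | Left (Variant2 (_, s)) -> "Variant2 carrying " ^ s
  | Right (Variant1 n)     -> "Variant1 carrying " ^ string_of_int n
```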

Answer 2

My solution would be to give print_the_string the type int * string -> unit: since the Variant2 tag carries no information here, you should drop it.
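A minimal sketch of this suggestion, keeping the original some_type untouched. The caller handle is a hypothetical name; the idea is that the match on the sum type happens once, at the call site, and the helper only ever sees the payload.

```ocaml
type some_type =
  | Variant1 of int
  | Variant2 of int * string

(* The helper takes the payload directly, so it has no impossible case. *)
let print_the_string ((_, s) : int * string) : unit = print_string s

(* Hypothetical caller: the only place that inspects the constructor. *)
let handle (x : some_type) : unit =
  match x with
  | Variant2 (n, s) -> print_the_string (n, s)
  | Variant1 _ -> ()

let () = handle (Variant2 (1, "hello\n"))
```

This variant also works when the definition of some_type cannot be modified, since only the helper's signature changes.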

Answer 3

Type inference works toward inferring types (obviously), not values of types. But you can do what you propose with polymorphic variants. That said, I agree with Thomash.

 type v1 = [ `Variant1 of int ]
 type v2 = [ `Variant2 of int * string ]

 (* Accepts only `Variant2; passing `Variant1 is a type error. *)
 let print_the_string (`Variant2 (_, s) : v2) = print_string s
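A fuller sketch of the polymorphic-variant approach, under the assumption that the full type can be redefined as the union of the two closed types. The names the_string and string_if_variant2 are hypothetical helpers added for illustration.

```ocaml
type v1 = [ `Variant1 of int ]
type v2 = [ `Variant2 of int * string ]

(* The full type is the union of the two closed types. *)
type some_type = [ v1 | v2 ]

(* Hypothetical helper: accepts only values of the closed type v2. *)
let the_string (`Variant2 (_, s) : v2) : string = s

let print_the_string (x : v2) : unit = print_string (the_string x)

(* A value of the full type has to be narrowed by a match before the call;
   the pattern [#v2 as y] binds y at type v2. *)
let string_if_variant2 (x : some_type) : string option =
  match x with
  | #v2 as y -> Some (the_string y)
  | #v1 -> None
```

Calling print_the_string (`Variant1 1) should be rejected by the type checker, since v2 is a closed type that does not include `Variant1.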

Answer 4

Gallais provided an excellent but long answer, so I've decided to add a shorter version.

If you have a variant type and would like to add functions that work only on a subset of variants, then you can use GADTs. Consider the example:

open Core  (* this module was named Core.Std in older releases of Core *)

type _ t =
  | Int: int -> int t
  | Str: string -> string t

let str s = Str s

let uppercase (Str s) = Str (String.uppercase s)

Function uppercase has type string t -> string t and accepts only the string version of type t, so you can deconstruct the variant in place. Function str has type string -> string t, so the return type itself carries the information (a witness type) that the only variant this function can produce is Str. So when you have a value of such a type, you can deconstruct it without an explicit pattern match, since the pattern becomes irrefutable, i.e., it cannot fail.
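For readers without Core, the same idea can be sketched with the standard library only. This is an adaptation, not the answer's code: String.uppercase_ascii is swapped in for Core's String.uppercase, and contents is a hypothetical accessor added to observe the result.

```ocaml
type _ t =
  | Int : int -> int t
  | Str : string -> string t

let str s = Str s

(* Irrefutable pattern: Str is the only constructor of type [string t]. *)
let uppercase (Str s : string t) : string t = Str (String.uppercase_ascii s)

(* Hypothetical accessor, deconstructing in place in the same way. *)
let contents (Str s : string t) : string = s
```

Applying uppercase to Int 3 should fail to type-check, because int t and string t are different types.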

4 answers

Answer 1: 6 votes, accepted, 2014-07-10 01:26:17
Answer 2: 4 votes, 2014-07-09 12:20:59
Answer 3: 3 votes, 2014-07-09 13:32:27
Answer 4: 1 vote, 2014-07-13 07:43:38
