Accepting only one variant of sum type as OCaml function parameter
I have a large sum type originating in existing code. Let's say it looks like this:
type some_type =
| Variant1 of int
| Variant2 of int * string
Although both Variant1 and Variant2 are used elsewhere, I have a specific function that only operates on Variant2:
let print_the_string x =
  match x with
  | Variant2 (_, s) -> print_string s
  | _ -> raise (Failure "this will never happen")
Since this helper function is only called from one other place, it is easy to show that it will always be called with an input of Variant2, never with an input of Variant1.
Let's say the call looks like this:
let () =
print_the_string (Variant2(1, "hello\n"))
If Variant1 and Variant2 were separate types, I would expect OCaml to infer the type Variant2 -> unit for print_the_string. However, since they are both variants of the same sum type, OCaml infers the signature some_type -> unit.
When I encounter a program that throws an exception with a message like "this will never happen," I usually assume the original programmer did something wrong.
The current solution works, but it means that a mistake in the program would be caught at runtime rather than as a compiler error, which would be preferable.
Ideally, I'd like to be able to annotate the function like this:
let print_the_string (x : some_type.Variant2) =
But, of course, that's not allowed.
Question: Is there a way to cause a compiler error whenever Variant1 is passed to print_the_string?
A related question was asked here, but nlucarioni and Thomas's answers simply address cleaner ways to handle incorrect calls. My goal is to have the program fail more obviously, not less.
Update: I'm accepting gallais's solution as, after playing with it, it seems like the cleanest way to implement something like this. Unfortunately, without a very messy wrapper, I don't believe any of the solutions work in the case where I cannot modify the original definition of some_type.
There is not enough information in your post to decide whether what follows could be useful for you. This approach is based on propagating an invariant and will play nicely if your code is invariant-respecting. Basically, if you do not have functions of type some_type -> some_type which turn values using Variant2 as their head constructor into ones constructed using Variant1, then you should be fine with this approach. Otherwise it gets pretty annoying pretty quickly.
Here we are going to encode the invariant "is built using Variant2" into the type by using phantom types and defining some_type as a GADT. We start by declaring types whose sole purpose is to play the role of tags.
type variant2
type variantNot2
Now, we can use these types to record which constructor was used to produce a value of some_type. This is the GADT syntax in OCaml; it's just slightly different from the ADT one in the sense that we can declare what the return type of a constructor is, and different constructors can have different return types.
type _ some_type =
| Variant1 : int -> variantNot2 some_type
| Variant2 : int * string -> variant2 some_type
One could also throw in a couple of extra constructors as long as their signature records the fact that they are not Variant2. I won't deal with them henceforth, but you can try to extend the definitions given below so that they'll work well with these extra constructors. You can even add a print_the_second_int which will only take Variant3 and Variant4 as inputs to check that you get the idea behind this.
| Variant3 : int * int -> variantNot2 some_type
| Variant4 : float * int -> variantNot2 some_type
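For the print_the_second_int exercise, note that a single tag is not enough: Variant1, Variant3 and Variant4 all carry variantNot2, so that tag alone cannot rule out Variant1. One possible way around this, offered only as a sketch, is to add a second index. The tags has_second_int and no_second_int, the two-parameter version of some_type, the dummy tag constructors and the helper second_int are all assumptions of this example, not part of the definitions in this answer.

```ocaml
(* Tags get a dummy constructor so the checker can always tell them apart. *)
type variant2 = Variant2_tag
type variantNot2 = VariantNot2_tag
type has_second_int = Has_second_int_tag (* hypothetical tag *)
type no_second_int = No_second_int_tag   (* hypothetical tag *)

(* First index: built with Variant2 or not.
   Second index: carries a second int or not. *)
type (_, _) some_type =
  | Variant1 : int -> (variantNot2, no_second_int) some_type
  | Variant2 : int * string -> (variant2, no_second_int) some_type
  | Variant3 : int * int -> (variantNot2, has_second_int) some_type
  | Variant4 : float * int -> (variantNot2, has_second_int) some_type

(* Only Variant3 and Variant4 fit this type, so two cases are exhaustive. *)
let second_int (x : (variantNot2, has_second_int) some_type) : int =
  match x with
  | Variant3 (_, n) -> n
  | Variant4 (_, n) -> n

let print_the_second_int x = print_int (second_int x)

let () = print_the_second_int (Variant4 (2.5, 7))
```

With this encoding, passing Variant1 or Variant2 to print_the_second_int should be a type error, because their second index is no_second_int.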
Now, the type of print_the_string can be extremely precise: we are only interested in elements of some_type which have been built using the constructor Variant2. In other words, the input of print_the_string should have type variant2 some_type. And the compiler can check statically that Variant2 is the only constructor possible for values of that type.
let print_the_string (x : variant2 some_type) : unit =
match x with Variant2 (_, s) -> print_string s
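To see the effect end to end, here is a minimal self-contained sketch that puts the pieces so far together. The tag types are given a dummy constructor here (an assumption of this sketch, not part of the answer) so that they stay distinguishable even if they are later hidden behind a signature, and string_of_variant2 is a hypothetical helper added only so that the result can be checked.

```ocaml
(* Tag types; the dummy constructors are an assumption of this sketch. *)
type variant2 = Variant2_tag
type variantNot2 = VariantNot2_tag

type _ some_type =
  | Variant1 : int -> variantNot2 some_type
  | Variant2 : int * string -> variant2 some_type

(* Hypothetical helper: only Variant2 can inhabit [variant2 some_type],
   so the single-case match is exhaustive. *)
let string_of_variant2 (x : variant2 some_type) : string =
  match x with Variant2 (_, s) -> s

let print_the_string (x : variant2 some_type) : unit =
  print_string (string_of_variant2 x)

let () = print_the_string (Variant2 (1, "hello\n"))

(* Rejected at compile time, because [Variant1 1] has type
   [variantNot2 some_type]:
   let () = print_the_string (Variant1 1) *)
```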
Ok. But what if we have a value of type 'a some_type because it was handed over to us by a client, we built it tossing a coin, etc.? Well, there's no magic there: if you want to use print_the_string, you need to make sure that this value has been built using a Variant2 constructor. You can either try to cast the value to a variant2 some_type one (but this may fail, hence the use of the option type):
let fromVariant2 : type a. a some_type -> (variant2 some_type) option = function
| Variant2 _ as x -> Some x
| Variant1 _ -> None
Or (even better!) decide in which realm the value lives:
type ('a, 'b) either = | Left of 'a | Right of 'b
let em : type a. a some_type -> (variant2 some_type, variantNot2 some_type) either =
fun x -> match x with
| Variant1 _ -> Right x
| Variant2 _ -> Left x
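As a rough usage sketch, the two conversions above could be called like this on a value whose tag is not known in advance. The helpers the_string, describe and print_if_variant2 are hypothetical names introduced for this example, and the dummy constructors on the tag types are likewise an assumption.

```ocaml
type variant2 = Variant2_tag
type variantNot2 = VariantNot2_tag

type _ some_type =
  | Variant1 : int -> variantNot2 some_type
  | Variant2 : int * string -> variant2 some_type

type ('a, 'b) either = Left of 'a | Right of 'b

let fromVariant2 : type a. a some_type -> variant2 some_type option = function
  | Variant2 _ as x -> Some x
  | Variant1 _ -> None

let em : type a. a some_type -> (variant2 some_type, variantNot2 some_type) either =
  fun x -> match x with
  | Variant1 _ -> Right x
  | Variant2 _ -> Left x

(* Hypothetical accessor that is total on [variant2 some_type]. *)
let the_string (x : variant2 some_type) : string =
  match x with Variant2 (_, s) -> s

(* Decide the realm first, then use the precisely typed value. *)
let describe x =
  match em x with
  | Left v2 -> "Variant2 carrying " ^ the_string v2
  | Right _ -> "not a Variant2"

(* The option-based cast: do nothing when the cast fails. *)
let print_if_variant2 x =
  match fromVariant2 x with
  | Some v2 -> print_string (the_string v2)
  | None -> ()

let () = print_if_variant2 (Variant2 (1, "hello\n"))
```

The runtime check has not disappeared, but it now happens once, at the boundary where the tag is unknown, and everything downstream of it is checked statically.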
My solution would be to use print_the_string : int * string -> unit. Since the Variant2 part provides no information, you should drop it.
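A minimal sketch of that suggestion, assuming the original some_type is left unchanged. The caller handle is a hypothetical name for the single call site mentioned in the question.

```ocaml
type some_type =
  | Variant1 of int
  | Variant2 of int * string

(* The function takes only the payload, so no impossible case remains. *)
let print_the_string ((_, s) : int * string) : unit = print_string s

(* Hypothetical caller: the match on the sum type happens once, here. *)
let handle (x : some_type) : unit =
  match x with
  | Variant1 _ -> ()
  | Variant2 (a, s) -> print_the_string (a, s)

let () = handle (Variant2 (1, "hello\n"))
```

This also works when the definition of some_type cannot be modified, since only the helper's signature changes.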
Type inference works toward inferring types (obviously), not values of types. But you can do what you propose with polymorphic variants. Although, I agree with Thomash.
type v1 = [ `Variant1 of int ]
type v2 = [ `Variant2 of int * string ]
let print_the_string (`Variant2 (_, s)) = print_string s
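A slightly fuller sketch of the polymorphic-variant approach, under the assumption that the sum type can be redefined as the union of v1 and v2. The names the_string and handle are hypothetical and exist only for illustration.

```ocaml
type v1 = [ `Variant1 of int ]
type v2 = [ `Variant2 of int * string ]
type some_type = [ v1 | v2 ]

(* Hypothetical helper; the annotation closes the type to Variant2 only. *)
let the_string (`Variant2 (_, s) : v2) : string = s

let print_the_string (x : v2) : unit = print_string (the_string x)

let () = print_the_string (`Variant2 (1, "hello\n"))

(* A value of the full type must be narrowed by a match before the call. *)
let handle (x : some_type) : unit =
  match x with
  | `Variant1 _ -> ()
  | #v2 as y -> print_the_string y

(* Rejected at compile time: print_the_string (`Variant1 1) *)
```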
Gallais provided an excellent but long answer, so I've decided to add a shorter version.
If you have a variant type and would like to add functions that work only on a subset of variants, then you can use GADTs. Consider the example:
open Core.Std
type _ t =
| Int: int -> int t
| Str: string -> string t
let str s = Str s
let uppercase (Str s) = Str (String.uppercase s)
Function uppercase has type string t -> string t and accepts only the string version of type t, so you can deconstruct the variant in place. Function str has type string -> string t, so the return type itself carries information (a witness type) that the only possible variant produced by this function is Str. So when you have a value of such a type, you can easily deconstruct it without using explicit pattern matching, since the pattern becomes irrefutable, i.e., it can't fail.
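For readers without Core installed, the same idea can be sketched with the standard library only. This variant swaps in String.uppercase_ascii for Core's String.uppercase, and to_string is a hypothetical accessor added to inspect the result.

```ocaml
type _ t =
  | Int : int -> int t
  | Str : string -> string t

let str s = Str s

(* The pattern is irrefutable: Str is the only constructor of [string t]. *)
let uppercase (Str s : string t) : string t = Str (String.uppercase_ascii s)

(* Hypothetical accessor, used here only to inspect the result. *)
let to_string (Str s : string t) : string = s

let () = print_endline (to_string (uppercase (str "hello")))
```

Calling uppercase (Int 1) should be rejected by the type checker, since Int 1 has type int t.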