Accepting only one variant of sum type as OCaml function parameter

I have a large sum type originating in existing code. Let's say it looks like this:

type some_type =
  | Variant1 of int
  | Variant2 of int * string

Although both Variant1 and Variant2 are used elsewhere, I have a specific function that only operates on Variant2:

let print_the_string x =
  match x with
  | Variant2 (_, s) -> print_string s
  | _ -> raise (Failure "this will never happen")

Since this helper function is only called from one other place, it is easy to show that it will always be called with an input of Variant2, never with an input of Variant1.

Let's say the call looks like this:

let () =
  print_the_string (Variant2(1, "hello\n"))

If Variant1 and Variant2 were separate types, I would expect OCaml to infer the type Variant2 -> unit for print_the_string. However, since they are both variants of the same sum type, OCaml infers the signature some_type -> unit.

When I encounter a program that throws an exception with a message like "this will never happen," I usually assume the original programmer did something wrong.

The current solution works, but it means that a mistake in the program would be caught at runtime, not as a compiler error as would be preferable.

Ideally, I'd like to be able to annotate the function like this:

let print_the_string (x : some_type.Variant2) =

But, of course, that's not allowed.

Question: Is there a way to cause a compiler error in any case where Variant1 is passed to print_the_string?

A related question was asked here, but nlucarioni's and Thomas's answers simply address cleaner ways to handle incorrect calls. My goal is to have the program fail more obviously, not less.


Update: I'm accepting gallais's solution as, after playing with it, it seems like the cleanest way to implement something like this. Unfortunately, without a very messy wrapper, I don't believe any of the solutions work in the case where I cannot modify the original definition of some_type.

There is not enough information in your post to decide whether what follows could be useful for you. This approach is based on propagating an invariant and will play nicely if your code is invariant-respecting. Basically, if you do not have functions of type some_type -> some_type which turn values using Variant2 as their head constructor into ones constructed using Variant1 then you should be fine with this approach. Otherwise it gets pretty annoying pretty quickly.

Here we are going to encode the invariant "is built using Variant2" into the type by using phantom types and defining some_type as a GADT. We start by declaring types whose sole purpose is to play the role of tags.

type variant2
type variantNot2

Now, we can use these types to record which constructor was used to produce a value of some_type. This is the GADT syntax in OCaml; it differs slightly from the ADT one in that we can declare the return type of each constructor, and different constructors can have different return types.

type _ some_type =
  | Variant1 : int          -> variantNot2 some_type
  | Variant2 : int * string -> variant2    some_type

One could also throw in a couple of extra constructors as long as their signatures record the fact that they are not Variant2. I won't deal with them henceforth, but you can try to extend the definitions given below so that they work with these extra constructors. You can even add a print_the_second_int which will only take Variant3 and Variant4 as inputs to check that you get the idea behind this.

  | Variant3 : int * int    -> variantNot2 some_type
  | Variant4 : float * int  -> variantNot2 some_type

Now, the type of print_the_string can be extremely precise: we are only interested in elements of some_type which have been built using the constructor Variant2. In other words, the input of print_the_string should have type variant2 some_type. And the compiler can check statically that Variant2 is the only constructor possible for values of that type.

let print_the_string (x : variant2 some_type) : unit =
  match x with Variant2 (_, s) -> print_string s
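To see the check in action, here is a minimal self-contained sketch that repeats the definitions above and calls the function. The commented-out call is the one the compiler is expected to reject; the exact wording of the error depends on the compiler version, so it is not quoted here.

```ocaml
(* Tag types: they have no values and only serve as type-level markers. *)
type variant2
type variantNot2

type _ some_type =
  | Variant1 : int          -> variantNot2 some_type
  | Variant2 : int * string -> variant2    some_type

(* Only values tagged variant2 are accepted, so a single case is exhaustive. *)
let print_the_string (x : variant2 some_type) : unit =
  match x with Variant2 (_, s) -> print_string s

(* Accepted: Variant2 (...) has type variant2 some_type. *)
let () = print_the_string (Variant2 (1, "hello\n"))

(* Rejected at compile time, because Variant1 1 has type
   variantNot2 some_type:
   let () = print_the_string (Variant1 1) *)
```

Note that the two tags are told apart because they are declared in the same module as some_type; if they were exported as abstract types through a signature, other modules might no longer be able to prove them distinct.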

OK. But what if we have a value of type 'a some_type because it was handed over to us by a client, we built it tossing a coin, etc.? Well, there's no magic there: if you want to use print_the_string, you need to make sure that this value has been built using a Variant2 constructor. You can either try to cast the value to a variant2 some_type one (but this may fail, hence the use of the option type):

let fromVariant2 : type a. a some_type -> (variant2 some_type) option = function
  | Variant2 _ as x -> Some x
  | Variant1 _      -> None
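As a rough usage sketch, the cast can be combined with print_the_string as follows. The helper print_if_variant2 is a hypothetical name introduced only for this illustration; the other definitions are repeated from above so the snippet stands on its own.

```ocaml
type variant2
type variantNot2

type _ some_type =
  | Variant1 : int          -> variantNot2 some_type
  | Variant2 : int * string -> variant2    some_type

let print_the_string (x : variant2 some_type) : unit =
  match x with Variant2 (_, s) -> print_string s

let fromVariant2 : type a. a some_type -> variant2 some_type option = function
  | Variant2 _ as x -> Some x
  | Variant1 _      -> None

(* Hypothetical helper: prints the string when the value turns out to be a
   Variant2 and reports whether anything was printed. *)
let print_if_variant2 : type a. a some_type -> bool = fun x ->
  match fromVariant2 x with
  | Some v -> print_the_string v; true
  | None   -> false

let () =
  assert (print_if_variant2 (Variant2 (1, "hello\n")));
  assert (not (print_if_variant2 (Variant1 1)))
```

The failure case is still handled at runtime here, but it is confined to the one place where the tag is genuinely unknown.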

Or (even better!) decide in which realm the value lives:

type ('a, 'b) either = | Left  of 'a | Right of 'b

let em : type a. a some_type -> (variant2 some_type, variantNot2 some_type) either =
   fun x -> match x with
   | Variant1 _ -> Right x
   | Variant2 _ -> Left x
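A possible way to consume the result of em is sketched below. The function describe is a hypothetical example, not part of the original code; the point is that each branch receives a value whose tag is known, so each inner pattern needs only the constructors allowed for that tag.

```ocaml
type variant2
type variantNot2

type _ some_type =
  | Variant1 : int          -> variantNot2 some_type
  | Variant2 : int * string -> variant2    some_type

type ('a, 'b) either = Left of 'a | Right of 'b

let em : type a. a some_type -> (variant2 some_type, variantNot2 some_type) either =
  fun x -> match x with
  | Variant1 _ -> Right x
  | Variant2 _ -> Left x

(* Hypothetical consumer: Left can only carry a Variant2 and Right can only
   carry a Variant1, so these two cases cover every possible value. *)
let describe : type a. a some_type -> string = fun x ->
  match em x with
  | Left (Variant2 (_, s)) -> "Variant2 carrying " ^ s
  | Right (Variant1 n)     -> "Variant1 carrying " ^ string_of_int n
```

Unlike the option-returning cast, this version keeps the precise type on both sides, so no information is thrown away when the value is not a Variant2.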

My solution would be to use print_the_string : int * string -> unit. Since the Variant2 part provides no information here, you should drop it.
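A minimal sketch of that idea, assuming the original non-GADT definition of some_type from the question. The caller handle is a hypothetical name used only to show where the match would move to.

```ocaml
type some_type =
  | Variant1 of int
  | Variant2 of int * string

(* The helper takes the payload of Variant2 directly, so it cannot be
   handed a Variant1 by mistake. *)
let print_the_string ((_, s) : int * string) : unit = print_string s

(* Hypothetical caller: the match happens where the variant is known.
   Variant2 has two arguments rather than one tuple argument, so the pair
   is rebuilt explicitly. *)
let handle (x : some_type) : unit =
  match x with
  | Variant2 (a, s) -> print_the_string (a, s)
  | Variant1 _      -> ()

let () = handle (Variant2 (1, "hello\n"))
```

This works without touching the definition of some_type, at the cost of losing the connection between the pair and the constructor it came from.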

Type inference works toward inferring types (obviously), not which values of a type can occur. But you can do what you propose with polymorphic variants, although I agree with Thomash.

type v1 = [ `Variant1 of int ]
type v2 = [ `Variant2 of int * string ]

(* Accepts only `Variant2; passing `Variant1 is a type error. *)
let print_the_string (`Variant2 (_, s) : v2) = print_string s
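For illustration, here is one way the polymorphic-variant approach might be applied to the function from the question. The union type some_type and the function dispatch are assumptions made for this sketch; they show how a value of the full type can be narrowed to v2 with a #v2 pattern.

```ocaml
type v1 = [ `Variant1 of int ]
type v2 = [ `Variant2 of int * string ]

(* Assumed for this sketch: the full type is the union of both. *)
type some_type = [ v1 | v2 ]

(* Accepts only `Variant2, so the single pattern is exhaustive. *)
let print_the_string (`Variant2 (_, s) : v2) : unit = print_string s

(* Hypothetical dispatcher: the #v2 pattern narrows the value to type v2. *)
let dispatch (x : some_type) : unit =
  match x with
  | #v2 as v    -> print_the_string v
  | `Variant1 _ -> ()

let () = print_the_string (`Variant2 (1, "hello\n"))

(* Rejected at compile time:
   let () = print_the_string (`Variant1 1) *)
```

Like the GADT version, this requires changing how some_type is defined, since ordinary constructors have to be replaced by backquoted tags.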

Gallais provided an excellent but long answer, so I've decided to add a shorter version.

If you have a variant type and would like to add functions that work only on a subset of its variants, then you can use GADTs. Consider the example:

open Core  (* spelled Core.Std in older Core releases *)

type _ t =
  | Int: int -> int t
  | Str: string -> string t

let str s = Str s

let uppercase (Str s) = Str (String.uppercase s)

The function uppercase has type string t -> string t and accepts only the string version of type t, so you can deconstruct the variant in place. The function str has type string -> string t, so its return type carries information (a witness type) that the only variant this function can produce is Str. So when you have a value of such a type, you can deconstruct it without an explicit match expression, since the pattern becomes irrefutable, i.e., it can't fail.
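The same idea can be tried without Core. The sketch below assumes only the standard library, so it uses String.uppercase_ascii in place of Core's String.uppercase; the helper to_string is a hypothetical addition showing a function that still accepts every variant.

```ocaml
type _ t =
  | Int : int -> int t
  | Str : string -> string t

let str s = Str s

(* Irrefutable pattern: a value of type string t can only be built with Str. *)
let uppercase (Str s) = Str (String.uppercase_ascii s)

(* Hypothetical helper working on every variant; the locally abstract type
   lets each branch refine the type parameter. *)
let to_string : type a. a t -> string = function
  | Int n -> string_of_int n
  | Str s -> s

let () =
  assert (to_string (uppercase (str "hello")) = "HELLO");
  assert (to_string (Int 42) = "42")

(* Rejected at compile time, because Int 1 has type int t:
   let _ = uppercase (Int 1) *)
```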
