
Type constraints on dimensionality of vectors in F# and Haskell (Dependent Types)

I'm new to F# and Haskell and am implementing a project in order to determine which language I would prefer to devote more time to.

I have numerous situations where I expect a given numerical type to have given dimensions based on parameters given to a top-level function (i.e., at runtime). For example, in this F# snippet, I have

type DataStreamItem = LinearAlgebra.Vector<float32>

type Ball =
    {R : float32;
     X : DataStreamItem}

and I expect all instances of type DataStreamItem to have D dimensions.

My question is in the interests of algorithm development and debugging, since such shape-mismatch bugs can be a headache to pin down but should be a non-issue once the algorithm is up and running:

Is there a way, in either F# or Haskell, to constrain DataStreamItem and/or Ball to have dimensions of D? Or do I need to resort to pattern matching on every calculation?

If the latter is the case, are there any good, light-weight paradigms to catch such constraint violations as soon as they occur (and that can be removed when performance is critical)?

Edit:

To clarify the sense in which D is constrained:

D is defined such that if you expressed the algorithm of the function main(DataStream) as a computation graph, all of the intermediate calculations would depend on the dimension D for the execution of main(DataStream). The simplest example I can think of would be a dot product of M with DataStreamItem: the dimension of DataStream would determine the dimension parameters of M.

Another Edit:

A week later, I found the following blog post outlining precisely what I was looking for in dependent types in Haskell:

https://blog.jle.im/entry/practical-dependent-types-in-haskell-1.html

And Another: This Reddit thread contains some discussion of dependent types in Haskell and links to the quite interesting dissertation proposal of R. Eisenberg.

Neither Haskell's nor F#'s type system is rich enough to (directly) express statements of the sort "N nested instances of a recursive type T, where N is between 2 and 6" or "a string of characters exactly 6 long". Not in those exact terms, at least.

I mean, sure, you can always express such a 6-long string type as type String6 = String6 of char*char*char*char*char*char or some variant of the sort (which technically should be enough for your particular example with vectors, unless you're not telling us the whole example), but you can't say something like type String6 = s:string{s.Length=6} and, more importantly, you can't define functions of the form concat: String<n> -> String<m> -> String<n+m>, where n and m represent string lengths.

But you're not the first person asking this question. This research direction does exist, and is called "dependent types", and I can express the gist of it most generally as "having higher-order, more powerful operations on types" (as opposed to just union and intersection, as we have in ML languages) - notice how in the example above I parametrize the type String with a number, not another type, and then do arithmetic on that number.

The most prominent language prototypes (that I know of) in this direction are Agda, Idris, F*, and Coq (not really the full deal AFAIK). Check them out, but beware: this is kind of the edge of tomorrow, and I wouldn't advise starting a big project based on those languages.

(edit: apparently you can do certain tricks in Haskell to simulate dependent types, but it's not very convenient, and you have to enable UndecidableInstances )
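As a rough illustration of the Haskell trick mentioned in the edit above, the sketch below encodes lengths as type-level Peano numbers, so that an append function gets the String<n> -> String<m> -> String<n+m> shape. The names Nat, Vec, Plus, append, dot and toList are made up for this example and do not come from any library; this is one possible encoding, not the only one.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies #-}

import Data.Kind (Type)

-- Peano naturals; DataKinds promotes Z and S to the type level.
-- (Hypothetical names, defined here rather than taken from a library.)
data Nat = Z | S Nat

-- A list whose length n is part of its type.
data Vec (n :: Nat) (a :: Type) where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Type-level addition, by recursion on the first argument.
type family Plus (n :: Nat) (m :: Nat) :: Nat where
  Plus 'Z     m = m
  Plus ('S n) m = 'S (Plus n m)

-- The result length is the sum of the input lengths; the compiler checks it.
append :: Vec n a -> Vec m a -> Vec (Plus n m) a
append VNil         ys = ys
append (VCons x xs) ys = VCons x (append xs ys)

-- Both arguments must have the same length n; a mismatch is a type error.
dot :: Num a => Vec n a -> Vec n a -> a
dot VNil         VNil         = 0
dot (VCons x xs) (VCons y ys) = x * y + dot xs ys

-- Forget the length again, e.g. for printing.
toList :: Vec n a -> [a]
toList VNil         = []
toList (VCons x xs) = x : toList xs
```

Loaded into GHCi, an expression such as dot (VCons 1 VNil) (VCons 1 (VCons 2 VNil)) should be rejected by the type checker, since the two lengths differ.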

Alternatively, you could go with a weaker solution of doing the checks at runtime. The general gist is: wrap your vector types in a plain wrapper, don't allow direct construction of it, but provide constructor functions instead, and make those constructor functions ensure the desired property (i.e. length). Something like:

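type Stream4 = private Stream4 of DataStreamItem
   with
      // Returns None when the vector does not have exactly 4 elements.
      static member tryCreate (item: DataStreamItem) =
         if item.Length = 4 then Some (Stream4 item)
         else None

      // Alternatively, fail loudly on a wrong length:
      static member create (item: DataStreamItem) =
         if item.Length <> 4 then failwith "Expected a 4-long vector."
         Stream4 item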

Here is a fuller explanation of the approach from Scott Wlaschin: constrained strings.
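The same wrapper idea carries over to Haskell. The following is only a sketch: Stream4, mkStream4, unsafeStream4 and getStream4 are hypothetical names, and a plain list stands in for the vector type to keep the example dependency-free. The second constructor uses assert from Control.Exception, which GHC normally drops when optimisation is on (-O) or when -fignore-asserts is given, so it may suit the "remove the check when performance is critical" requirement from the question.

```haskell
import Control.Exception (assert)

-- In a real module, export Stream4 without its data constructor so that
-- the functions below are the only way to build a value.
newtype Stream4 = Stream4 [Float]
  deriving (Eq, Show)

-- Checked construction: Nothing when the length is not 4.
mkStream4 :: [Float] -> Maybe Stream4
mkStream4 xs
  | length xs == 4 = Just (Stream4 xs)
  | otherwise      = Nothing

-- Debug-style construction: the assertion is checked only while GHC keeps
-- assertions enabled, so a bad length is NOT caught once they are ignored.
unsafeStream4 :: [Float] -> Stream4
unsafeStream4 xs = assert (length xs == 4) (Stream4 xs)

-- Unwrap again for the actual numerics.
getStream4 :: Stream4 -> [Float]
getStream4 (Stream4 xs) = xs
```

The trade-off is the usual one: mkStream4 forces every caller to handle the failure case, while unsafeStream4 keeps call sites simple and relies on test runs to surface mismatches.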

So if I understood correctly, you're actually not doing any type-level arithmetic, you just have a “length tag” that's shared in a chain of function calls.

This has long been possible to do in Haskell; one way that I consider quite elegant is to annotate your arrays with a standard fixed-length type of the desired length:

import qualified Data.Vector.Unboxed as VU
newtype FixVect v s = FixVect { getFixVect :: VU.Vector s }

To ensure the correct length, you only provide (polymorphic) smart constructors that construct from the fixed-length type – perfectly safe, though the actual dimension number is nowhere mentioned!

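-- VectorSpace and Scalar are assumed to come from Data.VectorSpace
-- (vector-space package).
class VectorSpace v => FiniteDimensional v where
  asFixVect :: v -> FixVect v (Scalar v)

instance FiniteDimensional Float where
  asFixVect s = FixVect $ VU.singleton s
-- The Unbox constraint is needed for (<>) on unboxed vectors; it mentions a
-- type family, which is where UndecidableInstances comes in.
instance (FiniteDimensional a, FiniteDimensional b, Scalar a ~ Scalar b, VU.Unbox (Scalar a))
      => FiniteDimensional (a,b) where
  asFixVect (a,b) = case (asFixVect a, asFixVect b) of
        (FixVect av, FixVect bv) -> FixVect $ av<>bv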

This construction from unboxed tuples is really inefficient; however, this doesn't mean you can't write efficient programs with this paradigm – if the dimension always stays constant, you only need to wrap and unwrap once, and can do all the critical operations through safe yet runtime-unchecked zips, folds and LA combinations.
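To make the "wrap once, then zip without checks" point concrete, here is a dependency-free variant of the snippet above. It is a sketch under assumptions: lists replace unboxed vectors, Scalar is declared as an associated type instead of coming from a vector-space library, and addFV and dotFV are hypothetical helper names.

```haskell
{-# LANGUAGE TypeFamilies, FlexibleContexts, UndecidableInstances #-}
-- (UndecidableInstances is listed defensively; it may not be required.)

-- v is a phantom tag: a fixed-size type such as (Float, (Float, Float)).
newtype FixVect v s = FixVect { getFixVect :: [s] }
  deriving (Eq, Show)

class FiniteDimensional v where
  type Scalar v
  asFixVect :: v -> FixVect v (Scalar v)

instance FiniteDimensional Float where
  type Scalar Float = Float
  asFixVect s = FixVect [s]

instance (FiniteDimensional a, FiniteDimensional b, Scalar a ~ Scalar b)
      => FiniteDimensional (a, b) where
  type Scalar (a, b) = Scalar a
  asFixVect (a, b) =
    FixVect (getFixVect (asFixVect a) ++ getFixVect (asFixVect b))

-- Equal tags imply equal lengths (as long as values are only built through
-- asFixVect), so these zips never silently truncate.
addFV :: Num s => FixVect v s -> FixVect v s -> FixVect v s
addFV (FixVect xs) (FixVect ys) = FixVect (zipWith (+) xs ys)

dotFV :: Num s => FixVect v s -> FixVect v s -> s
dotFV (FixVect xs) (FixVect ys) = sum (zipWith (*) xs ys)
```

For instance, asFixVect (1 :: Float, (2 :: Float, 3 :: Float)) yields a three-element vector tagged with the tuple type, and addFV only accepts a second argument carrying the same tag.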

Regardless, this approach isn't really widely used. Perhaps the single constant dimension is in fact too limiting for most relevant operations, and if you need to unwrap to tuples often it's way too inefficient. Another approach that is taking off these days is to actually tag the vectors with type-level numbers. Such numbers have become available in a usable form with the introduction of data kinds in GHC 7.4. Up until now, they're still rather unwieldy and not fit for proper arithmetic, but the upcoming 8.0 will greatly improve many aspects of this dependently-typed programming in Haskell.
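A minimal sketch of the type-level-number tagging, using only GHC.TypeLits from base. Vec, mkVec, dot, add and dotAtDim are hypothetical names invented for this illustration, and a list stands in for an efficient array. dotAtDim shows one way a dimension known only at runtime (the D of the question) might be brought to the type level with someNatVal.

```haskell
{-# LANGUAGE DataKinds, KindSignatures, ScopedTypeVariables #-}

import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownNat, Nat, SomeNat (..), natVal, someNatVal)

-- The phantom parameter n records the dimension; the payload is a plain list.
newtype Vec (n :: Nat) a = Vec [a]
  deriving (Eq, Show)

-- The only place where the length is checked at runtime.
mkVec :: forall n a. KnownNat n => [a] -> Maybe (Vec n a)
mkVec xs
  | fromIntegral (length xs) == natVal (Proxy :: Proxy n) = Just (Vec xs)
  | otherwise                                             = Nothing

-- Both arguments share n, so no further length check is needed.
dot :: Num a => Vec n a -> Vec n a -> a
dot (Vec xs) (Vec ys) = sum (zipWith (*) xs ys)

add :: Num a => Vec n a -> Vec n a -> Vec n a
add (Vec xs) (Vec ys) = Vec (zipWith (+) xs ys)

-- Dimension supplied at runtime: someNatVal turns d into a type-level n,
-- then both inputs are validated once against that same n.
dotAtDim :: Integer -> [Double] -> [Double] -> Maybe Double
dotAtDim d xs ys =
  case someNatVal d of
    Nothing -> Nothing          -- negative dimension
    Just (SomeNat (_ :: Proxy n)) -> do
      u <- mkVec xs :: Maybe (Vec n Double)
      v <- mkVec ys
      pure (dot u v)
```

With this in place, applying dot to a Vec 3 Double and a Vec 4 Double should fail to compile, while data of unknown size is validated exactly once at the boundary.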

A library that offers efficient length-indexed arrays is linear.
