简体   繁体   中英

Using F# Units of Measure with System.Numerics.Vector<T>

I struggle using F# units of measure in combination with the System.Numerics.Vector<'T> type. Let's have a look at a toy problem: Assume we have an array xs of type float<m>[] and for some reason we wanted to square all its components, resulting in an array of type float<m^2>[] . That works perfectly well with scalar code:

xs |> Array.map (fun x -> x * x) // float<m^2>[]

Now suppose we wanted to vectorize this operation by performing the multiplication in System.Numerics.Vector<float>.Count -sized chunks using SIMD, for instance like so:

open System.Numerics
let simdWidth = Vector<float>.Count
// fill with dummy data
let xs = Array.init (simdWidth * 10) (fun i -> float i * 1.0<m>)
// array to store the results
let rs: float<m^2> array = Array.zeroCreate (xs |> Array.length)
// number of SIMD operations required
let chunks = (xs |> Array.length) / simdWidth
// for simplicity, assume xs.Length % simdWidth = 0
for i = 0 to chunks - 1 do
    let v = Vector<_>(xs, i * simdWidth) // Vector<float<m>>, containing xs.[i .. i+simdWidth-1]
    let u = v * v                        // Vector<float<m>>; expected: Vector<float<m^2>>
    u.CopyTo(rs, i * simdWidth)          // units mismatch

I believe I understand why this happens: How could the F# compiler know, what System.Numerics.Vector<'T>.op_Multiply does and what arithmetic rules apply? It could literally be any operation. So how should it be able to deduce the correct units?

The question is : What the best way to make this work? How can we tell the compiler which rules apply?

Attempt 1 : Remove all units of measure information from xs and add it back later on:

// remove all UoM from all arrays
let xsWoM = Array.map (fun x -> x / 1.0<m>) xs
// ...
// perform computation using xsWoM etc.
// ...
// add back units again
let xs = Array.map (fun x -> x * 1.0<m>) xsWoM

Problems: Performs unnecessary computations and/or copy operations, defeats the purpose of vectorizing the code for performance reasons. Also, largely defeats the purpose of using UoM to begin with.

Attempt 2 : Use inline IL to change the return type of Vector<'T>.op_Multiply :

// reinterpret x to be of type 'b
let inline retype (x: 'a) : 'b = (# "" x: 'b #)
let inline (.*.) (u: Vector<float<'m>>) (v: Vector<float<'m>>): Vector<float<'m^2>> = u * v |> retype
// ...
let u = v .*. v // asserts type Vector<float<m^2>>

Problems: Doesn't require any additional operations, but uses a deprecated feature (inline IL) and isn't fully generic (only with respect to the unit of measure).

Does anyone have a better solution for this* ?

*Note that the above example really is a toy problem to demonstrate the general issue. The real program solves a much more complicate initial value problem, involving many kinds of physical quantities.

The compiler is fine with figuring out how to apply unit rules for multiplication, the problem here is that you have a wrapped type. In your first example, when you write xs |> Array.map (fun x -> x * x) , you're describing the multiplication in terms of the elements of the array rather than on the array directly.

When you have a Vector<float<m>> , the units are attached to the float rather than the Vector so when you try and multiply Vector s, the compiler won't treat that type as if it has any units.

Given the methods exposed by the class, I don't think there is an easy workaround using the Vector<'T> directly but there are options for wrapping the type.

Something like this could give you a unit-friendly vector:

type VectorWithUnits<'a, [<Measure>]'b> = 
    |VectorWithUnits of Vector<'a>

    static member inline (*) (a : VectorWithUnits<'a0, 'b0>, b : VectorWithUnits<'a0, 'b1>) 
        : VectorWithUnits<'a0, 'b0*'b1> =
        match a, b with
        |VectorWithUnits a, VectorWithUnits b -> VectorWithUnits <| a * b

In this case, the units are attached to the vector and multiplying vectors works as expected with the correct unit behaviour.

The problem is that now we can have separate and distinct unit of measure annotations on the Vector<'T> and on the float itself.

You can the turn an array of a specific type with units of measure into a set of the Vector s using:

let toFloatVectors (array : float<'m>[]) : VectorWithUnits<float,'m>[]  =
    let arrs = array |> Array.chunkBySize (Vector<float>.Count)
    arrs |> Array.map (Array.map (float) >> Vector >> VectorWithUnits)

and back:

let fromFloatVectors (vectors : VectorWithUnits<float,'m>[]) : float<'m>[] =
    let arr = Array.zeroCreate<float> (Array.length vectors)
    vectors |> Array.iteri (fun i uVec ->
        match uVec with
        |VectorWithUnits vec -> vec.CopyTo arr)
    arr |> Array.map (LanguagePrimitives.FloatWithMeasure<'m>)

A hacky alternative:

If you give up the generic type 'T you can make a float Vector behave properly through some fairly horrible boxing and runtime casting. This abuses the fact that units of measure are a compile time construct that no longer exist at runtime.

type FloatVectorWithUnits<[<Measure>]'b> = 
    |FloatVectorWithUnits of Vector<float<'b>>

    static member ( * ) (a : FloatVectorWithUnits<'b0>, b : FloatVectorWithUnits<'b1>) =
        match a, b with
        |FloatVectorWithUnits a, FloatVectorWithUnits b ->
            let c, d = box a :?> Vector<float<'b0*'b1>>, box b :?> Vector<float<'b0*'b1>>
            c * d |> FloatVectorWithUnits

I came up with a solution, that meets most of my requirements (it seems). It is inspired by TheInnerLight's ideas (wrapping Vector<'T> ), but also adds a wrapper (called ScalarField ) for the underlying array data type. This way we can track the units, while underneath we only handle raw data and can use the not unit-aware System.Numerics.Vector APIs.

A simplified, bare-bones quick and dirty implementation would look like this:

// units-aware wrapper for System.Numerics.Vector<'T>
type PackedScalars<[<Measure>] 'm> = struct
    val public Data: Vector<float>
    new (d: Vector<float>) = {Data = d}
    static member inline (*) (u: PackedScalars<'m1>, v: PackedScalars<'m2>) = u.Data * v.Data |> PackedScalars<'m1*'m2>
end

// unit-ware type, wrapping a raw array for easy stream processing
type ScalarField<[<Measure>] 'm> = struct
    val public Data: float[]
    member self.Item with inline get i                = LanguagePrimitives.FloatWithMeasure<'m> self.Data.[i]
                     and  inline set i (v: float<'m>) = self.Data.[i] <- (float v)
    member self.Packed 
           with inline get i                        = Vector<float>(self.Data, i) |> PackedScalars<'m>
           and  inline set i (v: PackedScalars<'m>) = v.Data.CopyTo(self.Data, i)
    new (d: float[]) = {Data = d}
    new (count: int) = {Data = Array.zeroCreate count}
end

We can now use both data structures to solve the sample problem in a relatively elegant, efficient way:

let xs = Array.init (simdWidth * 10) float |> ScalarField<m>    
let mutable rs = Array.zeroCreate (xs.Data |> Array.length) |> ScalarField<m^2>
let chunks = (xs.Data |> Array.length) / simdWidth
for i = 0 to chunks - 1 do
    let j = i * simdWidth
    let v = xs.Packed(j) // PackedScalars<m>
    let u = v * v        // PackedScalars<m^2>
    rs.Packed(j) <- u

On top, it might be useful to re-implement usual array operations for the unit-aware ScalarField wrapper, eg

[<CompilationRepresentation(CompilationRepresentationFlags.ModuleSuffix)>]
module ScalarField =
    let map f (sf: ScalarField<_>) =
        let mutable res = Array.zeroCreate sf.Data.Length |> ScalarField
        for i = 0 to sf.Data.Length do
           res.[i] <- f sf.[i]
        res

etc.

Drawbacks: Not generic with respect to the underlying numeric type ( float ), because there is no generic replacement for floatWithMeasure . To make it generic, we have to implement a third wrapper Scalar the also wraps the underlying primitive:

type Scalar<'a, [<Measure>] 'm> = struct
    val public Data: 'a
    new (d: 'a) = {Data = d}
end

type PackedScalars<'a, [<Measure>] 'm 
            when 'a: (new: unit -> 'a) 
            and  'a: struct 
            and  'a :> System.ValueType> = struct
    val public Data: Vector<'a>
    new (d: Vector<'a>) = {Data = d}
    static member inline (*) (u: PackedScalars<'a, 'm1>, v: PackedScalars<'a, 'm2>) = u.Data * v.Data |> PackedScalars<'a, 'm1*'m2>
end

type ScalarField<'a, [<Measure>] 'm
            when 'a: (new: unit -> 'a) 
            and  'a: struct 
            and  'a :> System.ValueType> = struct
    val public Data: 'a[]
    member self.Item with inline get i                    = Scalar<'a, 'm>(self.Data.[i])
                     and  inline set i (v: Scalar<'a,'m>) = self.Data.[i] <- v.Data
    member self.Packed 
           with inline get i                          = Vector<'a>(self.Data, i) |> PackedScalars<_,'m>
           and  inline set i (v: PackedScalars<_,'m>) = v.Data.CopyTo(self.Data, i)
    new (d:'a[]) = {Data = d}
    new (count: int) = {Data = Array.zeroCreate count}
end

... meaning that we basically track the units not by using "refinement" types like float<'m> , but solely through wrapper types with a secondary type/units argument.

I'm still hoping someone comes up with a better idea, though. :)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM