
Haskell polymorphism and typeclass instance

I am trying to write a machine learning library in Haskell to work on my Haskell skills. I came up with a general design involving a class like this:

  class Classifier classifier where
    train :: X -> y -> trainingData
    classify :: trainingData -> x -> y

For example, given a set of examples X and their true labels y, train returns trainingData, which is then used by the classify function.

So, if I want to implement KNN, I would do it like so:

data KNN = KNN Int (Int -> Int -> Float) 

where the first Int is the number of neighbors and the function is the metric that calculates the distance between two vectors.

  instance Classifier KNN where
    -- This is where I am stuck

How can I implement the Classifier type class functions so that they are generic across all of the classifiers I will create? I feel like I am treating Haskell too much like an imperative, OOP-style language, and I'd like to do this the Haskell way.

I would say you need multi-parameter type classes (optionally with functional dependencies or type families; I omit those here).

  class Classifier c s l k where
    train    :: c -> [(s, l)] -> k
    classify :: c -> k -> s -> l
    combine  :: c -> k -> k -> k

There is a four-way relationship between the classifier, sample, label, and knowledge types.

The train method derives some knowledge (k) from a set of sample (s) — label (l) pairs. The classify method uses that knowledge to infer a label for a sample. (The combine method joins two pieces of knowledge together; I don't know whether it always applies.)
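To make this concrete, here is a hypothetical sketch of a KNN instance of the four-parameter class, where "knowledge" is simply the list of stored training pairs. The sample type `[Float]`, label type `Int`, the `euclidean` metric, and the `mostCommon` helper are my assumptions, not part of the answer; note that without functional dependencies, call sites need type annotations to pick the instance.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}

import Data.List (group, maximumBy, sort, sortOn)
import Data.Ord (comparing)

-- The four-parameter class from above.
class Classifier c s l k where
  train    :: c -> [(s, l)] -> k
  classify :: c -> k -> s -> l
  combine  :: c -> k -> k -> k

-- Hypothetical KNN: number of neighbours plus a distance metric
-- on [Float] vectors (an assumption; the question used Int -> Int -> Float).
data KNN = KNN Int ([Float] -> [Float] -> Float)

-- For KNN, the "knowledge" is just the training pairs themselves.
instance Classifier KNN [Float] Int [([Float], Int)] where
  train _ pairs = pairs
  classify (KNN n dist) known x =
      -- take the n nearest stored samples and vote on their labels
      mostCommon (map snd (take n (sortOn (dist x . fst) known)))
    where
      mostCommon = head . maximumBy (comparing length) . group . sort
  combine _ = (++)

-- An example metric (assumed, not from the answer).
euclidean :: [Float] -> [Float] -> Float
euclidean a b = sqrt (sum (zipWith (\p q -> (p - q) ^ 2) a b))
```

The annotation burden at use sites (e.g. `classify knn known v :: Int`) is exactly what the functional dependencies or type families mentioned above would remove.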

Assuming your type class has no knowledge of what a classifier provides, you could do something like

class Classifier c where
  train :: [x] -> [y] -> c -> [(x,y)]
  classify :: [(x,y)] -> c -> x -> y

Here, train is given a list of samples of type x, a list of labels of type y, and a classifier of some type c, and needs to return a list of sample/label pairs.

classify takes a list of sample/label pairs (such as the one produced by train), the classifier, and a sample, and produces a new label.

(At the very least, though, I'd probably replace [(x,y)] with something like Map x y.)

The key is that the classifier itself needs to be used by both train and classify, although you don't need to know what that would look like at this time.

Your instance for KNN could then look like

instance Classifier KNN where
  train samples labels (KNN n f) = ...
  classify td (KNN n f) sample = ...

Here, n and f can be used both to create the training data, and to help pick the closest member of the training data for a sample point.
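One caveat: with x and y left fully polymorphic in the class methods, the instance cannot actually apply a concrete metric to the samples. A minimal way to make the sketch compile (my assumption, not part of the answer) is to fix the sample and label types in the signatures; the `euclidean` metric and `mostCommon` voting helper below are likewise illustrative:

```haskell
import Data.List (group, maximumBy, sort, sortOn)
import Data.Ord (comparing)

-- Sample and label types pinned down so a concrete metric typechecks
-- (an assumption; the answer leaves them polymorphic).
class Classifier c where
  train    :: [[Float]] -> [Int] -> c -> [([Float], Int)]
  classify :: [([Float], Int)] -> c -> [Float] -> Int

-- Neighbour count plus a distance metric on [Float] vectors.
data KNN = KNN Int ([Float] -> [Float] -> Float)

instance Classifier KNN where
  -- training for KNN just pairs up samples with their labels
  train samples labels _ = zip samples labels
  -- classification: find the n nearest neighbours and vote
  classify td (KNN n f) sample =
      mostCommon (map snd (take n (sortOn (f sample . fst) td)))
    where
      mostCommon = head . maximumBy (comparing length) . group . sort

-- An example metric (assumed).
euclidean :: [Float] -> [Float] -> Float
euclidean a b = sqrt (sum (zipWith (\p q -> (p - q) ^ 2) a b))
```

Keeping x and y generic while still letting instances constrain them is what the multi-parameter type class in the other answer buys you.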
