
How to return a polymorphic type in Haskell based on the results of string parsing?

TL;DR:
How can I write a function which is polymorphic in its return type? I'm working on an exercise where the task is to write a function which is capable of analyzing a String and, depending on its contents, generating either a Vector [Int], Vector [Char] or Vector [String].

Longer version:
Here are a few examples of how the intended function would behave:

  • The string "1 2\n3 4" would generate a Vector [Int] that's made up of two lists: [1,2] and [3,4].

  • The string "'t' 'i' 'c'\n't' 'a' 'c'\n't' 'o' 'e'" would generate a Vector [Char] (i.e., made up of the lists "tic", "tac" and "toe").

  • The string "\"hello\" \"world\"\n\"monad\" \"party\"" would generate a Vector [String] (i.e., ["hello","world"] and ["monad","party"]).

Error-checking/exception handling is not a concern for this particular exercise. At this stage, all testing is done purely, ie, this isn't in the realm of the IO monad.

What I have so far:

I have a function (and new datatype) which is capable of classifying a string. I also have functions (one for each of Int, Char and String) which can convert the string into the necessary Vector.

What I've tried:

  • It obviously doesn't typecheck if I stuff the three conversion functions into a single function (i.e., using a case..of structure to pattern match on the VectorType of the string).

  • I tried making a Vectorable class and defining a separate instance for each type; I quickly realized that this approach only works if the functions' arguments vary by type. In our case, the type of the argument doesn't vary (i.e., it's always a String).

My code:

A few comments

  • Parsing: The mySplitter object and the mySplit function handle the parsing. It's admittedly a crude parser based on the Splitter type and the split function from Data.List.Split.Internals.

  • Classifying: The classify function is capable of determining the final VectorType based on the string.

  • Converting: The toVectorNumber, toVectorChar and toVectorString functions are able to convert a string to type Vector [Int], Vector [Char] and Vector [String], respectively.

  • As a side note, I'm trying out CorePrelude based on a recommendation from a mentor. That's why you'll see me use the generalized versions of the normal Prelude functions.

Code:

import qualified Prelude
import CorePrelude                   

import Data.Foldable (concat, elem, any)
import Control.Monad (mfilter)
import Text.Read (read)
import Data.Char (isAlpha, isSpace)

import Data.List.Split (split)
import Data.List.Split.Internals (Splitter(..), DelimPolicy(..), CondensePolicy(..), EndPolicy(..), Delimiter(..))

import Data.Vector ()                       
import qualified Data.Vector as V           

data VectorType = Number | Character | TextString deriving (Show)

mySplitter :: [Char] -> Splitter Char
mySplitter elts = Splitter { delimiter        = Delimiter [(`elem` elts)]
                           , delimPolicy      = Drop
                           , condensePolicy   = Condense
                           , initBlankPolicy  = DropBlank
                           , finalBlankPolicy = DropBlank }

mySplit :: [Char]-> [Char]-> [[Char]]
mySplit delims = split (mySplitter delims)           

classify :: String -> VectorType
classify xs
  | '\"' `elem` cs = TextString
  | hasAlpha cs = Character
  | otherwise = Number
  where
    cs = concat $ split (mySplitter "\n") xs
    hasAlpha = any isAlpha . mfilter (/=' ')

toRows :: [Char] -> [[Char]]
toRows = mySplit "\n"

toVectorChar ::    [Char] -> Vector [Char]
toVectorChar =   let toChar = concat . mySplit " \'" 
                 in V.fromList . fmap (toChar) . toRows

toVectorNumber  :: [Char] -> Vector [Int]
toVectorNumber = let toNumber = fmap (\x -> read x :: Int) . mySplit " "
                 in  V.fromList . fmap toNumber . toRows

toVectorString  :: [Char] -> Vector [[Char]]
toVectorString = let toString = mfilter (/= " ") . mySplit "\""
                 in  V.fromList . fmap toString . toRows

You can't.

Covariant polymorphism is not supported in Haskell, and wouldn't be useful if it were.


That's basically all there is to answer. Now as to why this is so.

It's no good "returning a polymorphic value" the way OO languages like to do, because the only reason to return any value at all is to use it in other functions. Now, in OO languages you don't have functions but methods that come with the object, so it's quite easy to "return different types": each will have its suitable methods built in, and they can vary per instance. (Whether that's a good idea is another question.)

But in Haskell, the functions come from elsewhere. They don't know about implementation changes for a particular instance, so the only way such functions can safely be defined is to know every possible implementation. But if your return type is really polymorphic, that's not possible, because polymorphism is an "open" concept (it allows new implementation varieties to be added any time later).

Instead, Haskell has a very convenient and totally safe mechanism of describing a closed set of "instances" – you've actually used it yourself already! ADTs.

data PolyVector = NumbersVector (Vector [Int])
                | CharsVector (Vector [Char])
                | StringsVector (Vector [String])

That's the return type you want. The function won't be polymorphic as such, it'll simply return a more versatile type.
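To show what "use it in other functions" looks like with such a type, here is a minimal sketch of two consumers. It assumes plain lists in place of Data.Vector so that it runs with base alone, and rowCount and sumNumbers are hypothetical names chosen for illustration.

```haskell
-- Lists stand in for Data.Vector here (an assumption, to avoid extra packages).
data PolyVector
  = NumbersVector [[Int]]
  | CharsVector [String]
  | StringsVector [[String]]
  deriving (Show, Eq)

-- A consumer that works for every variant handles each constructor explicitly.
rowCount :: PolyVector -> Int
rowCount (NumbersVector rows) = length rows
rowCount (CharsVector rows)   = length rows
rowCount (StringsVector rows) = length rows

-- A consumer that only makes sense for one variant says so in its result type.
sumNumbers :: PolyVector -> Maybe Int
sumNumbers (NumbersVector rows) = Just (sum (concat rows))
sumNumbers _                    = Nothing
```

Because the set of constructors is closed, GHC can warn (with -Wincomplete-patterns) when a consumer forgets a case.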


If you insist it should be polymorphic

Now... actually, Haskell does have a way to sort of deal with "polymorphic returns", as in OO when you declare that you return a subclass of a specified class. Well, you can't "return a class" at all in Haskell; you can only return types. But those can be made to express "any instance of...". It's called existential quantification.

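{-# LANGUAGE GADTs, FlexibleInstances #-}

data PolyVector' where
  PolyVector :: YourVElemClass e => Vector [e] -> PolyVector'

-- The class needs a type parameter; its methods are left open here.
class YourVElemClass e where
  ...?
instance YourVElemClass Int
instance YourVElemClass Char
instance YourVElemClass String  -- an instance on a type synonym needs FlexibleInstances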

I don't know if that looks intriguing to you. Truth is, it's much more complicated and rather harder to use; you can't just use any of the possible results directly but can only make use of the elements through methods of YourVElemClass. GADTs can in some applications be extremely useful, but these usually involve classes with very deep mathematical motivation. YourVElemClass doesn't seem to have such a motivation, so you'll be much better off with a simple ADT than with existential quantification.
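To make that restriction concrete, here is a small sketch of the existential version. The class VElem and its single method render are hypothetical stand-ins for YourVElemClass, and lists again replace Data.Vector so that only base is needed.

```haskell
{-# LANGUAGE GADTs, FlexibleInstances #-}

-- Hypothetical class: the one operation every element type must support.
class VElem e where
  render :: e -> String

instance VElem Int where
  render = show

instance VElem Char where
  render c = [c]

instance VElem String where  -- needs FlexibleInstances
  render = id

-- The element type e does not appear in the result type, so it is hidden.
data PolyVector' where
  PolyVector :: VElem e => [[e]] -> PolyVector'

-- After unpacking, all we know about the elements is that they are VElem,
-- so render is the only thing we can do with them.
renderRows :: PolyVector' -> [String]
renderRows (PolyVector rows) = map (unwords . map render) rows
```

A caller can never get the original [[Int]] back out of a PolyVector'; anything it wants to do with the elements has to be a method of the class.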

There's a famous rant against existentials by Luke Palmer (note he uses another syntax, existential-specific, which I consider obsolete, as GADTs are strictly more general).

Easy, use a sum type!

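-- The third constructor is named TextStringVector so that it matches parse below
-- and does not clash with the TextString constructor of VectorType.
data ParsedVector = NumberVector (Vector [Int])
                  | CharacterVector (Vector [Char])
                  | TextStringVector (Vector [String])
                  deriving (Show)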

parse :: [Char] -> ParsedVector
parse cs = case classify cs of
  Number     -> NumberVector $ toVectorNumber cs
  Character  -> CharacterVector $ toVectorChar cs
  TextString -> TextStringVector $ toVectorString cs
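For experimenting without CorePrelude, split or vector installed, the same idea can be sketched with base alone. This is a simplified stand-in rather than the code from the question: ParsedRows, parseRows and cells are hypothetical names, lists replace Vector, and each cell is assumed to be a valid Haskell literal with no embedded whitespace, so that words and read can do the parsing.

```haskell
import Data.Char (isAlpha)

-- Lists stand in for Data.Vector (an assumption, to stay within base).
data ParsedRows
  = NumberRows [[Int]]
  | CharacterRows [String]
  | TextRows [[String]]
  deriving (Show, Eq)

-- Split into rows and cells, then read each cell as a literal.
-- The element type is fixed by whichever constructor wraps the result.
cells :: Read a => String -> [[a]]
cells = map (map read . words) . lines

-- Same classification rule as in the question: a double quote means strings,
-- any letter means characters, and everything else is treated as numbers.
parseRows :: String -> ParsedRows
parseRows s
  | '"' `elem` s  = TextRows (cells s)
  | any isAlpha s = CharacterRows (cells s)
  | otherwise     = NumberRows (cells s)
```

Note that cells is polymorphic in its return type, but the choice of type is made statically in each branch of parseRows; the string's contents only select the branch.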
