简体   繁体   中英

Cassava parsing error in haskell

Im trying to convert a csv into a vector using cassava. The csv Im trying to convert is the fischer iris data set, used for machine learning. It consists of four doubles and one string. My code is the following:

{-# LANGUAGE OverloadedStrings #-}

module Main where
import Data.Csv
import qualified Data.ByteString.Lazy as BS
import qualified Data.Vector as V

data Iris = Iris
  { sepal_length  :: !Double
  , sepal_width   :: !Double
  , petal_length  :: !Double
  , petal_width   :: !Double
  , iris_type     :: !String
 } deriving (Show, Eq, Read)

instance FromNamedRecord Iris where
  parseNamedRecord r =
    Iris
      <$> r .: "sepal_length"
      <*> r .: "sepal_width"
      <*> r .: "petal_length"
      <*> r .: "petal_width"
      <*> r .: "iris_type"

printIris :: Iris -> IO ()
printIris r  = putStrLn $  show (sepal_length r) ++ show (sepal_width r)
   ++ show(petal_length r) ++ show(petal_length r) ++ "hola"

main :: IO ()
main = do
  csvData <- BS.readFile "./iris/test-iris"
  print csvData
  case decodeByName csvData of
    Left err -> putStrLn err
    -- forM : O(n) Apply the monadic action to all elements of the vector,
    -- yielding a vector of results.
    Right (h, v) -> V.forM_ v $ printIris

When I run this, it seems as if the csvData is correctly formatted, the first lines from the print csvData return the following:

"5.1,3.5,1.4,0.2,Iris-setosa\n4.9,3.0,1.4,0.2,Iris- setosa\n4.7,3.2,1.3,0.2,Iris-setosa\n4.6,3.1,1.5,0.2,Iris-setosa\n5.0,3.6,1.4,0.2,Iris-setosa\n5.4,3.9,1.7,0.4,Iris-setosa\n4.6,3.4,1.4,0.3,Iris-setosa\n5.0,3.4,1.5,0.2,Iris-setosa\n4.4,2.9,1.4,0.2,Iris-setosa\n4.9,3.1,1.5,0.1,Iris-setosa\n5.4,3.7,1.5,0.2,Iris-setosa\n4.8,3.4,1.6,0.2,Iris-setosa\n4.8,3.0,1.4,0.1,Iris-setosa\n4.3,3.0,1.1,0.1,Iris-setosa\n5.8,4.0,1.2,0.2,Iris-setosa\n5.7,4.4,1.5,0.4,Iris-set

But I get the following error:

parse error (Failed reading: conversion error: no field named "sepal_length")  at 
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4 (truncated)

Does anybody have any idea as to why I can be getting this error? The csv has no missing values, and if I replace the line which produces the error for another row I get the same error.

It appears your data does not have a header, which is assumed by decodeByName

The data is assumed to be preceeded by a header.

Add a header, or use decode NoHeader and the FromRecord type class.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM