How do you parse an Intel Hex Record with applicative functors using the haskell parsec library?

I would like to parse an Intel Hex Record with parsec using the applicative functor style. A typical record looks like the following:

:10010000214601360121470136007EFE09D2190140

The first character is always ':', and the next two characters are a hex string representing the number of bytes in the record. The next four characters are a hex string identifying the start address of the data. I don't know how to pass the byte count, in applicative style, to the parser that parses the data bytes. My non-working code looks like the following.

line = startOfRecord . byteCount . address . recordType . recordData . checksum
startOfRecord = char ':'
byteCount = toHexValue <$> count 2 hexDigit
address = toHexValue <$> count 4 hexDigit
recordType = toHexValue <$> count 2 hexDigit
recordData c = toHexValue <$> count c hexDigit
recordData c CharParser = count c hexDigit
checksum = toHexValue <$> count 2 hexDigit

toHexValue :: String -> Int
toHexValue = fst . head . readHex

Could anyone help me? Thanks.

There are a number of things not included in your question that you need in order to use parsec. To define things like startOfRecord without type signatures, we need to disable the dreaded monomorphism restriction. If we want to write type signatures for anything like startOfRecord, we also need to enable FlexibleContexts. We also need to import parsec, Control.Applicative, and Numeric (readHex).

{-# LANGUAGE NoMonomorphismRestriction #-}
{-# LANGUAGE FlexibleContexts #-}

import Text.Parsec
import Control.Applicative
import Numeric (readHex)

I'm also going to use Word8 and Word16 from Data.Word, since they exactly match the types used in Intel hex records.

import Data.Word

Ignoring recordData for a moment, we can define how to read hex values for bytes (Word8) and 16-bit integer addresses (Word16).

hexWord8 :: (Stream s m Char) => ParsecT s u m Word8
hexWord8 = toHexValue <$> count 2 hexDigit

hexWord16 :: (Stream s m Char) => ParsecT s u m Word16
hexWord16 = toHexValue <$> count 4 hexDigit

toHexValue :: (Num a, Eq a) => String -> a
toHexValue = fst . head . readHex
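
As a quick check of these two parsers, here is a minimal sketch that runs them on plain strings. The helpers parseByte and parseAddr are hypothetical names introduced only for this illustration; the other definitions are repeated from above so the snippet stands alone.

```haskell
{-# LANGUAGE FlexibleContexts #-}

import Text.Parsec
import Data.Word (Word8, Word16)
import Numeric (readHex)

-- Repeated from the answer so the example is self-contained.
-- head is safe here only because hexDigit guarantees readHex succeeds.
toHexValue :: (Num a, Eq a) => String -> a
toHexValue = fst . head . readHex

hexWord8 :: Stream s m Char => ParsecT s u m Word8
hexWord8 = toHexValue <$> count 2 hexDigit

hexWord16 :: Stream s m Char => ParsecT s u m Word16
hexWord16 = toHexValue <$> count 4 hexDigit

-- Hypothetical helpers (not part of the answer): run a parser on a String
-- and require that all of the input is consumed.
parseByte :: String -> Either ParseError Word8
parseByte = parse (hexWord8 <* eof) "byte"

parseAddr :: String -> Either ParseError Word16
parseAddr = parse (hexWord16 <* eof) "address"
```

In GHCi, parseByte "1F" should give Right 31 and parseAddr "0100" should give Right 256, while parseByte "1" fails because count 2 insists on two hex digits.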

This lets us define all of the pieces except for recordData.

startOfRecord = char ':'
byteCount = hexWord8
address = hexWord16
recordType = hexWord8
checksum = hexWord8

Leaving out recordData, we can now write something like your line in Applicative style. Application in Applicative style is written as <*> (. is function composition, or composition in a Category).

line = _ <$> startOfRecord <*> byteCount <*> address <*> recordType <*> checksum

The compiler will tell us about the type of the hole _. It says:

    Found hole `_'
      with type: Char -> Word8 -> Word16 -> Word8 -> Word8 -> b

If we had a function with that type, we could use it here and make a ParsecT that reads something like a record, but is still missing the recordData. We'll make a data type to hold all of an Intel hex record except for the actual data.

data IntelHexRecord = IntelHexRecord Word8 Word16 Word8 {- [Word8] -} Word8

If we drop this into line (with const to discard the startOfRecord)

line = const IntelHexRecord <$> startOfRecord <*> byteCount <*> address <*> recordType <*> checksum

the compiler will tell us that the type of line is a parser for our pseudo-IntelHexRecord.

*> :t line
line :: Stream s m Char => ParsecT s u m IntelHexRecord

This is as far as we can go with Applicative style. Let's define how to read the recordData, assuming we already somehow know the byteCount.

recordData :: (Stream s m Char) => Word8 -> ParsecT s u m [Word8]
recordData c = count (fromIntegral c) hexWord8
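
To see that the argument really controls how much input is consumed, here is a small sketch. dataAndRest is a hypothetical helper introduced for illustration; it returns the parsed bytes together with whatever input is left over.

```haskell
{-# LANGUAGE FlexibleContexts #-}

import Text.Parsec
import Data.Word (Word8)
import Numeric (readHex)

-- Repeated from the answer so the example is self-contained.
hexWord8 :: Stream s m Char => ParsecT s u m Word8
hexWord8 = fst . head . readHex <$> count 2 hexDigit

recordData :: Stream s m Char => Word8 -> ParsecT s u m [Word8]
recordData c = count (fromIntegral c) hexWord8

-- Hypothetical helper: parse exactly n data bytes, then report the
-- remaining unparsed input alongside them.
dataAndRest :: Word8 -> String -> Either ParseError ([Word8], String)
dataAndRest n = parse ((,) <$> recordData n <*> getInput) "data"
```

For example, dataAndRest 2 "0A0B0C" should yield Right ([10,11],"0C"), and dataAndRest 3 "0A0B" fails because only two bytes are available.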

We'll also modify IntelHexRecord to have a place to hold the data.

data IntelHexRecord = IntelHexRecord Word8 Word16 Word8 [Word8] Word8

If you have an Applicative f, there's no way, in general, to choose the structure based on the contents. That's the big difference between an Applicative and a Monad; a Monad's bind, (>>=) :: forall a b. m a -> (a -> m b) -> m b, allows you to choose the structure based on the contents. This is exactly what we need in order to determine how to read the recordData based on the result we obtained earlier by reading the byteCount.

The easiest way to use one bind >>= in the definition of line is to switch entirely to Monadic style and do-notation.

line = do
    startOfRecord
    bc   <- byteCount
    addr <- address
    rt   <- recordType
    rd   <- recordData bc
    cs   <- checksum
    return $ IntelHexRecord bc addr rt rd cs
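
If you would rather keep as much Applicative style as possible, a single explicit bind for the byte count is enough. The following is only a sketch under a few assumptions: the parser is named record (a name chosen here to avoid clashing with line), Show and Eq are derived purely for testing, and the checksum is read but not validated.

```haskell
{-# LANGUAGE FlexibleContexts #-}

import Text.Parsec
import Data.Word (Word8, Word16)
import Numeric (readHex)

-- Fields: byte count, address, record type, data bytes, checksum.
data IntelHexRecord = IntelHexRecord Word8 Word16 Word8 [Word8] Word8
  deriving (Show, Eq)

toHexValue :: (Num a, Eq a) => String -> a
toHexValue = fst . head . readHex

hexWord8 :: Stream s m Char => ParsecT s u m Word8
hexWord8 = toHexValue <$> count 2 hexDigit

hexWord16 :: Stream s m Char => ParsecT s u m Word16
hexWord16 = toHexValue <$> count 4 hexDigit

recordData :: Stream s m Char => Word8 -> ParsecT s u m [Word8]
recordData c = count (fromIntegral c) hexWord8

-- One bind captures the byte count; everything after it is Applicative.
record :: Stream s m Char => ParsecT s u m IntelHexRecord
record = char ':' *> hexWord8 >>= \bc ->
  IntelHexRecord bc <$> hexWord16 <*> hexWord8 <*> recordData bc <*> hexWord8

-- The sample record from the question.
sample :: String
sample = ":10010000214601360121470136007EFE09D2190140"

parsedSample :: Either ParseError IntelHexRecord
parsedSample = parse (record <* eof) "sample" sample
```

Evaluating parsedSample should give a record with byte count 16, address 256 (0x0100), record type 0, sixteen data bytes, and checksum 64 (0x40).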

As far as my understanding goes, the limitation of Applicative parsers (compared with Monadic parsers) is that you are limited to parsing context-free expressions.

By this I mean that decisions about how to parse at a certain point cannot depend on values parsed before, only on the structure (i.e. a parser failed, so we try to apply a different one).

I find that this can be explained from the operators themselves:

(<*>) :: Applicative f => f (a -> b) -> f a -> f b
(>>=) :: Monad m => m a -> (a -> m b) -> m b

For <*> you can see that everything takes place at the level of the values 'contained in' the Applicative, whereas for >>= the value can be used to influence the containing structure. This is precisely what makes Monads more powerful than Applicatives.
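
A toy illustration of that difference, separate from the Intel Hex format: the two parsers below are hypothetical examples, not taken from the question. The first has a shape fixed in advance; the second lets a parsed digit decide how many letters to read.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)
import Data.Char (digitToInt)

-- Applicative: the shape (one digit, then exactly two letters) is fixed
-- before any input is read; the digit's value only ends up in the result.
fixedShape :: Parser (Int, String)
fixedShape = (,) <$> (digitToInt <$> digit) <*> count 2 letter

-- Monadic: the digit's value chooses which parser runs next.
lengthPrefixed :: Parser String
lengthPrefixed = digit >>= \d -> count (digitToInt d) letter
```

On the input "3abcd", fixedShape should return (3,"ab") while lengthPrefixed returns "abc"; on "1abcd", lengthPrefixed returns just "a".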

For your problem this means that you need to use a monadic parser to stick all the individual pieces together, approximately like this:

parseRecord = do
  count <- byteCount
  ...
  rData <- recordData count
  ...
  return (count,rData,...)
