How do you parse an Intel Hex Record with applicative functors using the Haskell parsec library?
I would like to parse an Intel Hex Record with parsec using the applicative functor style. A typical record looks like the following:
:10010000214601360121470136007EFE09D2190140
The first character is always ':', the next two characters are a hex string representing the number of bytes in the record. The next four characters are a hex string identifying the start address of the data. I had code like the following, but I don't know how to applicatively pass the byte count to the parser that parses the data bytes. My non-working code looks like the following.
line = startOfRecord . byteCount . address . recordType . recordData . checksum
startOfRecord = char ':'
byteCount = toHexValue <$> count 2 hexDigit
address = toHexValue <$> count 4 hexDigit
recordType = toHexValue <$> count 2 hexDigit
recordData c = toHexValue <$> count c hexDigit
recordData c CharParser = count c hexDigit
checksum = toHexValue <$> count 2 hexDigit
toHexValue :: String -> Int
toHexValue = fst . head . readHex
Could anyone help me? Thanks.
There are a number of things not included in your question that you need in order to use parsec. To define things like startOfRecord, we need to disable the dreaded monomorphism restriction. If we want to write type signatures for anything like startOfRecord, we also need to enable FlexibleContexts. We also need to import parsec, Control.Applicative, and Numeric (readHex).
{-# LANGUAGE NoMonomorphismRestriction #-}
{-# LANGUAGE FlexibleContexts #-}
import Text.Parsec
import Control.Applicative
import Numeric (readHex)
I'm also going to use Word8 and Word16 from Data.Word since they exactly match the types used in Intel hex records.
import Data.Word
Ignoring the recordData for a moment, we can define how to read hex values for bytes (Word8) and 16-bit integer addresses (Word16).
hexWord8 :: (Stream s m Char) => ParsecT s u m Word8
hexWord8 = toHexValue <$> count 2 hexDigit
hexWord16 :: (Stream s m Char) => ParsecT s u m Word16
hexWord16 = toHexValue <$> count 4 hexDigit
toHexValue :: (Num a, Eq a) => String -> a
toHexValue = fst . head . readHex
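As a quick check of these two parsers, here is a minimal sketch that runs them on plain strings with parsec's parse function. The helper names parseByte and parseAddress are hypothetical and not part of the original answer. Using head in toHexValue is safe here only because count and hexDigit guarantee that the string passed to readHex consists of hex digits.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import Data.Word (Word8, Word16)
import Numeric (readHex)
import Text.Parsec

-- Same definitions as in the answer above.
toHexValue :: (Num a, Eq a) => String -> a
toHexValue = fst . head . readHex

hexWord8 :: Stream s m Char => ParsecT s u m Word8
hexWord8 = toHexValue <$> count 2 hexDigit

hexWord16 :: Stream s m Char => ParsecT s u m Word16
hexWord16 = toHexValue <$> count 4 hexDigit

-- Hypothetical helpers: run each parser on a String input.
parseByte :: String -> Either ParseError Word8
parseByte = parse hexWord8 "<example>"

parseAddress :: String -> Either ParseError Word16
parseAddress = parse hexWord16 "<example>"
```

With these definitions, parseByte "1F" gives Right 31 and parseAddress "0100" gives Right 256, while parseByte "1G" gives a Left carrying a parse error because 'G' is not a hex digit.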
This lets us define all of the pieces except for recordData.
startOfRecord = char ':'
byteCount = hexWord8
address = hexWord16
recordType = hexWord8
checksum = hexWord8
Leaving out recordData, we can now write something like your line in Applicative style. Application in Applicative style is written as <*> (the . operator is function composition, or composition in a Category).
line = _ <$> startOfRecord <*> byteCount <*> address <*> recordType <*> checksum
The compiler will tell us about the type of the hole _. It says
Found hole `_'
with type: Char -> Word8 -> Word16 -> Word8 -> Word8 -> b
If we had a function with that type, we could use it here and make a ParsecT that reads something like a record, but still missing the recordData. We'll make a data type to hold all of an Intel hex record except for the actual data.
data IntelHexRecord = IntelHexRecord Word8 Word16 Word8 {- [Word8] -} Word8
If we drop this into line (with const to discard the startOfRecord)
line = const IntelHexRecord <$> startOfRecord <*> byteCount <*> address <*> recordType <*> checksum
the compiler will tell us that the type of line is a parser for our pseudo-IntelHexRecord.
*> :t line
line :: Stream s m Char => ParsecT s u m IntelHexRecord
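A small variation that is not in the original answer: the <$ operator (exported by the Prelude) replaces the result of a functor with a constant, so IntelHexRecord <$ startOfRecord means the same as const IntelHexRecord <$> startOfRecord. The sketch below uses a concrete Parsec String () type to stay short; hexValue and lineNoData are hypothetical names, and the record type is the data-less version from above.

```haskell
import Data.Word (Word8, Word16)
import Numeric (readHex)
import Text.Parsec

-- The pseudo-record from the text: everything except the data bytes.
data IntelHexRecord = IntelHexRecord Word8 Word16 Word8 Word8
  deriving (Eq, Show)

-- Hypothetical helper: read exactly n hex digits as a number.
hexValue :: (Num a, Eq a) => Int -> Parsec String () a
hexValue n = fst . head . readHex <$> count n hexDigit

-- '<$' runs the ':' parser, drops its Char result, and keeps the constructor.
lineNoData :: Parsec String () IntelHexRecord
lineNoData =
  IntelHexRecord <$ char ':'
                 <*> hexValue 2   -- byte count
                 <*> hexValue 4   -- address
                 <*> hexValue 2   -- record type
                 <*> hexValue 2   -- checksum
```

For example, parse lineNoData "" ":1001000040" gives Right (IntelHexRecord 16 256 0 64).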
This is as far as we can go with Applicative style. Let's define how to read the recordData assuming we already somehow know the byteCount.
recordData :: (Stream s m Char) => Word8 -> ParsecT s u m [Word8]
recordData c = count (fromIntegral c) hexWord8
We'll also modify IntelHexRecord to have a place to hold the data.
data IntelHexRecord = IntelHexRecord Word8 Word16 Word8 [Word8] Word8
If you have an Applicative f, there's no way, in general, to choose the structure based on the contents. That's the big difference between an Applicative and a Monad; a Monad's bind, (>>=) :: forall a b. m a -> (a -> m b) -> m b, allows you to choose the structure based on the contents. This is exactly what we need to do to determine how to read the recordData based on the result we obtained earlier by reading the byteCount.
The easiest way to use one bind >>= in the definition of line is to switch entirely to monadic style and do-notation.
line = do
    startOfRecord
    bc <- byteCount
    addr <- address
    rt <- recordType
    rd <- recordData bc
    cs <- checksum
    return $ IntelHexRecord bc addr rt rd cs
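To try the monadic parser end to end, here is a self-contained sketch under a few assumptions: it fixes the input type to String via Text.Parsec.String, requires the whole input to be one record (eof), and adds a checksum test that the original answer does not include. The names parseRecord and checksumOk are hypothetical. The checksum rule used is the Intel HEX one: the low byte of the sum of all record bytes, checksum included, is zero.

```haskell
import Data.Word (Word8, Word16)
import Numeric (readHex)
import Text.Parsec
import Text.Parsec.String (Parser)

data IntelHexRecord = IntelHexRecord Word8 Word16 Word8 [Word8] Word8
  deriving (Eq, Show)

hexWord8 :: Parser Word8
hexWord8 = fst . head . readHex <$> count 2 hexDigit

hexWord16 :: Parser Word16
hexWord16 = fst . head . readHex <$> count 4 hexDigit

line :: Parser IntelHexRecord
line = do
  _    <- char ':'
  bc   <- hexWord8
  addr <- hexWord16
  rt   <- hexWord8
  rd   <- count (fromIntegral bc) hexWord8  -- length depends on the parsed byte count
  cs   <- hexWord8
  return (IntelHexRecord bc addr rt rd cs)

-- Hypothetical wrapper: parse exactly one record and nothing else.
parseRecord :: String -> Either ParseError IntelHexRecord
parseRecord = parse (line <* eof) "<record>"

-- Hypothetical check: Word8 arithmetic wraps modulo 256, so a valid
-- record sums to 0 once the checksum byte is included.
checksumOk :: IntelHexRecord -> Bool
checksumOk (IntelHexRecord bc addr rt rd cs) =
  sum (bc : hi : lo : rt : cs : rd) == 0
  where
    hi = fromIntegral (addr `div` 256) :: Word8  -- high address byte
    lo = fromIntegral addr :: Word8              -- low address byte (truncating)
```

Applied to the record from the question, parseRecord ":10010000214601360121470136007EFE09D2190140" returns a record with byte count 16, address 256, record type 0, sixteen data bytes, and checksum 64 (0x40), and checksumOk returns True for it. A truncated record fails to parse because count cannot read all sixteen data bytes.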
As far as my understanding goes, the limitation of Applicative Parsers (compared with Monadic Parsers) is that you are limited to parsing context-free expressions.
By this I mean that decisions about how to parse at a certain point cannot depend on values parsed before, only on the structure (i.e. a parser failed, so we try to apply a different one).
I find that this can be explained from the operators themselves:
(<*>) :: Applicative f => f (a -> b) -> f a -> f b
(>>=) :: Monad m => m a -> (a -> m b) -> m b
For <*> you can see that everything takes place at the level of the values 'contained in' the Applicative, whereas for >>= the value can be used to influence the containing structure. This is precisely what makes Monads more powerful than Applicatives.
For your problem this means that you need to use a monadic parser to stick all the individual pieces together, approximately like this:
parseRecord = do
    count <- byteCount
    ...
    rData <- recordData count
    ...
    return (count,rData,...)
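The monadic step does not have to take over the whole parser. As a sketch that goes beyond what either answer spells out, a single >>= can hand the byte count to a continuation that is otherwise written in Applicative style. The names hexByte, hexAddr and recordOneBind are hypothetical, the input type is fixed to String, and the tuple layout (count, address, type, data, checksum) is an assumption chosen for illustration.

```haskell
import Data.Word (Word8, Word16)
import Numeric (readHex)
import Text.Parsec
import Text.Parsec.String (Parser)

hexByte :: Parser Word8
hexByte = fst . head . readHex <$> count 2 hexDigit

hexAddr :: Parser Word16
hexAddr = fst . head . readHex <$> count 4 hexDigit

-- One bind passes the byte count n to the rest of the parser;
-- everything inside 'rest' is plain Applicative style.
recordOneBind :: Parser (Word8, Word16, Word8, [Word8], Word8)
recordOneBind = char ':' *> (hexByte >>= rest)
  where
    rest n = (,,,,) n <$> hexAddr
                      <*> hexByte
                      <*> count (fromIntegral n) hexByte
                      <*> hexByte
```

For example, parse recordOneBind "" ":0300300002337A1E" gives Right (3,48,0,[2,51,122],30). The bind is still required: the number of data bytes to read is only known after the byte count has been parsed.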