Haskell/Parsec: how do I use Text.Parsec.Token with Text.Parsec.Indent (from the indents package)?
The indents package for Haskell's Parsec provides a way to parse indentation-style languages (like Haskell and Python). It redefines the Parser type, so how do you use the token parser functions exported by Parsec's Text.Parsec.Token module, which are of the normal Parser type?
IndentParser 0.2.1 is an old package providing the two modules Text.ParserCombinators.Parsec.IndentParser and Text.ParserCombinators.Parsec.IndentParser.Token.
indents 0.3.3 is a newer package providing the single module Text.Parsec.Indent.
Parsec comes with a load of modules. Most of them export a bunch of useful parsers (e.g. newline from Text.Parsec.Char, which parses a newline) or parser combinators (e.g. count n p from Text.Parsec.Combinator, which runs the parser p, n times).
However, the module Text.Parsec.Token would like to export functions which are parametrized by the user with features of the language being parsed, so that, for example, the braces p function will run the parser p after parsing a '{' and before parsing a '}', ignoring things like comments, the syntax of which depends on your language.
The way Text.Parsec.Token achieves this is by exporting a single function, makeTokenParser. You call it with the parameters of your specific language (such as what a comment looks like) and it returns a record containing all of the functions in Text.Parsec.Token, adapted to your language as specified.
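For concreteness, here is a minimal sketch of that pattern in plain Parsec, with no indentation handling yet. The `#` comment syntax, the reserved word, and the names `lexer` and `block` are made up for illustration and are not part of any real language definition.

```haskell
import Text.Parsec (eof, parse)
import Text.Parsec.Language (emptyDef)
import Text.Parsec.String (Parser)
import qualified Text.Parsec.Token as P

-- Describe the language (hypothetical parameters), then let
-- makeTokenParser build the record of token parsers.
lexer :: P.TokenParser ()
lexer = P.makeTokenParser emptyDef
  { P.commentLine   = "#"
  , P.reservedNames = ["something"]
  }

-- braces, reserved and identifier are fields of the returned record,
-- so each one is applied to the lexer.
block :: Parser String
block =
  P.whiteSpace lexer
    *> P.braces lexer (P.reserved lexer "something" *> P.identifier lexer)
    <* eof

main :: IO ()
main = print (parse block "<example>" "{ something foo # a comment\n}")
```

Every token parser skips trailing whitespace and comments as configured, which is why the `#` comment before the closing brace does not get in the way.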
Of course, in an indentation-style language these might need to be adapted further (this is where I'm not sure; I'll explain in a moment), so I note that the (presumably obsolete) IndentParser package provides a module Text.ParserCombinators.Parsec.IndentParser.Token, which looks to be a drop-in replacement for Text.Parsec.Token.
I should mention at some point that all the Parsec parsers are monadic functions, so they do magic things with state so that error messages can say at what line and column in the source file the error appeared.
For a couple of small reasons it appears to me that the indents package is more or less the current version of IndentParser. However, it does not provide a module that looks like Text.ParserCombinators.Parsec.IndentParser.Token; it only provides Text.Parsec.Indent. So I am wondering how one goes about getting all the token parsers from Text.Parsec.Token (like reserved "something", which parses the reserved keyword "something", or like braces, which I mentioned earlier).
It would appear to me that (the new) Text.Parsec.Indent uses some sort of monadic state magic to work out which column each bit of source code is in, so that it doesn't need to modify token parsers like whiteSpace from Text.Parsec.Token, which is probably why it doesn't provide a replacement module. But I am having a problem with types.
You see, without Text.Parsec.Indent, all my parsers are of type Parser Something, where Something is the return type and Parser is a type alias defined in Text.Parsec.String as
type Parser = Parsec String ()
but with Text.Parsec.Indent, instead of importing Text.Parsec.String, I use my own definition
type Parser a = IndentParser String () a
which makes all my parsers of type IndentParser String () Something, where IndentParser is defined in Text.Parsec.Indent. But the token parsers that I'm getting from makeTokenParser in Text.Parsec.Token are of the wrong type.
If this isn't making much sense by now, it's because I'm a bit lost. The type issue is discussed a bit here.
The error arises like this: I replaced the first definition of Parser above with the second, but when I then try to use one of the token parsers from Text.Parsec.Token, I get the compile error
Couldn't match expected type `Control.Monad.Trans.State.Lazy.State
Text.Parsec.Pos.SourcePos'
with actual type `Data.Functor.Identity.Identity'
Expected type: P.GenTokenParser
String
()
(Control.Monad.Trans.State.Lazy.State Text.Parsec.Pos.SourcePos)
Actual type: P.TokenParser ()
Sadly, neither of the examples above uses token parsers like those in Text.Parsec.Token.
It sounds like you want to have your parsers defined everywhere as being of type
Parser Something
(where Something is the return type) and to make this work by hiding and redefining the Parser type which is normally imported from Text.Parsec.String or similar. You still need an instance-only import of Text.Parsec.String, so that the Stream instance for String is in scope; do this with the line:
import Text.Parsec.String ()
Your definition of Parser is correct. Alternatively and equivalently (for those following the chat in the comments) you can use
import Control.Monad.State
import Text.Parsec (ParsecT)
import Text.Parsec.Pos (SourcePos)
type Parser = ParsecT String () (State SourcePos)
and possibly do away with the import Text.Parsec.Indent (IndentParser) in the file in which this definition appears.
Your problem is that you're looking at the wrong part of the compiler error message. You're focusing on
Couldn't match expected type `State SourcePos' with actual type `Identity'
when you should be focusing on
Expected type: P.GenTokenParser ...
Actual type: P.TokenParser ...
When you "import" parsers from Text.Parsec.Token, what you actually do (as you briefly mentioned) is first define a record of your language parameters and then pass it to the function makeTokenParser, which returns a record containing the token parsers.
You must therefore have some lines that look something like this:
import Text.Parsec.Language (haskellStyle)
import qualified Text.Parsec.Token as P

beetleDef :: P.LanguageDef st
beetleDef =
  haskellStyle {
    -- parameters, parameters etc.
  }

lexer :: P.TokenParser ()
lexer = P.makeTokenParser beetleDef
... but a P.LanguageDef st is just a GenLanguageDef String st Identity, and a P.TokenParser () is really a GenTokenParser String () Identity.
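A quick way to convince yourself that these are only aliases is to give the same value both signatures. This is a small sketch; the names `lexerA` and `lexerB` are invented for the demonstration.

```haskell
import Data.Functor.Identity (Identity)
import Text.Parsec (parse)
import Text.Parsec.Language (haskellStyle)
import qualified Text.Parsec.Token as P

-- The short alias, as in the snippet above.
lexerA :: P.TokenParser ()
lexerA = P.makeTokenParser haskellStyle

-- The expanded form. This definition type checks only because both
-- signatures name the same type, with Identity as the underlying monad.
lexerB :: P.GenTokenParser String () Identity
lexerB = lexerA

main :: IO ()
main = print (parse (P.identifier lexerB) "<example>" "beetle")
```

The Identity in the last position is exactly what the compiler is complaining about: your Parser type has State SourcePos there instead.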
You must change your type declarations to the following:
import Control.Monad.State
import Text.Parsec.Language (haskellStyle)
import Text.Parsec.Pos (SourcePos)
import qualified Text.Parsec.Token as P

beetleDef :: P.GenLanguageDef String st (State SourcePos)
beetleDef =
  haskellStyle {
    -- parameters, parameters etc.
  }

lexer :: P.GenTokenParser String () (State SourcePos)
lexer = P.makeTokenParser beetleDef
... and that's it! This will allow your "imported" token parsers to have type ParsecT String () (State SourcePos) Something, instead of Parsec String () Something (which is an alias for ParsecT String () Identity Something), and your code should now compile.
(For maximum generality, I'm assuming that you might be defining the Parser type in a file separate from, and imported by, the file in which you define your actual parser functions. Hence the two repeated import statements.)
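To see the whole thing in one place, here is a self-contained sketch that depends only on parsec and mtl. Some assumptions to flag: the concrete language parameters, the `block` parser and the `runBeetle` helper are invented for illustration. As far as I can tell, `haskellStyle` and `emptyDef` in Text.Parsec.Language are themselves fixed to Identity, so a record update on them can only change the monad if it replaces all four character-parser fields (identStart, identLetter, opStart, opLetter); to sidestep that, the sketch builds the record directly with the LanguageDef constructor. It also runs the State SourcePos layer by hand with evalState rather than through Text.Parsec.Indent, whose run function I believe differs between versions of indents.

```haskell
import Control.Monad.State (State, evalState)
import Text.Parsec
  ( ParseError, ParsecT, alphaNum, char, eof, letter, oneOf, runParserT, (<|>) )
import Text.Parsec.Pos (SourcePos, initialPos)
import qualified Text.Parsec.Token as P

-- Same shape as IndentParser String () in the question.
type Parser = ParsecT String () (State SourcePos)

-- Hypothetical language parameters, written out in full so that the
-- character parsers live in the State SourcePos monad, not Identity.
beetleDef :: P.GenLanguageDef String () (State SourcePos)
beetleDef = P.LanguageDef
  { P.commentStart    = "{-"
  , P.commentEnd      = "-}"
  , P.commentLine     = "--"
  , P.nestedComments  = True
  , P.identStart      = letter <|> char '_'
  , P.identLetter     = alphaNum <|> oneOf "_'"
  , P.opStart         = oneOf opChars
  , P.opLetter        = oneOf opChars
  , P.reservedNames   = ["something"]
  , P.reservedOpNames = []
  , P.caseSensitive   = True
  }
  where
    opChars = ":!#$%&*+./<=>?@\\^|-~"

lexer :: P.GenTokenParser String () (State SourcePos)
lexer = P.makeTokenParser beetleDef

-- The token parsers now have type Parser, so they compose with
-- anything else written against the same alias.
block :: Parser String
block =
  P.whiteSpace lexer
    *> P.braces lexer (P.reserved lexer "something" *> P.identifier lexer)
    <* eof

-- Run the parser, then discharge the State layer with a starting position.
runBeetle :: Parser a -> String -> Either ParseError a
runBeetle p input =
  evalState (runParserT p () "<example>" input) (initialPos "<example>")

main :: IO ()
main = print (runBeetle block "{ something foo }")
```

In a real project you would presumably replace `runBeetle` with whatever run function your version of indents provides, and combine `lexer`'s token parsers with the indentation combinators from Text.Parsec.Indent.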
Many thanks to Daniel Fischer for helping me with this.