What's wrong with my Treetop grammar?
I have the grammar file alexa_scrape.tt:
grammar AlexaScrape
  rule document
    category_listing*
  end
  rule category_listing
    category_line url_line*
  end
  rule category_line
    category "\n"
  end
  rule category
    ("/" [^/]+)+
  end
  rule url_line
    [0-9]+ ". " url "\n"
  end
  rule url
    [^\n]*
  end
end
I have a Ruby file which attempts to make use of it:
#!/usr/bin/env ruby -I .
require 'rubygems'
require 'polyglot'
require 'treetop'
require 'alexa_scrape.tt'
parser = AlexaScrapeParser.new
p( parser.parse("") || parser.failure_reason )
p( parser.parse("/x\n") || parser.failure_reason )
But I'm not getting the results I expected:
SyntaxNode offset=0, ""
"Expected one of /, \n at line 2, column 1 (byte 4) after /x\n"
It parses the empty string properly (as the trivial match for document, zero category_listings), but fails to parse "/x\n" (as the document containing a single category_listing that itself has zero url_lines).
What am I doing wrong?
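It looks like the character class in category, [^/]+, also matches the newline, so category consumes the "\n" that category_line needs to match right after it. That fits the error message, which reports the failure at byte 4, after the whole of "/x\n" has already been consumed. Exclude whitespace (or just the newline) from the class: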
  rule category
    ("/" [^/\s]+)+  # or perhaps ("/" [^/\n]+)+
  end
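To see what the two character classes consume, here is a minimal sketch using plain Ruby regexes rather than Treetop itself. The constant names GREEDY_CATEGORY and FIXED_CATEGORY are made up for this illustration, and the regexes only approximate the category rule. A regex engine can backtrack where Treetop's PEG parser does not, so this shows only how much input each class swallows, not the parse failure itself.

```ruby
# Illustration only: plain Ruby regexes standing in for the Treetop rule.
# Both constant names are hypothetical, chosen for this sketch.
GREEDY_CATEGORY = %r{\A(?:/[^/]+)+}    # mirrors ("/" [^/]+)+
FIXED_CATEGORY  = %r{\A(?:/[^/\n]+)+}  # mirrors ("/" [^/\n]+)+

input = "/x\n"

greedy = input[GREEDY_CATEGORY]  # matched text for the original class
fixed  = input[FIXED_CATEGORY]   # matched text for the corrected class

p greedy  # the match includes the trailing newline
p fixed   # the match stops before the newline
```

With the original class the newline is part of the category match, so nothing is left for the "\n" in category_line. With the corrected class the newline stays available.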
(And, wow, a Treetop question. This is number 47 in the history of SO and its 4 million total questions. One in 87,000 SO questions is tagged Treetop.)