Problem making a BBCode parser with PEG
I am making a BBCode parser with PEG (the Citrus implementation for Ruby) and I am stuck on parsing this:
[b]sometext[anothertext[/b]
Here is the code:
grammar BBCodeParser
  rule document
    (open_tag | close_tag | new_line | text)*
  end
  rule open_tag
    ("[" tag_name "="? tag_data? "]")
  end
  rule close_tag
    ("[/" tag_name "]")
  end
  rule text
    [^\n\[\]]+
  end
  rule new_line
    ("\r\n" | "\n")
  end
  rule tag_name
    # [p|br|b|i|u|hr|code|quote|list|url|img|\*|color]
    [a-zA-Z\*]+
  end
  rule tag_data
    ([^\[\]\n])+
  end
end
The problem is with the rule text. I don't know how to say that text can contain everything except \r, \n, open_tag or close_tag. With this implementation it fails on the example because [ and ] are excluded (which is wrong).
So finally, the question is how to write a rule that matches anything except \r, \n, or an exact match of open_tag or close_tag.
If you have a solution for another PEG implementation, post it too. I can switch :)
I encountered a similar problem a while ago. There is a trick to do this:
You need to say: match open_tag, followed by everything that is not a tag or a new line, and then close_tag. This gives the following rule:
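rule tag
  # All three predicates must hold before a character is consumed, so they are
  # sequenced; joined with | at least one would always succeed.
  open_tag (!open_tag !close_tag !new_line .)+ close_tag
end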
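The same "consume a character only if no tag or newline starts here" idea can be sketched outside of Citrus with plain Ruby regular expressions and negative lookahead. This is only an illustration under assumptions: the constant names and the tokenize helper are hypothetical, and the regexes approximate the open_tag, close_tag and new_line rules from the question rather than reproduce a Citrus grammar.

```ruby
require "strscan"

# Hypothetical regex approximations of the grammar rules in the question.
OPEN_TAG  = /\[[a-zA-Z*]+=?[^\[\]\n]*\]/
CLOSE_TAG = /\[\/[a-zA-Z*]+\]/
NEW_LINE  = /\r\n|\n/
# Text: one or more characters, each checked with negative lookahead so that
# no open tag or close tag starts at it, and it is not a line break.
TEXT      = /(?:(?!#{OPEN_TAG})(?!#{CLOSE_TAG})[^\r\n])+/

# Hypothetical helper: splits input into [type, string] pairs, trying the
# alternatives in the same order as the document rule.
def tokenize(input)
  scanner = StringScanner.new(input)
  tokens = []
  until scanner.eos?
    if (s = scanner.scan(OPEN_TAG))
      tokens << [:open_tag, s]
    elsif (s = scanner.scan(CLOSE_TAG))
      tokens << [:close_tag, s]
    elsif (s = scanner.scan(NEW_LINE))
      tokens << [:new_line, s]
    elsif (s = scanner.scan(TEXT))
      tokens << [:text, s]
    else
      # A lone "\r" matches none of the above; keep it as text.
      tokens << [:text, scanner.getch]
    end
  end
  tokens
end

p tokenize("[b]sometext[anothertext[/b]")
# => [[:open_tag, "[b]"], [:text, "sometext[anothertext"], [:close_tag, "[/b]"]]
```

The stray [ before anothertext ends up inside the text token because neither tag pattern matches at that position.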
The following would parse any text and continue recursively when the [ was not the beginning of another tag:
rule text
  [^\n\[\]]+ (!open_tag text)?
end
This
rule text
  [^\n\[\]]+ (!open_tag text)?
end
ends up with a parse error.
I tried to continue with this idea and the result was ([^\n] (!open_tag | !close_tag) text*), but it fails too. It matches "sometext[anothertext[/b]".
I found a temporary solution: (!open_tag !close_tag !new_line .)
It matches just one character at a time, but it skips all open and close tags. I can join these characters together later :)
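Joining the single-character matches afterwards could look roughly like the sketch below. It assumes the parse result has already been flattened into [type, string] pairs; that token shape and the join_text helper are hypothetical and not part of the Citrus API.

```ruby
# Hypothetical helper: merges runs of adjacent :text tokens into one token
# and leaves every other token untouched.
def join_text(tokens)
  tokens
    .chunk_while { |a, b| a[0] == :text && b[0] == :text }
    .map do |run|
      # A run is either consecutive :text tokens or a single non-text token.
      run.first[0] == :text ? [:text, run.map(&:last).join] : run.first
    end
end

chars = [[:open_tag, "[b]"], [:text, "s"], [:text, "o"], [:text, "["], [:close_tag, "[/b]"]]
p join_text(chars)
# => [[:open_tag, "[b]"], [:text, "so["], [:close_tag, "[/b]"]]
```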