Parsing Tcl arrays in Ruby with Treetop
I have a bunch of data in (what I think is) a Tcl array. Basically it's in the form of {a {bc} d {ef} g}. It's only nested one level deep, but isn't always nested; that is to say, a may just be a, or it may be {aa bb}, or possibly {}, but never {aa {bb cc}}. I want to extract this array so I can use it in Ruby.
My first thought was, "No problem, I'll write a little grammar to parse this." I installed the treetop gem and wrote up a parser, which seemed to work just fine. I started having problems when I tried to extract an array from the parsed tree. I would like to better understand the cause of the problems and what I am doing wrong.
Here is my parser code so far (tcl_array.treetop):
grammar TCLArray
  rule array
    "{" [\s]* "}" {
      def content
        []
      end
    }
    /
    "{" [\s]* array_element_list [\s]* "}" {
      def content
        array_element_list.content
      end
    }
  end

  rule array_element_list
    array_element {
      def content
        [array_element.content]
      end
    }
    /
    array_element [\s]+ array_element_list {
      def content
        [array_element.content] + array_element_list.content
      end
    }
  end

  rule array_element
    [^{}\s]+ {
      def content
        return text_value
      end
    }
    /
    array {
      def content
        array.content
      end
    }
  end
end
Invoking p.parse("{a}").content yields:
tcl_array.rb:99:in 'content': undefined local variable or method 'array_element'
The error comes from the first alternative in array_element_list (a lone array_element): array_element is reported as an undefined local variable, even though, according to the Treetop documentation, accessor methods are supposed to be defined automatically.
Earlier, I tried a solution based on a grammar with fewer but slightly more complicated rules:
grammar TCLArray
  rule array
    "{" ([\s]* array_element ([\s]+ array_element)* )? [\s]* "}"
  end

  rule array_element
    [^{}\s]+ / array
  end
end
But with this grammar I had issues where the parser seemed to be creating several different expressions for the array rule, even though it did not use any alternative expressions (/). The result was that I couldn't figure out how to access the various bits of the array rule to return them as a Ruby array.
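A possible explanation for the undefined array_element in the first grammar, offered as an assumption about how Treetop behaves rather than something established in this thread: accessor methods appear to be generated only for nonterminals inside a sequence. When an alternative consists of a single nonterminal, the inline block seems to be mixed into the node that nonterminal already returned, so there is no array_element method to call, and the node's existing content would be reached with super instead. The plain-Ruby sketch below mimics that situation with Object#extend; ElementNode, BrokenListContent and FixedListContent are hypothetical stand-ins, not Treetop classes.

```ruby
# Stand-in for the node that the array_element rule returns (hypothetical).
class ElementNode
  def initialize(text)
    @text = text
  end

  def content
    @text
  end
end

# Mimics the inline block on a lone nonterminal: no accessor exists here.
module BrokenListContent
  def content
    [array_element.content]
  end
end

# Same block rewritten to reach the node's own content through super.
module FixedListContent
  def content
    [super]
  end
end

broken = ElementNode.new('a').extend(BrokenListContent)
fixed  = ElementNode.new('a').extend(FixedListContent)

begin
  broken.content
rescue NameError => e
  puts "broken: #{e.class}"   # array_element is not defined on the node
end

p fixed.content               # prints ["a"]
```

If the assumption holds, the same change would apply to the lone array alternative in array_element, which calls array.content in the same way.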
Maybe a parser generator is overkill in this case. Here's a simple hand-rolled recursive-descent parser based on this JSON parser by James Edward Gray II:
#!/usr/bin/env ruby
# based on James Edward Gray II's solution to the Parsing JSON
# Ruby Quiz #155: <http://RubyQuiz.Com/quiz155.html>

require 'strscan'

class TclArrayParser < StringScanner
  def parse
    parse_value
  ensure
    eos? or error "Unexpected data: '#{rest}'"
  end

  private

  def parse_value
    trim_space
    parse_string or parse_array
  ensure
    trim_space
  end

  def parse_array
    return nil unless scan(/\{\s*/)
    array = []
    while contents = parse_value
      array << contents
    end
    scan(/\}/) or error('Unclosed array')
    array
  end

  def parse_string
    scan(/[^{}[:space:]]+/)
  end

  def trim_space
    skip(/\s*/)
  end

  def error(message)
    pos = if eos? then 'end of input' else "position #{self.pos}" end
    raise ParseError, "#{message} at #{pos}"
  end

  class ParseError < StandardError; end
end
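The parser above leans on a handful of StringScanner methods. As a quick illustration (the input string and variable names are made up for this sketch), this is how scan, skip and rest behave on a small input:

```ruby
require 'strscan'

scanner = StringScanner.new('{a  b}')

opened    = scanner.scan(/\{\s*/)            # matches at the current position and advances
word      = scanner.scan(/[^{}[:space:]]+/)  # reads one bare element
missed    = scanner.scan(/\}/)               # nil: next character is a space, position unchanged
skipped   = scanner.skip(/\s*/)              # returns the number of characters skipped
remaining = scanner.rest                     # the unscanned remainder of the input

p [opened, word, missed, skipped, remaining]
```

Because scan returns nil without moving when the pattern does not match, expressions such as parse_string or parse_array can try one alternative after another without any explicit backtracking.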
Here's a test suite:
require 'test/unit'

class TestTclArrayParser < Test::Unit::TestCase
  def test_that_an_empty_string_parses_to_nil
    assert_nil TclArrayParser.new('').parse
  end

  def test_that_a_whitespace_string_parses_to_nil
    assert_nil TclArrayParser.new(" \t \n ").parse
  end

  def test_that_an_empty_array_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new('{}').parse
  end

  def test_that_an_empty_array_with_whitespace_at_the_front_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new(' {}').parse
  end

  def test_that_an_empty_array_with_whitespace_at_the_end_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new('{} ').parse
  end

  def test_that_an_empty_array_with_whitespace_inside_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new('{ }').parse
  end

  def test_that_an_empty_array_surrounded_by_whitespace_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new(' {} ').parse
  end

  def test_that_an_empty_array_with_whitespace_at_the_front_and_inside_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new(' { }').parse
  end

  def test_that_an_empty_array_with_whitespace_at_the_end_and_inside_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new('{ } ').parse
  end

  def test_that_an_empty_array_surrounded_by_whitespace_with_whitespace_inside_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new(' { } ').parse
  end

  def test_that_a_sole_element_parses
    assert_equal 'a', TclArrayParser.new('a').parse
  end

  def test_that_an_array_with_one_element_parses
    assert_equal ['a'], TclArrayParser.new('{a}').parse
  end

  def test_that_a_nested_array_parses
    assert_equal [[]], TclArrayParser.new('{{}}').parse
  end

  def test_that_a_nested_array_with_one_element_parses
    assert_equal [['a']], TclArrayParser.new('{{a}}').parse
  end

  def test_that_whitespace_is_ignored
    assert_equal [], TclArrayParser.new(' { } ').parse
  end

  def test_that_complex_arrays_parse_correctly
    assert_equal ['a', %w[b c], 'd', %w[e f], 'g'], TclArrayParser.new('{a {b c} d {e f} g}').parse
    assert_equal [%w[aa bb], %w[b c], 'd', %w[e f], 'g'], TclArrayParser.new('{{aa bb} {b c} d {e f} g}').parse
    assert_equal [[], %w[b c], 'd', %w[e f], 'g'], TclArrayParser.new('{{} {b c} d {e f} g}').parse
    assert_equal [[], ['b', 'c'], 'd', ['e', 'f'], 'g'], TclArrayParser.new("\n{\n{\n}\n{\nb\nc\n}\nd\n{\ne\nf\n}\ng\n}\n").parse
  end
end
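For comparison, the same brace-delimited format can also be handled with a flat token scan and an explicit stack instead of recursion. This is a different technique from the recursive-descent parser above, sketched under the assumption that elements never contain braces or whitespace; parse_tcl_list is a hypothetical helper name, and unlike TclArrayParser#parse it returns an array of all top-level values.

```ruby
# Hypothetical helper, not part of the answer above: turns a brace-delimited
# list into nested Ruby arrays using a token scan and an explicit stack.
def parse_tcl_list(source)
  tokens = source.scan(/[{}]|[^{}\s]+/)  # braces and bare words, whitespace dropped
  stack = [[]]                           # stack.first collects the top-level values

  tokens.each do |token|
    case token
    when '{'
      stack.push([])                     # start a new nested list
    when '}'
      raise ArgumentError, 'Unbalanced closing brace' if stack.size == 1
      finished = stack.pop
      stack.last << finished             # attach the finished list to its parent
    else
      stack.last << token                # bare element
    end
  end

  raise ArgumentError, 'Unclosed brace' unless stack.size == 1
  stack.first
end

p parse_tcl_list('{a {b c} d {e f} g}').first
# prints ["a", ["b", "c"], "d", ["e", "f"], "g"]
```

The explicit stack makes unbalanced braces easy to detect in either direction, at the cost of losing the position information that the StringScanner version can report in its error messages.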
Noting this for reference, but I've just released a gem for parsing simple TCL:
https://github.com/julik/tickly