
Parse an XML file with Nokogiri

I need to parse an XML file in Ruby on Rails, and I'm using the nokogiri gem to do it.

I can parse it so that each parent is shown with its children, but the output looks like this:

PARENT: Example Parent 1

CHILD: Example Children 1Example Children 2Example Children 3

PARENT: Example Parent 2

CHILD:

Why are the children of the second parent node missing? If I iterate over the array with an each loop, all the children appear. This is what I did:

In the controller:

  @codes = []
    doc.xpath('//Node').each do |parent| 
       @parentN =parent.xpath('///ancestor::*/@name')


      @codes << parent.xpath('Node/@name').text

    end

And the view:

<% for x in 0...@parentN.count %>

    <p> PARENT: <%= @parentN[x]  %>  </p>

  <p> CHILD:  <%= @codes[x] %>  </p>

    <%   end %>

How can I "connect" each parent with its children, so that one parent is shown with its children, then the next parent with its children, and so on?

This is my XML file:

   <Report>
       <Node name="Example Parent 1" color="red">
          <Node name="Example Children 1" color="red" rank="very high" />
          <Node name="Example Children 2" color="red" rank="high" />
          <Node name="Example Children 3" color="yellow" rank="moderate" />
       </Node>
       <Node name="Example Parent 2" color="yellow">
          <Node name="Example Children 1" color="yellow" rank="moderate" />
       </Node>
    </Report>

Problem #1

In this line:

       @parentN =parent.xpath('///ancestor::*/@name')

you overwrite the previous value of @parentN on every iteration.

Problem #2

By running

<% for x in 0...@parentN.count %>

you take the loop bounds from @parentN, which is not an array that you built up alongside @codes: it is assigned to an object (whatever the XPath query in the last iteration returned). .count is equivalent to the last index + 1 (for an array with only [0], .count is 1), so the indices you loop over are not guaranteed to line up with the entries of @codes.
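To see concretely how the two collections drift apart, here is a minimal sketch. It is a stand-in rather than your controller: it assumes REXML (bundled with Ruby) in place of Nokogiri so that it runs without installing any gems, and the names xml_source, all_nodes, parent_nodes and codes are made up for illustration. It compares what '//Node' and 'Report/Node' match in your file.

```ruby
require 'rexml/document'

# The XML file from the question.
xml_source = <<~XML
  <Report>
    <Node name="Example Parent 1" color="red">
      <Node name="Example Children 1" color="red" rank="very high" />
      <Node name="Example Children 2" color="red" rank="high" />
      <Node name="Example Children 3" color="yellow" rank="moderate" />
    </Node>
    <Node name="Example Parent 2" color="yellow">
      <Node name="Example Children 1" color="yellow" rank="moderate" />
    </Node>
  </Report>
XML

doc = REXML::Document.new(xml_source)

# '//Node' matches every Node element, nested ones included.
all_nodes = REXML::XPath.match(doc, '//Node')
# 'Report/Node' matches only the Node elements directly under Report.
parent_nodes = doc.elements.to_a('Report/Node')

# Mirrors the question's loop: one joined string of child names per match.
codes = all_nodes.map do |node|
  node.elements.to_a('Node').map { |child| child.attributes['name'] }.join
end

puts "matches for //Node:      #{all_nodes.size}"
puts "matches for Report/Node: #{parent_nodes.size}"
puts "empty entries in codes:  #{codes.count(&:empty?)}"
```

Because '//Node' also matches the four leaf nodes, the loop pushes six entries, four of them empty strings. In document order the second match is the first child of Example Parent 1, which has no children of its own, and that is consistent with the empty CHILD line in your output.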

Recommendation (simple)

Use a single array to hold the nested values (each entry a hash of parent to children) rather than two separate variables.

# xmlController.rb
@codes = []
doc.xpath('Report/Node').each do |parent|
  # One single-pair hash per parent: parent name => array of its children
  @codes << { parent.xpath('@name') => parent.xpath('Node').map { |child| child.text } }
end



# show.html.erb

<% @codes.each do |entry| %>
  <% entry.each do |parent, children| %>
    <p> PARENT: <%= parent %> </p>
    <p> CHILDREN: <%= children.join(', ') %> </p>
  <% end %>
<% end %>

Recommendation based on comments below

The above was shown to demonstrate the simplest way to think about the problem. Now that we are ready to parse all the data in the node, we need to change our XPath and our map. doc.xpath('Report/Node') selects the parent nodes, and that can stay the same. We want the @codes key to be the actual string stored in the node, which is not parent.xpath('@name') but parent.xpath('@name')[0].value. The XPath query returns a set of matching attribute nodes and we want the first one ([0]); its .value method returns the value of the name attribute. The map needs the same treatment, because child.text is empty for the self-closing child nodes in this file.
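As a rough sketch of that change, the snippet below builds a hash of parent name to child names. It assumes REXML (bundled with Ruby) as a gem-free stand-in for Nokogiri, so REXML::XPath.match(parent, '@name')[0].value plays the role of parent.xpath('@name')[0].value; the names xml_source and codes are illustrative only.

```ruby
require 'rexml/document'

# The XML file from the question.
xml_source = <<~XML
  <Report>
    <Node name="Example Parent 1" color="red">
      <Node name="Example Children 1" color="red" rank="very high" />
      <Node name="Example Children 2" color="red" rank="high" />
      <Node name="Example Children 3" color="yellow" rank="moderate" />
    </Node>
    <Node name="Example Parent 2" color="yellow">
      <Node name="Example Children 1" color="yellow" rank="moderate" />
    </Node>
  </Report>
XML

doc = REXML::Document.new(xml_source)

codes = {}
doc.elements.each('Report/Node') do |parent|
  # The query returns a collection of attribute nodes; take the first one
  # and ask for its plain String value.
  parent_name = REXML::XPath.match(parent, '@name')[0].value

  # Read each child's name attribute instead of its (empty) text.
  codes[parent_name] = parent.elements.to_a('Node').map do |child|
    child.attributes['name']
  end
end

codes.each do |parent_name, child_names|
  puts "PARENT: #{parent_name}"
  puts "CHILDREN: #{child_names.join(', ')}"
end
```

Keying the hash by the plain string keeps each parent attached to its own children, so the view no longer needs a shared index.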

Make a class so the nodes become objects

Your parent node has a name and a color, and your children have a name, a color, and a rank. That suggests a model for Node that looks like:

class Node
  include ActiveModel::Model
  attr_accessor :name, :color, :rank, :children
end

I'm simplifying things by not using persistence here, but you may want to save your records to disk. If you do, look into the many things ActiveRecord does in the Rails Guides.

Now when we go through the XML document, we will create an array of objects rather than a hash of strings (both of which happen to be objects, but I'll leave that quandary for you to check out).

Parse the XPath to get attributes of Node objects

A quick way to set the name and color attributes of the parent looks like this:

@node = Node.new(doc.xpath('Report/Node').first.attributes.inject({}) { |attrs, value| attrs[value[0].to_sym] = value[1].value; attrs })

OK, so maybe that wasn't all that easy. What we do is take the Enumerable result of the XPath query, navigate to the first node's attributes, and build a hash that maps the attribute names (name, color, rank), converted to symbols, to their corresponding values. Once we have the hash, we pass it to our Node class's new method to instantiate (create) a node. This gives us an object that we can use:

@node.name
#=> "Example Parent 1"
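If the inject call is hard to read, this small sketch isolates it. It assumes a made-up FakeAttr struct in place of the attribute objects that Nokogiri's attributes hash holds (anything that responds to .value will do), so it runs in plain Ruby. It also shows Hash#to_h with a block (Ruby 2.6+) as an equivalent, arguably more readable, way to build the same hash.

```ruby
# Hypothetical stand-in for an XML attribute object: it only needs #value.
FakeAttr = Struct.new(:value)

# Same shape as the attributes hash: attribute name => attribute object.
attributes = {
  'name'  => FakeAttr.new('Example Parent 1'),
  'color' => FakeAttr.new('red')
}

# inject walks the hash one [key, value] pair at a time and carries the
# accumulator hash (attrs) along; the block must return it each time.
via_inject = attributes.inject({}) do |attrs, pair|
  attrs[pair[0].to_sym] = pair[1].value
  attrs
end

# The same transformation written with Hash#to_h and a block.
via_to_h = attributes.to_h { |key, attribute| [key.to_sym, attribute.value] }

puts via_inject[:name]
puts via_inject[:color]
puts via_inject == via_to_h
```

Both versions produce a hash with symbol keys and plain string values, which is the shape Node.new expects.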

Extend the class for children

Once we have the parent node, we can give it children, creating new nodes in an array. To facilitate this, we extend the definition of the model to include an overridden initializer (initialize, which new() calls).

class Node
  include ActiveModel::Model
  attr_accessor :name, :color, :rank, :children

  def initialize(*args)
    self.children = []
    super(*args)
  end
end
Adding children

Now that we know how to create a Node object from the .first match, we can automate the process by enumerating every parent and creating each of its children in the same way.

doc.xpath('Report/Node').each do |parent|
  node = Node.new(parent.attributes.inject({}) { |attrs, value| attrs[value[0].to_sym] = value[1].value; attrs })
  node.children = parent.xpath('Node').map do |child|
    Node.new(child.attributes.inject({}) { |attrs, value| attrs[value[0].to_sym] = value[1].value; attrs })
  end
end

Ugly controller code

Move it to the model

But wait! That isn't very DRY! Let's move the logic that hurts our eyes into the model to make it easier to work with.

class Node
  include ActiveModel::Model
  attr_accessor :name, :color, :rank, :children

  def initialize(*args)
    self.children = []
    super(*args)
  end

  def self.new_from_xpath(xml_node)
    self.new(xml_node.attributes.inject({}) { |attrs, value| attrs[value[0].to_sym] = value[1].value; attrs })
  end
end

Final controller

Now the controller looks like this:

@nodes = []
doc.xpath('Report/Node').each do |parent|
  node = Node.new_from_xpath(parent)
  node.children = parent.xpath('Node').map do |child|
    Node.new_from_xpath(child)
  end
  @nodes << node
end
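If you want to try the whole flow outside of Rails, here is a self-contained sketch under some loudly stated assumptions. It swaps ActiveModel::Model for a plain Ruby class whose initializer assigns attributes through the setters (roughly what ActiveModel does), it uses REXML (bundled with Ruby) instead of Nokogiri, and the factory is renamed to a hypothetical new_from_element to reflect that swap.

```ruby
require 'rexml/document'

# Plain-Ruby stand-in for the ActiveModel-based Node model.
class Node
  attr_accessor :name, :color, :rank, :children

  def initialize(attributes = {})
    self.children = []
    # Assign each attribute through its setter, e.g. name=, color=, rank=.
    attributes.each { |key, value| public_send("#{key}=", value) }
  end

  # Hypothetical counterpart of new_from_xpath for a REXML element.
  def self.new_from_element(element)
    attrs = {}
    # REXML yields each attribute as a name/value pair of Strings.
    element.attributes.each { |key, value| attrs[key.to_sym] = value }
    new(attrs)
  end
end

# The XML file from the question.
xml_source = <<~XML
  <Report>
    <Node name="Example Parent 1" color="red">
      <Node name="Example Children 1" color="red" rank="very high" />
      <Node name="Example Children 2" color="red" rank="high" />
      <Node name="Example Children 3" color="yellow" rank="moderate" />
    </Node>
    <Node name="Example Parent 2" color="yellow">
      <Node name="Example Children 1" color="yellow" rank="moderate" />
    </Node>
  </Report>
XML

doc = REXML::Document.new(xml_source)

# Same shape as the final controller: one Node per parent, each holding
# its own array of child Nodes.
nodes = doc.elements.to_a('Report/Node').map do |parent|
  node = Node.new_from_element(parent)
  node.children = parent.elements.to_a('Node').map do |child|
    Node.new_from_element(child)
  end
  node
end

nodes.each do |node|
  puts "PARENT: #{node.name} (#{node.color})"
  node.children.each do |child|
    puts "  CHILD: #{child.name} is #{child.color}, rank #{child.rank}"
  end
end
```

Because every parent carries its own children array, nothing depends on two collections staying in step.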

Using this in the view

In the view you can use @nodes like this:

<% @nodes.each do |node| %>
  Parent: <%= node.name %>
  Children:
  <% node.children.each do |child| %>
    <%= child.name %> is <%= child.color %>
  <% end %>
<% end %>
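To check the nested loop without booting Rails, you can render a similar template with Ruby's standard erb library. This is only a sketch: ViewNode is a made-up Struct standing in for the Node model, the data is typed in by hand from your XML file, and nodes is passed as a local through result_with_hash rather than as the @nodes instance variable a controller would set.

```ruby
require 'erb'

# Hypothetical stand-in for the Node model: just the fields the view reads.
ViewNode = Struct.new(:name, :color, :children)

nodes = [
  ViewNode.new('Example Parent 1', 'red', [
    ViewNode.new('Example Children 1', 'red', []),
    ViewNode.new('Example Children 2', 'red', []),
    ViewNode.new('Example Children 3', 'yellow', [])
  ]),
  ViewNode.new('Example Parent 2', 'yellow', [
    ViewNode.new('Example Children 1', 'yellow', [])
  ])
]

template = <<~TEMPLATE
  <% nodes.each do |node| %>
  <p>PARENT: <%= node.name %></p>
  <% node.children.each do |child| %>
  <p>CHILD: <%= child.name %> is <%= child.color %></p>
  <% end %>
  <% end %>
TEMPLATE

# trim_mode '<>' drops the newline after lines that consist only of a tag.
html = ERB.new(template, trim_mode: '<>').result_with_hash(nodes: nodes)
puts html
```

Each parent line is followed by its own children before the next parent starts, which is the grouping you were after.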
