How to get a specific XML node with Nokogiri and XPath

I have this structure in XML:

<resource id="2023984310000103605" name="Rebelezza">
      <prices>
         <price datefrom="2019-10-31" dateto="2019-12-31" price="2690.0" currency="EUR" />
         <price datefrom="2020-01-01" dateto="2020-03-31" price="2690.0" currency="EUR" />
         <price datefrom="2020-03-31" dateto="2020-04-30" price="3200.0" currency="EUR" />
      </prices>                   
      <products>
         <product name="specific-product1">
            <prices>
               <price datefrom="2019-10-31" dateto="2019-12-31" price="2690.0" currency="EUR" />
               <price datefrom="2020-01-01" dateto="2020-03-31" price="2690.0" currency="EUR" />
               <price datefrom="2020-03-31" dateto="2020-04-30" price="3200.0" currency="EUR" />              
            </prices>
         </product>
      </products>
</resource>

How can I get only the prices directly under resource, without the prices inside products, using an XPath selector?

At the moment, I have something like:

resources = resourcesParsed.xpath("//resource")
for resource in resources do
  prices = resource.xpath(".//prices/price[number(translate(@dateto, '-', '')) >= 20190101]")
end

However, I am getting both the prices directly under the resource element and those under products. I'm not interested in the prices under products.

Two options with XPath:

.//price[parent::prices[parent::resource]]
.//price[ancestor::*[2][name()="resource"]]

Output: 3 nodes

And to add a date condition, you can use what you did:

.//price[parent::prices[parent::resource]][translate(@dateto, '-', '') >= 20200101]

I'd do it this way:

require 'nokogiri'
doc = Nokogiri::XML(<<EOT)
<resource>
      <prices>
         <price price="1"/>
      </prices>                   
      <products>
         <product>
            <prices>
               <price price="-1"/>
            </prices>
         </product>
      </products>
</resource>
EOT

doc.search('resource > prices > price').map { |p| p['price'] }
# => ["1"]

This won't find price nodes under products or product because the selector uses the child combinator (>), which in CSS terms means "find the resource node, then its direct child prices nodes, then their direct child price nodes". Anything not on that path is ignored.

Most of the time I find CSS selectors easier to write and understand, and less visually noisy. Even the Nokogiri docs recommend using CSS for those reasons.
