<Item id="item0">
<Links>
<FirstLink id="link1" target="one"/>
<SecondLink id="link2" target="two"/>
</Links>
<Data>
<String>content</String>
</Data>
</Item>
<Item id="item1">
<Links>
<FirstLink id="link1" target="two"/>
<SecondLink id="link2" target="two"/>
</Links>
<Data>
<String>content</String>
</Data>
</Item>
I have created a Nokogiri NodeSet with this structure, i.e. a list of items with links and data children. How can I filter out any items that don't match a certain value in the target attribute of <FirstLink>?
Actually, what I want in the end is to extract the <Data><String> content of every <Item> that matches a certain value in its <FirstLink> target attribute.
I've tried several approaches already, but I'm at a loss as to how to identify an element by an attribute of its grandchild and then extract the content of this grandchild's parent's sibling.
I did not fully understand what your goal is, but here is my best guess at how to proceed in this case:
require 'nokogiri'

# A well-formed XML document has exactly one root element, so the Item
# elements are wrapped in a single Items element here.
doc = Nokogiri::XML <<-xml
<Items>
<Item id="item0">
<Links>
<FirstLink id="link1" target="one"/>
<SecondLink id="link2" target="two"/>
</Links>
<Data>
<String>content1</String>
</Data>
</Item>
<Item id="item1">
<Links>
<FirstLink id="link1" target="two"/>
<SecondLink id="link2" target="two"/>
</Links>
<Data>
<String>content2</String>
</Data>
</Item>
</Items>
xml
The #xpath method with the expression "//Item" selects all the Item nodes. Those Item nodes are then passed to #reject, which drops every node for which the block returns a truthy value, so the block must be truthy exactly for the Item nodes you do not want. The goal is to keep only the Item nodes whose Links/FirstLink child has the target attribute value "one". Only FirstLink is inspected; the target of SecondLink plays no role.
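node.at_xpath("Links/FirstLink")['target'] gives you the value of the target attribute of that node's own FirstLink: "one" for the first Item node, then "two" for the second. The path has to be relative: an expression that starts with //, such as "//Links/FirstLink", searches the whole document from the root and returns the first FirstLink in the document for every Item. The ['one'] part in node.at_xpath("Links/FirstLink")['target']['one'] is a call to the String#[] method, which returns the matching substring, or nil when the string does not contain it.
Remember that this approach also gives you the flexibility to use a regular expression, since String#[] accepts one.
nodeset = doc.xpath("//Item").reject do |node|
  # Reject the Item unless its own FirstLink target contains "one".
  !node.at_xpath("Links/FirstLink")['target']['one']
end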
Now nodeset contains only the required Item nodes. I then use #map, passing each Item node to the block to collect the content of its String node. #at_xpath with the relative expression Data/String selects the String node inside that Item (an expression starting with // would search the whole document instead), and #text gives you its content.
nodeset.map { |n| n.at_xpath('Data/String').text } # => ["content1"]
We can build up an XPath expression to do this. Assuming we are starting from the whole XML document, rather than the node-set you already have, something like
//Item
will select all <Item> elements (I'm guessing you already have something like that to get this node-set).
Next, to select only those <Item> elements which have a <Links><FirstLink> child where FirstLink has a target attribute value of one:
//Item[Links/FirstLink[@target='one']]
and finally to select the Data/String children of those nodes:
//Item[Links/FirstLink[@target='one']]/Data/String
So with Nokogiri you could use something like this (where doc is your parsed document):
doc.xpath("//Item[Links/FirstLink[@target='one']]/Data/String")
or if you want to use the node-set you already have you can use a relative expression:
nodeset.xpath("self::Item[Links/FirstLink[@target='one']]/Data/String")