Nokogiri XML to node

I'm reading a local HTML document with Nokogiri like so:

f = File.open(local_xml)
@doc = Nokogiri::XML(f)
f.close

@doc contains a Nokogiri XML object that I can parse using at_css.

I want to modify it using Nokogiri's XML::Node, and I'm absolutely stuck. How do I take this Nokogiri XML document and work with it using node methods?

For example:

@doc.at_css('rates tr').add_next_sibling(element)

returns:

undefined method `add_next_sibling' for nil:NilClass (NoMethodError)

despite the fact that @doc.class is Nokogiri::XML::Document.

For completeness, here is the markup I'm trying to edit.

<html>
<head>
<title>Exchange Rates</title>
    <link rel="stylesheet" href="style.css">
</head>
<body>
    <table class="rates">
        <tr>
            <td class="up"><div></div></td>
            <td class="date">Saturday, Jan 12</td>
            <td class="rate up">3.83</td>
        </tr>
        <tr>
            <td class="up"><div></div></td>
            <td class="date">Friday, Jan 11</td>
            <td class="rate up">3.70</td>
        </tr>
        <tr>
            <td class="down"><div></div></td>
            <td class="date">Thursday, Jan 10</td>
            <td class="rate down">3.68</td>
        </tr>
        <tr>
            <td class="down"><div></div></td>
            <td class="date">Wednesday, Jan 9</td>
            <td class="rate down">3.70</td>
        </tr>
        <tr>
            <td class="up"><div></div></td>
            <td class="date">Tuesday, Jan 8</td>
            <td class="rate up">3.66</td>
        </tr>
    </table>
</body>
</html>

Try loading it as HTML instead of XML: Nokogiri::HTML(f)

Without going into much detail on how Nokogiri works: XML doesn't have CSS, right? So the at_css method may not make sense there (maybe it does, I don't know). Loading the document as HTML should work.

Update

Just noticed one thing. You want at_css('.rates tr') instead of at_css('rates tr'), because that's how you select a class in CSS. Maybe it works with XML now.

This is an example of how to do what you are trying to do, starting with f containing a shortened version of the HTML you want to parse:

require 'nokogiri'

f = '
<html>
<head>
<title>Exchange Rates</title>
    <link rel="stylesheet" href="style.css">
</head>
<body>
    <table class="rates">
        <tr>
            <td class="up"><div></div></td>
            <td class="date">Saturday, Jan 12</td>
            <td class="rate up">3.83</td>
        </tr>
    </table>
</body>
</html>
'

doc = Nokogiri::HTML(f)
doc.at('.rates tr').add_next_sibling('<p>foobar</p>')

puts doc.to_html

Your code is incorrectly trying to find the class="rates" attribute of <table>: the selector 'rates tr' looks for a <rates> element. In CSS we'd use .rates. An alternate way to do it using CSS is table[class="rates"].

Your example didn't define the node you were trying to add to the HTML, so I appended <p>foobar</p>. Nokogiri will let you build a node from scratch and append it, or use markup and add that, or you could find a node in one place in the HTML, remove it, and then insert it somewhere else.

That code outputs:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Exchange Rates</title>
<link rel="stylesheet" href="style.css">
</head>
<body>
    <table class="rates">
<tr>
<td class="up"><div></div></td>
            <td class="date">Saturday, Jan 12</td>
            <td class="rate up">3.83</td>
        </tr>
<p>foobar</p>
</table>
</body>
</html>

It's not necessary to use at_css or at_xpath instead of at. Nokogiri senses what type of accessor you're using and handles it. The same applies to using xpath or css instead of search. Also, at is equivalent to search('some accessor').first, so it finds the first occurrence of the matching node.
