Nokogiri Node Set
I am trying to use Nokogiri to scrape a web page. Right now, I am able to set a variable links to the following on a web page:
links = page.css('.item_inner')
and links is a:
Nokogiri::XML::NodeSet
Then I iterate through this NodeSet (links):
links.each{|link| puts link.css('.details a')}
in order to get some more information. But the class of what the method above gives me is now:
Fixnum
and it returns a list of (I'm not sure exactly what is being returned, but it looks like a list of these):
<a se:clickable:target="true" href="/nyc/sale/1056207-coop-150-sullivan-street-soho-new-york?featured=1">150 Sullivan Street #34</a>
Now I know that there are key/value pairs within this, but I am unable to access them at this point. How can I access, say, the href here and the actual name?
Once you have a single link as a node, its href is link['href'] (and so forth for its other attributes), and the link text ("150 Sullivan Street #34") is its content.
NOTE: A css search always yields what is effectively an array of found nodes (actually a NodeSet), even when only one node matches. If you are quite sure that your search will find only one of something, you can skip past that by using at_css instead, which yields a single node (the first match).