
Ruby Nokogiri Parsing HTML table III

I am using Mechanize/Nokogiri and need to parse an HTML page that contains many tables like this one:

<table width="100%" onclick="javascript:abredown('c7a8e8041a5031f127d5d27f3f071cbb');" class="buscaDestaque" bgcolor="#F7D36A">
  <tr>
    <td rowspan="2" scope="col" style="width:5%"><img src="images/gold.gif" border="0"></td>
    <td scope="col" style="width:45%" class="mais"><b>Community - 2nd Season</b><br />Community - 2&ordf; Temporada<br/><b>Downloads: </b> 2496 <b>Comentários: </b>17<br><b>Avaliação: </b> 10/10</td>
    <td scope="col" style="width:20%">28/03/2011 - 21:07</td>
    <td scope="col" style="width:20%"><a href="javascript:abreinfousuario(1083150)">SubsOTF</a></td>
    <td scope="col" style="width:10%"><img src='images/flag_br.gif' border='0'></td>
  </tr>
  <tr>
    <td colspan="4">Release: <span class="brls">Community.S02E19.HDTV.XviD-LOL/DIMENSION</span></td>
  </tr>
</table>

I want this output:

    Community.S02E19.HDTV.XviD-LOL/DIMENSION, ('c7a8e8041a5031f127d5d27f3f071cbb')

Can anyone help me?

require 'nokogiri'

html = Nokogiri::HTML(html_with_many_tables)
results = html.css('table.buscaDestaque').map do |table|
  # The ID is the single-quoted argument inside the onclick handler
  jsid = table['onclick'][/'(\w+)'/, 1]
  # The release name is the text of the element with class "brls"
  brls = table.at_css('.brls').text
  "#{brls}, #{jsid}"
end
p results
#=> ["Community.S02E19.HDTV.XviD-LOL/DIMENSION, c7a8e8041a5031f127d5d27f3f071cbb",
#    "AnotherBRLS, anotherJSID"]
