
How to get text value of an anchor href generated by javascript?

While scraping an extranet platform (it requires a login, so I cannot share the link here), I came across this markup:

<tbody>
<tr goto="javascript:prddetailrech(651438,'')" style="cursor:pointer;"><td valign="top" id="colREF"><a href="javascript:prddetailrech(651438,'');">C002</a></td>

This is the first of many rows, and I need this reference: C002.

This is my code. After logging in to the platform with Selenium, I use Nokogiri (only because I know it better):

driver.get 'http://riviera.prescripteurs.axessia.net/common/code/b2c/prd_b2c.asp?prd_PageSize=200'
doc = Nokogiri::HTML(driver.page_source)
appartments = []
rows = doc.css('tbody tr')
sleep 7
rows.each do |row|
  ref = row.css('colREF nobr a href').text
  appartment_info = {
    reference: ref
  }
  p appartments << appartment_info
end

And I get many {:reference=>""} in an array. Any insight on how I could get this value with Nokogiri or Selenium (Ruby)? I would appreciate any feedback.

row.css('colREF nobr a href') is not the correct CSS selector.

colREF is an ID, not an HTML element, so you would access it with #colREF.

href is an attribute, not an HTML element. Note that the value you are after (C002) is the link's text, which you get by calling .text on the matched a element. If you want the href itself, you can't extract attribute values directly with a CSS selector, but you can use Nokogiri's .attributes method on the matched node instead:

ref = row.at_css('#colREF a').attributes["href"].value

Note: I would recommend brushing up on CSS selectors; see https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors
