I am using Nokogiri as my HTML parser.
<html>
<body>
<form>
<table>
<tr><td>Some Text</td></tr>
<tr>
<td colspan="2" align="center">
<br />
<a href="TransportRoom?servlet=CaseSearch.jsp&advancedSearch=Advanced">
Advanced Search
</a>
<br />
</td>
</tr>
</table>
</form>
</body>
</html>
In this HTML I want to extract the href of the "Advanced Search" link. The HTML is stored in a variable named doc1.
Can anyone help me with this?
It should be as simple as:
require 'nokogiri'
doc = Nokogiri::HTML(doc1)
# Take the href of the first <a> element in the document
href = doc.css("a").first.attr('href')
Is this what you want?
The first answer works for me, but if the page contains any number of links, we can pick the right one by its text like this:
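html = Nokogiri::HTML(doc1)
# Declare the variable outside the block so it is still visible after the loop
advance_search_link = nil
html.css("a").each do |element|
  if element.text.strip == 'Advanced Search'
    advance_search_link = element.attr('href')
  end
end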
I would do it as follows:
require 'nokogiri'
@doc = Nokogiri.HTML <<-eotl
<html>
<body>
<form>
<table>
<tr><td>Some Text</td></tr>
<tr>
<td colspan="2" align="center">
<br />
<a href="TransportRoom?servlet=CaseSearch.jsp&advancedSearch=Advanced">
Advanced Search
</a>
<br />
</td>
</tr>
</table>
</form>
</body>
</html>
eotl
@doc.at_xpath("//a[normalize-space(.)='Advanced Search']")['href']
# => "TransportRoom?servlet=CaseSearch.jsp&advancedSearch=Advanced"