
Ruby: search in an HTTP GET response body

I am making a GET request in Ruby like this:

    require 'net/http'
    require 'uri'

    uri = URI.parse("https://www.test.com")
    request = Net::HTTP::Get.new(uri)
    request.content_type = "application/json"
    request["Accept"] = "application/json"

    req_options = {
      use_ssl: uri.scheme == "https",
    }

    response = Net::HTTP.start(uri.hostname, uri.port, req_options) do |http|
      http.request(request)
    end

    # response.code
    response.body

The response body is the page's HTML source, returned as plain text. I would like to find an element by its id in this response and get its value. This seems like what a crawler does, but I have never written one.

For instance, the page contains an element like this:

    <div id='price'>1000€</div>

I would like to search for <div id='price'> and get 1000€.

So far I can only get the index of the match in the string, and I do not know what to do from there.

Is this possible, or is there another way?

Thank you

You probably want to use the Nokogiri gem: https://github.com/sparklemotion/nokogiri

Nokogiri (鋸) is a Rubygem providing HTML, XML, SAX, and Reader parsers with XPath and CSS selector support.

    require 'nokogiri'

    html = <<HTML
    <div id="block1">
        <a href="http://google.com">link1</a>
    </div>
    <div id="block2">
        <a href="http://stackoverflow.com">link2</a>
        <a id="tips">just a bookmark</a>
    </div>
    HTML

    doc = Nokogiri::HTML(html)
    doc.css('#block1 a[href]').text
    #=> "link1"

To modify your example:

    require 'net/http'
    require 'uri'
    require 'nokogiri'

    uri = URI.parse("https://www.example.com")
    request = Net::HTTP::Get.new(uri)
    request.content_type = "application/json"
    request["Accept"] = "application/json"

    req_options = {
      use_ssl: uri.scheme == "https",
    }

    response = Net::HTTP.start(uri.hostname, uri.port, req_options) do |http|
      http.request(request)
    end

    # Parse the returned HTML, then query it with a CSS selector
    doc = Nokogiri::HTML.parse(response.body)

    doc.css('p').text
    #=> "This domain is established to be used for illustrative examples in documents. You may use this\n    domain in examples without prior coordination or asking for permission.More information..."

In Ruby we have Nokogiri, which lets you search documents via XPath or CSS3 selectors:

    require 'open-uri'  # provides URI.open; bare open(url) no longer works in Ruby 3.0+
    doc = Nokogiri::HTML(URI.open("https://www.test.com"))
    doc.at_css('div#price').text

or:

    doc = Nokogiri::HTML(response.body)
    doc.at_css('div#price').text

https://github.com/sparklemotion/nokogiri


 