Ruby: search in an HTTP GET response body

I am making a GET request in Ruby like this:

    require 'net/http'
    require 'uri'

    uri = URI.parse("https://www.test.com")
    request = Net::HTTP::Get.new(uri)
    request.content_type = "application/json"
    request["Accept"] = "application/json"

    req_options = {
      use_ssl: uri.scheme == "https",
    }

    response = Net::HTTP.start(uri.hostname, uri.port, req_options) do |http|
      http.request(request)
    end

    # response.code
    response.body

This returns the HTML source as plain text. I would like to search this response for an element with a given id and get its value. It looks like a job for a crawler, but I have never written one.

For instance, there is a field like:

    <div id='price'>1000€</div>

I would like to search for <div id='price'> and get 1000€.

I can only get its index, but then I do not know what to do next.

Is this possible, or is there another way?

Thank you

You probably want to use the Nokogiri gem: https://github.com/sparklemotion/nokogiri

Nokogiri (鋸) is a Rubygem providing HTML, XML, SAX, and Reader parsers with XPath and CSS selector support.

    require 'nokogiri'

    html = <<HTML
    <div id="block1">
        <a href="http://google.com">link1</a>
    </div>
    <div id="block2">
        <a href="http://stackoverflow.com">link2</a>
        <a id="tips">just a bookmark</a>
    </div>
    HTML

    doc = Nokogiri::HTML(html)
    # Select links with an href attribute inside the element with id="block1"
    doc.css('#block1 a[href]').text
    #=> "link1"

To modify your example:

    require 'net/http'
    require 'uri'
    require 'nokogiri'

    uri = URI.parse("https://www.example.com")
    request = Net::HTTP::Get.new(uri)
    request.content_type = "application/json"
    request["Accept"] = "application/json"

    req_options = {
      use_ssl: uri.scheme == "https",
    }

    response = Net::HTTP.start(uri.hostname, uri.port, req_options) do |http|
      http.request(request)
    end

    # Parse the returned HTML so it can be queried with CSS selectors
    doc = Nokogiri::HTML.parse(response.body)

    # Concatenated text of every <p> element on the page
    doc.css('p').text
    #=> "This domain is established to be used for illustrative examples in documents. You may use this\n    domain in examples without prior coordination or asking for permission.More information..."

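In Ruby we have Nokogiri, which lets you search documents via XPath or CSS3 selectors. Opening a URL this way needs open-uri and URI.open, because a bare open no longer accepts URLs as of Ruby 3.0:

    require 'nokogiri'
    require 'open-uri'
    doc = Nokogiri::HTML(URI.open("https://www.test.com"))
    doc.at_css('div#price').text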

or, using the response you already have:

    doc = Nokogiri::HTML(response.body)
    doc.at_css('div#price').text

https://github.com/sparklemotion/nokogiri
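If adding a gem is not an option, the index the question mentions can be used directly with plain string slicing. This is only a rough sketch with a made-up helper name. It assumes the opening tag appears exactly as written and that the element has no nested div, so it is far more brittle than a real parser.

```ruby
# Hypothetical helper: slices out the text between a literal opening tag
# and the next "</div>". Returns nil if either marker is missing.
def price_by_index(html, open_tag = "<div id='price'>")
  start = html.index(open_tag)
  return nil unless start

  from = start + open_tag.length
  to = html.index("</div>", from)
  return nil unless to

  html[from...to].strip
end

price_by_index("<body><div id='price'>1000€</div></body>")   #=> "1000€"
```

Any change in quoting, attribute order or whitespace inside the tag would make this return nil, which is why Nokogiri is usually the safer choice.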
