Ruby: searching the response body of an HTTP GET request

Question

I'm making a GET request in Ruby like this:

    require 'net/http'
    require 'uri'

    uri = URI.parse("https://www.test.com")
    request = Net::HTTP::Get.new(uri)
    request.content_type = "application/json"
    request["Accept"] = "application/json"

    req_options = {
      use_ssl: uri.scheme == "https",
    }

    response = Net::HTTP.start(uri.hostname, uri.port, req_options) do |http|
      http.request(request)
    end

    # response.code
    response.body

This returns the HTML source of the page as plain text. I would like to search the response for an element with a particular id and get its value. It looks like a job for a crawler, but I have never written one.

For instance, there is a field like:

    <div id='price'>1000€</div>

I would like to search for <div id='price'> and get 1000€.

I can only get its index in the string, and I don't know what to do from there.

Is this possible, or is there another way?

Thank you.

Answer 1

You probably want to use the Nokogiri gem: https://github.com/sparklemotion/nokogiri

From its description: Nokogiri (鋸) is a Rubygem providing HTML, XML, SAX, and Reader parsers with XPath and CSS selector support.

    require 'nokogiri'

    html = <<~HTML
      <div id="block1">
          <a href="http://google.com">link1</a>
      </div>
      <div id="block2">
          <a href="http://stackoverflow.com">link2</a>
          <a id="tips">just a bookmark</a>
      </div>
    HTML

    doc = Nokogiri::HTML(html)
    doc.css('#block1 a[href]').text
    #=> "link1"

To modify your example:

    require 'net/http'
    require 'uri'
    require 'nokogiri'

    uri = URI.parse("https://www.example.com")
    request = Net::HTTP::Get.new(uri)
    request.content_type = "application/json"
    request["Accept"] = "application/json"

    req_options = {
      use_ssl: uri.scheme == "https",
    }

    response = Net::HTTP.start(uri.hostname, uri.port, req_options) do |http|
      http.request(request)
    end

    # response.body holds the HTML source as a String; hand it to Nokogiri
    doc = Nokogiri::HTML.parse(response.body)

    doc.css('p').text
    #=> "This domain is established to be used for illustrative examples in documents. You may use this\n    domain in examples without prior coordination or asking for permission.More information..."

Answer 2

In Ruby we have Nokogiri, which lets you search documents via XPath or CSS3 selectors:

    require 'nokogiri'
    require 'open-uri' # provides URI.open for URLs
    doc = Nokogiri::HTML(URI.open("https://www.test.com"))
    doc.at_css('div#price').text

or, with the response you already have:

    doc = Nokogiri::HTML(response.body)
    doc.at_css('div#price').text

https://github.com/sparklemotion/nokogiri
