I'm trying to parse the Wikipedia home page using Nokogiri on Ruby 2.5.1 on Ubuntu. Any idea where the problem comes from? I ran bundle install just before, but nothing seems to work. Thanks in advance for any help!
Here is my code and what my terminal prints back:
```
require 'open-uri'
require 'nokogiri'
page = Nokogiri::HTML(open('https://en.wikipedia.org'))
puts page # => Nokogiri::HTML::Document
```
```
asus@asus-X75VD:~/THP/jour8/lib$ ruby test8.rb
Nokogiri::HTML::Document
```
Just so you know, Nokogiri ships with a command-line tool that retrieves a page and lets you play with it in IRB, so you don't have to write any code until you're ready to. If you enter:
nokogiri https://en.wikipedia.org
at the terminal prompt you'll drop into IRB and be able to do something like:
irb(main):002:0> @doc.to_s[0..10]
=> "<!DOCTYPE h"
or:
irb(main):005:0> @doc.to_s.size
=> 76139
From there you can view the page, write it to disk, and do all the normal things.