How to view HTML on terminal using Nokogiri on Ruby?

I'm trying to parse the Wikipedia HTML page using Nokogiri on Ruby (2.5.1) on Ubuntu. Here is my code and what my terminal says back:

Any ideas on where the problem comes from? I tried `bundle install` just before, but nothing seems to work. Thanks in advance for any help!

```
require 'open-uri'
require 'nokogiri'

page = Nokogiri::HTML(open('https://en.wikipedia.org'))
puts page   # => Nokogiri::HTML::Document
```



```
asus@asus-X75VD:~/THP/jour8/lib$ ruby test8.rb
Nokogiri::HTML::Document
```


Just so you know, Nokogiri has a command-line equivalent that lets you retrieve a page and play with it in IRB so you don't have to mess with writing code until you're ready to. If you enter:

nokogiri https://en.wikipedia.org

at the terminal prompt you'll drop into IRB and be able to do something like:

irb(main):002:0> @doc.to_s[0..10]
=> "<!DOCTYPE h"

or:

irb(main):005:0> @doc.to_s.size
=> 76139

You can view the page, write it to disk, all the normal things.
