簡體   English   中英

如何使用ruby讀取源代碼,該源代碼是在我自己的計算機上托管的?

[英]How do I use ruby to read the source a uri that im hosting on my own computer?

我試圖讓紅寶石讀取托管在我自己的計算機上的url的來源。 我嘗試將open-uri gem與:

source = open('http://127.0.0.1:8000/wikipedia_en_all_nopic_01_2012/A/Mick%20Jagger.html', &:read)

使用普通的外部URL可以正常工作,但是當我嘗試訪問計算機上托管的URL時會引發多個錯誤。 有誰知道如何做到這一點? 以下是命令行錯誤報告:

/Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http/response.rb:357:in `finish': incorrect header check (Zlib::DataError)
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http/response.rb:357:in `finish'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http/response.rb:262:in `ensure in inflater'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http/response.rb:262:in `inflater'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http/response.rb:274:in `read_body_0'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http/response.rb:201:in `read_body'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/open-uri.rb:328:in `block (2 levels) in open_http'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http.rb:1415:in `block (2 levels) in transport_request'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http/response.rb:162:in `reading_body'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http.rb:1414:in `block in transport_request'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http.rb:1405:in `catch'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http.rb:1405:in `transport_request'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http.rb:1378:in `request'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/open-uri.rb:319:in `block in open_http'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/net/http.rb:853:in `start'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/open-uri.rb:313:in `open_http'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/open-uri.rb:723:in `buffer_open'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/open-uri.rb:210:in `block in open_loop'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/open-uri.rb:208:in `catch'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/open-uri.rb:208:in `open_loop'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/open-uri.rb:149:in `open_uri'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/open-uri.rb:703:in `open'
from /Users/rorycampbell/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/open-uri.rb:34:in `open'
from testurl.rb:6:in `<main>'

更新:我正在使用kiwix服務器托管URL

嘗試改用net/http

require 'net/http'
source = Net::HTTP.get URI.parse('http://127.0.0.1:8000/wikipedia_en_all_nopic_01_2012/A/Mick%20Jagger.html')

為什么要使用本地計算機上的URL? 為什么不僅僅給出路徑?

該錯誤聽起來像是您嘗試解析的實際文件或它的服務方式有問題。 通過閱讀有關Kiwix服務器的信息,聽起來像是后者。在Kiwix網站上,它說它使用了一種稱為openzim的類型壓縮方法,這很可能是為什么open-uri無法找到解析它的方法。

您可以嘗試nokogiri,看看它在解析時是否有問題。 但是,由於您似乎要嘗試在ruby中打開/操作zim文件,因此我會尋找一個供ruby使用的zim庫,而不是嘗試提供它。

此處: https : //github.com/chrisistuff/zim-ruby

我從沒處理過奇異果/奇異果,所以我不知道這種奇果是否可行,但它是Google搜索“奇異果紅寶石”的唯一方法。

我對奇異果有同樣的問題。 我將所有URL提取到名為hrefs.txt的文件中(在我的情況下是德國項目gutenberg),並使用wget下載了每個URL:

f = File.open("hrefs.txt", "r")
f.each_line do |url|
  #filename = url.split("/").last.gsub!(/[^A-Za-z]/, '')[0..-4]
  system "wget #{url}"
end
f.close

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM