
Encoding error in content fetched with open-uri in Ruby on Rails

In some cases, when I use open (from open-uri) to fetch a web page in Ruby, the content of the page comes back with encoding errors. Example:

open("http://www.google.com.br").read

Characters like ç and ã are replaced by ?

How can I get the right chars?

This seems to work:

require 'open-uri'
require 'iconv'
# Convert the page body from Latin-1 (ISO-8859-1) to UTF-8.
i = Iconv.new('UTF-8', 'LATIN1')
i.iconv(open('http://google.com.br').read)
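Iconv was removed from the standard library in Ruby 2.0, so on newer Rubies the same conversion is usually done with String#force_encoding and String#encode. The following is a minimal sketch, not the answerer's code: the helper name to_utf8 is hypothetical, the source encoding is assumed to be ISO-8859-1, and a hard-coded byte string stands in for the body returned by open(...).read so the example runs without network access.

```ruby
# Hypothetical helper: relabel raw bytes with their real source encoding,
# then transcode them to UTF-8. The ISO-8859-1 default is an assumption.
def to_utf8(raw, source_encoding = 'ISO-8859-1')
  raw.dup.force_encoding(source_encoding).encode('UTF-8')
end

# Stand-in for open(url).read: "Configurações" encoded as Latin-1 bytes.
latin1_body = "Configura\xE7\xF5es".b

utf8_body = to_utf8(latin1_body)
puts utf8_body
puts utf8_body.encoding
```

With a real request, the object returned by open-uri also responds to charset, which reports the charset declared in the Content-Type header when the server sends one; that value could be passed as source_encoding instead of the hard-coded default.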

Running Ruby 1.9.2 here. Your code yields HTML which contains words like this:

Configura\xE7\xF5es

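So on my work machine at least (Vista, using the Windows CMD console), it returns the accented characters as escaped bytes rather than as ?. Here \xE7 and \xF5 are the Latin-1 (ISO-8859-1) byte values for ç and õ.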

Also, as far as I know, Ruby 1.9.2 is "almost" fully Unicode compliant, so I would guess you shouldn't have UTF-8 issues unless your console cannot print UTF-8 characters.
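To tell a console display problem apart from a genuinely mislabeled string, it can help to inspect the string itself rather than its printed form. The sketch below assumes Ruby 1.9 or newer and uses a hard-coded byte string as a stand-in for a fetched page body; the variable name body is illustrative only.

```ruby
# Stand-in for a fetched page body: Latin-1 bytes that carry a UTF-8 label.
body = "Configura\xE7\xF5es".dup.force_encoding('UTF-8')

puts body.encoding          # the encoding label Ruby attached to the bytes
puts body.valid_encoding?   # false: 0xE7 0xF5 is not a valid UTF-8 sequence
puts body.inspect           # bytes invalid in the labeled encoding appear as \x escapes
puts body.bytesize          # number of raw bytes, independent of the label
```

If valid_encoding? returns false, the bytes do not match their label and the string needs transcoding; if it returns true and the output still looks wrong, the console is the more likely culprit.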

Hope that helps.
