
Encoding::UndefinedConversionError: "\xE4" from ASCII-8BIT to UTF-8

I tried to fetch this CSV file with Net::HTTP.

File.open(file, "w:UTF-8") do |f|
  content = Net::HTTP.get_response(URI.parse(url)).body
  f.write(content)
end

After reading my local CSV file back in, I got some weird output.

Nationalit\xE4t;Alter 0-5

I tried to encode it to UTF-8, but got the error Encoding::UndefinedConversionError: "\xE4" from ASCII-8BIT to UTF-8

The rchardet gem tells me the content is ISO-8859-2, but converting it to UTF-8 does not work.

After opening it in a normal text editor, I see it displayed correctly.

You can go with force_encoding:

require 'net/http'

url = "http://data.linz.gv.at/katalog/population/abstammung/2012/auslg_2012.csv"
File.open('output', "w:UTF-8") do |f|
  content = Net::HTTP.get_response(URI.parse(url)).body
  f.write(content.force_encoding("UTF-8"))
end

But this will make you lose some accented characters in your .csv file, because force_encoding only relabels the bytes and does not convert them.
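To see why the accents get lost, here is a minimal sketch that needs no network access. It assumes the response body arrives as binary (ASCII-8BIT) bytes with "ä" stored as the single byte 0xE4, as in the output quoted in the question; the variable names are only for illustration.

```ruby
# Sample bytes shaped like the question's output: binary string, "ä" as byte 0xE4.
raw = "Nationalit\xE4t;Alter 0-5".b

# Plain encode has no source charset to go on, so it raises the error from the question.
conversion_failed = false
begin
  raw.encode("UTF-8")
rescue Encoding::UndefinedConversionError
  conversion_failed = true
end

# force_encoding only changes the label; the bytes stay the same,
# and a lone 0xE4 is not a valid UTF-8 sequence.
relabeled = raw.dup.force_encoding("UTF-8")
puts relabeled.valid_encoding?   # false
puts relabeled.bytes == raw.bytes # true

# encode with an explicit source encoding actually transcodes the byte.
converted = raw.encode("UTF-8", "ISO-8859-15")
puts converted.valid_encoding?   # true
puts converted
```

The relabeled string still carries the original byte, so anything that later reads the file as UTF-8 will most likely show a replacement character or raise.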

If you are absolutely sure that you will always use this URL as input, and that the file will always keep this encoding, you can do:

# encoding: utf-8
require 'net/http'

url = "http://data.linz.gv.at/katalog/population/abstammung/2012/auslg_2012.csv"
File.open('output', "w:UTF-8") do |f|
  content = Net::HTTP.get_response(URI.parse(url)).body
  f.write(content.encode("UTF-8", "ISO-8859-15"))
end

But this will only work for this file.
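If hard-coding one encoding feels too brittle, a common heuristic is to try UTF-8 first and fall back to a Latin encoding. The sketch below is only one possible approach, not something the answer above prescribes: to_utf8 is a hypothetical helper, and the candidate list is an assumption you would adjust to your data.

```ruby
# Hypothetical helper: interpret raw bytes using the first candidate encoding
# in which they form a valid string, then return the text as UTF-8.
def to_utf8(bytes, candidates = ["UTF-8", "ISO-8859-15"])
  candidates.each do |name|
    text = bytes.dup.force_encoding(name)
    next unless text.valid_encoding?
    return text.encode("UTF-8")
  end
  raise ArgumentError, "bytes are not valid in any of #{candidates.inspect}"
end

latin_bytes = "Nationalit\xE4t".b       # "ä" as the single byte 0xE4
utf8_bytes  = "Nationalit\u00E4t".b     # "ä" as the two-byte UTF-8 sequence

puts to_utf8(latin_bytes)
puts to_utf8(utf8_bytes)
```

With Net::HTTP you would pass the response body to the helper before writing it, for example f.write(to_utf8(content)). A single-byte encoding accepts almost any byte sequence, so it should come last in the list, and the result is a guess rather than a real detection.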


 