
Encoding::UndefinedConversionError

I keep getting an Encoding::UndefinedConversionError ("\xC2" from ASCII-8BIT to UTF-8) every time I try to convert a hash into a JSON string. I tried every combination of .encode and .force_encoding with "UTF-8" and "ASCII-8BIT", chaining .encode with .force_encoding, reversing the order, and switching the parameters, but nothing worked, so I caught the error like this:

begin
  menu.to_json
rescue Encoding::UndefinedConversionError
  puts $!.error_char.dump
  p $!.error_char.encoding
end

Here menu is the result of Sequel's dataset.to_hash, with content from a MySQL database that uses the utf8_general_ci collation. The rescue block printed this:

"\xC2"

#<Encoding:ASCII-8BIT>

The encoding never changes, no matter which .encode / .force_encoding call I use. I've even tried to strip the byte with .gsub!(/\\\xC2/), without luck.

Any ideas?

menu.to_s.encode('UTF-8', invalid: :replace, undef: :replace, replace: '?')

This worked well. I had to replace a few extra characters, but there are no more errors.

What do you expect "\xC2" to be? Probably an Â.

ASCII-8BIT means you have binary data, and Ruby can't decide which characters the bytes are supposed to represent.

You must first set the encoding with force_encoding .
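The difference between the two methods can be sketched as follows. This is a minimal illustration, assuming the stray byte really comes from Windows-1252 data (an assumption, not something the question confirms); the variable names are made up for the example.

```ruby
# A lone 0xC2 byte tagged as binary (ASCII-8BIT), as in the question.
raw = "\xC2".b

# encode transcodes. Binary bytes >= 0x80 have no defined mapping to UTF-8,
# so this raises the error from the question.
encode_error = nil
begin
  raw.encode("UTF-8")
rescue Encoding::UndefinedConversionError => e
  encode_error = e.class
end

# force_encoding only relabels the same bytes; nothing is converted.
# A lone 0xC2 is an incomplete UTF-8 sequence, so the string is still invalid.
relabeled = raw.dup.force_encoding("UTF-8")
relabeled_valid = relabeled.valid_encoding?

# Relabel with the (assumed) real source encoding first, then transcode.
converted = raw.dup.force_encoding("Windows-1252").encode("UTF-8")
```

Here converted ends up as a valid UTF-8 string containing Â, while relabeled still holds the single original byte.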

You may try the following code:

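# Interpret the byte 0xC2 in every encoding Ruby knows and print the UTF-8 result.
Encoding.list.each{|enc|
  begin
    print "%-10s\t" % [enc]
    print "\t\xC2".force_encoding(enc)                  # raw byte, relabeled as enc
    print "\t\xC2".force_encoding(enc).encode('utf-8')  # transcoded to UTF-8
  rescue => err
    print "\t#{err}"                                    # byte is invalid or unconvertible in enc
  end
  print "\n"
}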

The output shows what your "\xC2" would be in each encoding.

How it is displayed may depend on your output encoding, but it should let you make a good guess at which encoding you actually have.

Once you have identified the encoding you need (probably cp1252, where 0xC2 is Â), you can use:

menu.force_encoding('cp1252').to_json
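Since menu is described as a hash and Hash itself has no force_encoding method, the relabeling probably has to be applied to each string inside it. A rough sketch of that idea, where to_utf8 is a hypothetical helper (not part of Ruby, Sequel, or the original answer) and Windows-1252 is only the assumed source encoding:

```ruby
require "json"

# Hypothetical helper: walks hashes and arrays, relabels every binary string
# with the assumed source encoding, and transcodes it to UTF-8.
# Bytes that are undefined in the source encoding would still raise.
def to_utf8(value, source_encoding = "Windows-1252")
  case value
  when String
    if value.encoding == Encoding::ASCII_8BIT
      value.dup.force_encoding(source_encoding).encode("UTF-8")
    else
      value
    end
  when Hash
    value.each_with_object({}) do |(k, v), out|
      out[to_utf8(k, source_encoding)] = to_utf8(v, source_encoding)
    end
  when Array
    value.map { |v| to_utf8(v, source_encoding) }
  else
    value
  end
end

# Made-up sample data shaped like a dataset.to_hash result.
menu = { 1 => { "name" => "Cr\xC2pe".b, "price" => 4 } }
json = to_utf8(menu).to_json
```

The helper returns new objects and leaves the original hash untouched, so it can be called right before to_json without affecting the rest of the program.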

See also Kashyap's comment.

If you don't care about losing the odd characters, you can just blow them away:

str.force_encoding("ASCII-8BIT").encode('UTF-8', undef: :replace, replace: '')
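One thing worth knowing before using this: treating the string as binary drops every byte above 0x7F, so characters that were already valid UTF-8 disappear too. A small sketch with made-up sample strings:

```ruby
# One stray 0xC2 byte, and one valid UTF-8 "é" (bytes 0xC3 0xA9), both tagged as binary.
stray = "caf\xC2".b
valid = "caf\u00E9".b

opts = { undef: :replace, replace: "" }
stripped_stray = stray.encode("UTF-8", **opts)  # the stray byte is removed
stripped_valid = valid.encode("UTF-8", **opts)  # both bytes of "é" are removed as well

# If the bytes are in fact already UTF-8, relabeling keeps the character.
kept = valid.dup.force_encoding("UTF-8")
```

Both stripped strings come out as "caf", whereas kept is still "café". If the data is mostly valid UTF-8, relabeling and checking valid_encoding? first may lose less.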

The accepted solution doesn't really work: it raises no errors, but menu.to_s returns Ruby's textual representation of the hash, which is NOT JSON.
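To illustrate the point, here is a short comparison using a made-up hash. to_s gives Ruby's inspect syntax, which a JSON parser rejects:

```ruby
require "json"

menu = { "name" => "Soup", "price" => 4 }

as_text = menu.to_s     # Ruby inspect syntax, with => between keys and values
as_json = menu.to_json  # real JSON, with : between keys and values

# The inspect output is not parseable as JSON.
begin
  JSON.parse(as_text)
  parses = true
rescue JSON::ParserError
  parses = false
end
```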

I solved the problem using the oj gem, and it now works fine. It is also faster than the standard JSON library.

Writing:

   menu_json = Oj.dump menu

Reading:

   menu2 = Oj.load menu_json

See https://github.com/ohler55/oj for more details. I hope it helps.

The :fallback option can be useful if you know which characters you want to replace:

"Text 🙂".encode("ASCII", "UTF-8", fallback: {"🙂" => ":)"})
#=> "Text :)"

From the docs:

Sets the replacement string by the given object for undefined character. The object should be a Hash, a Proc, a Method, or an object which has [] method. Its key is an undefined character encoded in the source encoding of current transcoder. Its value can be any encoding until it can be converted into the destination encoding of the transcoder.
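As the quoted passage says, the fallback does not have to be a Hash. A minimal sketch with a lambda, where to_codepoint is a made-up name and the U+XXXX format is just one possible choice:

```ruby
# Hypothetical fallback: any character ASCII cannot represent
# is replaced by its code point written as U+XXXX.
to_codepoint = ->(char) { format("U+%04X", char.ord) }

ascii = "Text \u{1F642}".encode("ASCII", fallback: to_codepoint)
```

This is handy when the set of problem characters is not known in advance, since a Hash fallback only covers the keys listed in it.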
