Aeson does not decode strings with unicode characters
I'm trying to use Data.Aeson ( https://hackage.haskell.org/package/aeson-0.6.1.0/docs/Data-Aeson.html ) to decode some JSON strings, however it is failing to parse strings that contain non-standard characters.
As an example, the file:
import Data.Aeson
import Data.ByteString.Lazy.Char8 (pack)
test1 :: Maybe Value
test1 = decode $ pack "{ \"foo\": \"bar\"}"
test2 :: Maybe Value
test2 = decode $ pack "{ \"foo\": \"bòz\"}"
When run in ghci, gives the following results:
*Main> :l ~/test.hs
[1 of 1] Compiling Main ( /Users/ltomlin/test.hs, interpreted )
Ok, modules loaded: Main.
*Main> test1
Just (Object fromList [("foo",String "bar")])
*Main> test2
Nothing
Is there a reason that it doesn't parse the String with the unicode character? I was under the impression that Haskell was pretty good with unicode. Any suggestions would be greatly appreciated!
Thanks,
tetigi
Upon further investigation using eitherDecode, I get the following error message:
*Main> test2
Left "Failed reading: Cannot decode byte '\\x61': Data.Text.Encoding.decodeUtf8: Invalid UTF-8 stream"
x61 is the unicode character for 'z', which comes right after the special unicode character. Not sure why it's failing to read the characters after the special character!
Changing test2 to be test2 = decode $ pack "{ \\"foo\\": \\"bòz\\"}" instead gives the error:
Left "Failed reading: Cannot decode byte '\\xf2': Data.Text.Encoding.decodeUtf8: Invalid UTF-8 stream"
Which is the character for "ò", which makes a bit more sense.
The problem is your usage of pack from the Char8 module, which doesn't work with non-Latin-1 data: it truncates every Char to 8 bits, so "ò" (U+00F2) becomes the single byte 0xF2, which is not valid UTF-8, and aeson expects UTF-8 input. Instead, use encodeUtf8 from the text package.
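As a minimal sketch of that suggestion (not necessarily the exact code the answerer had in mind), the String can be converted to lazy Text and then encoded as UTF-8 before it is handed to aeson. The names input, truncated and utf8 are made up for illustration, and the lazy variants from Data.Text.Lazy and Data.Text.Lazy.Encoding are assumed here because decode and eitherDecode take a lazy ByteString.

```haskell
import Data.Aeson (Value, eitherDecode)
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Char8 as BLC
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Encoding as TLE

-- Same JSON document as test2 in the question (hypothetical name).
input :: String
input = "{ \"foo\": \"bòz\"}"

-- Char8.pack keeps only the low 8 bits of each Char, so 'ò' (U+00F2)
-- becomes the lone byte 0xF2, which is not valid UTF-8.
truncated :: BL.ByteString
truncated = BLC.pack input

-- encodeUtf8 emits the proper two-byte UTF-8 sequence 0xC3 0xB2 for 'ò'.
utf8 :: BL.ByteString
utf8 = TLE.encodeUtf8 (TL.pack input)

-- Decoding the UTF-8 bytes; eitherDecode reports a message on failure.
test2 :: Either String Value
test2 = eitherDecode utf8

main :: IO ()
main = do
  print (BL.unpack truncated)  -- raw bytes produced by Char8.pack
  print (BL.unpack utf8)       -- raw bytes produced by encodeUtf8
  print test2
```

Printing the raw bytes side by side makes the difference visible: the two byte strings differ only at the position of "ò", where the UTF-8 version is one byte longer.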