
How to display non-ASCII characters from an XML output

I get this output in an XML element:

&#163;111.00

It should be £111.00.

How can I sort this out so that all Unicode characters are displayed rather than the code? I am using the Linux tool wget to fetch the XML file from the Internet. Perhaps some sort of converter?

I am viewing the file in PuTTY. I am parsing the file, and I want to clean the input before parsing.

I am using xml_grep2 to get the elements I want and then cat filename | while read .....

You can use HTML::Entities to replace the entities with literal character codes. I don't know how good its coverage is, though. There are bound to be similar tools for other languages if you are not comfortable with Perl. http://metacpan.org/pod/HTML::Entities

sh$ echo '&#163;111.00' | perl -CSD -MHTML::Entities -pe 'decode_entities($_)'
£111.00

This won't work if the HTML::Entities module is not installed. If you need to install it, there are numerous tutorials about CPAN on the Internet.
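For readers who would rather not install a Perl module, here is a minimal sketch of the same decoding step in Python, using the standard-library function `html.unescape`. The sample strings are illustrative assumptions and are not taken from the asker's file.

```python
import html


def decode_entities(text: str) -> str:
    """Replace named and numeric character references with the characters they stand for."""
    return html.unescape(text)


# Hypothetical inputs: three spellings of the same pound-sign reference.
samples = ["&pound;111.00", "&#163;111.00", "&#xA3;111.00"]

# Each entry decodes to the same string, '£111.00'.
decoded = [decode_entities(s) for s in samples]
```

Applied line by line to the fetched file, this plays the same role as the Perl filter above. Text without any references passes through unchanged.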

Edit: Add usage example. The -CSD option might not be necessary on your system, but on OS X at least, I got garbage output without it.

OK, I'm going to close this question now.

After parsing the file with xml_grep2 I was able to get clean output, but I was still seeing this Ã character in the file. I changed the PuTTY character set setting from ISO-8859 to UTF-8 to resolve that.
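Both effects can be reproduced in a short Python sketch. It assumes the element held a numeric character reference such as `&#163;`. The sample XML and the helper `as_latin1_terminal` are hypothetical. The terminal mismatch is modelled by decoding UTF-8 bytes as ISO-8859-1.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the real file's structure is not shown in the question.
xml_text = "<item><price>&#163;111.00</price><name>caf&#233;</name></item>"

# An XML parser resolves numeric character references while parsing.
root = ET.fromstring(xml_text)
price = root.findtext("price")  # '£111.00'
name = root.findtext("name")    # 'café'


def as_latin1_terminal(text: str) -> str:
    """Show how UTF-8 output looks when a terminal decodes it as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


garbled_price = as_latin1_terminal(price)  # 'Â£111.00'
garbled_name = as_latin1_terminal(name)    # 'cafÃ©'
```

The data is correct in both cases. Only the terminal's interpretation of the bytes differs, which is why changing the PuTTY character set fixes the display without touching the file.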


 