How to display non-ASCII characters from XML output

Question

I get this output in an XML element:

&#xA3;111.00

It should be £111.00.

How can I sort this out so that all Unicode characters are displayed rather than their character references? I am using the Linux tool wget to fetch the XML file from the Internet. Perhaps some sort of converter?

I am viewing the file in PuTTY. I am parsing the file, and I want to clean the input before parsing.

I am using xml_grep2 to get the elements I want, and then cat filename | while read .....

Answer 1

You can use HTML::Entities to replace the entities with the literal characters they stand for. I don't know how good its coverage is, though. There are bound to be similar tools for other languages if you are not comfortable with Perl. http://metacpan.org/pod/HTML::Entities

sh$ echo '&#xA3;111.00' | perl -CSD -MHTML::Entities -pe 'decode_entities($_)'
£111.00

This won't work if the HTML::Entities module is not installed. If you need to install it, there are numerous tutorials about CPAN on the Internet.

Edit: Added a usage example. The -CSD option might not be necessary on your system, but on OS X at least, I got garbage output without it.
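For readers who would rather not use Perl, here is a minimal sketch of the same idea in Python, assuming Python 3 is available. It relies on the standard library's html.unescape; the function names decode_entities and decode_lines are made up for illustration and are not part of any tool mentioned in this thread.

```python
import html


def decode_entities(text: str) -> str:
    """Replace character references such as &#xA3; with the characters they name."""
    return html.unescape(text)


def decode_lines(lines):
    """Decode an iterable of lines lazily, e.g. sys.stdin or an open file."""
    for line in lines:
        yield decode_entities(line)


# The comparison keeps the demo output ASCII-only, whatever the terminal encoding.
print(decode_entities("&#xA3;111.00") == "\u00a3111.00")  # prints True
```

To use it as a filter, pass sys.stdin to decode_lines and write each result to sys.stdout. Note that html.unescape also decodes &amp;amp; and &amp;lt;, so it is probably safer to run it on the text extracted by xml_grep2 than on the raw XML before parsing.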

Answer 2

OK, I'm going to close this question now.

After parsing the file with xml_grep2 I was able to get clean output, but I was still seeing this Ã character in the file. I changed the PuTTY character set setting from ISO-8859 to UTF-8 to resolve that.
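A short Python sketch of why a stray Ã (or Â) tends to appear in this situation: the file holds UTF-8 bytes, and the terminal interprets each byte as a separate ISO-8859-1 character. The helper name misread_as_latin1 is hypothetical, ISO-8859-1 is assumed to be the variant that was selected, and é is only an illustrative character, since the post does not say which character in the file was affected.

```python
def misread_as_latin1(text: str) -> str:
    """Show how UTF-8 encoded text looks when each byte is read as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


# U+00E9 (e with acute) is the bytes C3 A9 in UTF-8, which ISO-8859-1
# renders as the two characters U+00C3 (A with tilde) and U+00A9 (copyright sign).
print(misread_as_latin1("\u00e9") == "\u00c3\u00a9")  # prints True

# U+00A3 (pound sign) is the bytes C2 A3, rendered as U+00C2 followed by U+00A3.
print(misread_as_latin1("\u00a3") == "\u00c2\u00a3")  # prints True
```

Switching the terminal to UTF-8, as described above, makes it combine those bytes back into the single intended character.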

Answer 1
0 2011-09-05 15:32:24

Answer 2
0 2011-09-06 11:23:11
