How to display non-ASCII characters from an XML output
I get this output in an XML element:
£111.00
It should be £111.00.
How can I sort this out so that all Unicode characters are displayed rather than the code? I am using the Linux tool wget to fetch the XML file from the Internet. Perhaps some sort of converter?
I am viewing the file in PuTTY. I am parsing the file and I want to clean the input before parsing.
I am using xml_grep2 to get the elements I want and then cat filename | while read .....
You can use HTML::Entities to replace the entities with the literal characters. I don't know how good its coverage is, though. There are bound to be similar tools for other languages if you are not comfortable with Perl. http://metacpan.org/pod/HTML::Entities
sh$ echo '£111.00' | perl -CSD -MHTML::Entities -pe 'decode_entities($_)'
£111.00
This won't work if the HTML::Entities module is not installed. If you need to install it, there are numerous tutorials about CPAN on the Internet.
Edit: Added a usage example. The -CSD option might not be necessary on your system, but on OSX at least, I got garbage output without it.
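For anyone who would rather not use Perl, here is a minimal sketch of the same idea in Python, using only the standard library. It assumes the unwanted text is a character reference such as &#163; or &pound;; the function name decode_entities and the sample string are made up for illustration and do not come from the question.

```python
import html


def decode_entities(text: str) -> str:
    """Replace character references (&#163;, &#xA3;, &pound;) with literal characters."""
    return html.unescape(text)


# Hypothetical sample: the text content of one element, already extracted from the XML.
sample = "&#163;111.00"
print(decode_entities(sample))
```

Note that html.unescape also turns &lt; and &amp; into < and &, so running it over a whole XML file before parsing could break the markup. It is probably safer to apply it to the text you have already extracted.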
OK, I'm going to close this question now.
After parsing the file with xml_grep2 I was able to get clean output, but I was still seeing this Ã character in the file. I changed the PuTTY character set setting from ISO-8859 to UTF-8 to resolve that.
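To illustrate why the terminal setting matters, here is a small Python sketch of what likely happens when UTF-8 bytes are displayed as ISO-8859-1 (Latin-1). The helper names are hypothetical, and the assumption is that the file itself was valid UTF-8 and only the display was wrong.

```python
def as_seen_in_latin1(text: str) -> str:
    """Show how UTF-8 encoded text looks when its bytes are read as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair_mojibake(garbled: str) -> str:
    """Reverse the mistake: re-read the Latin-1 characters as UTF-8 bytes."""
    return garbled.encode("latin-1").decode("utf-8")


garbled = as_seen_in_latin1("£111.00")
print(garbled)                   # the pound sign shows up as two characters
print(repair_mojibake(garbled))  # back to the original text
```

Each non-ASCII character becomes two bytes in UTF-8, and a Latin-1 terminal draws each byte separately, which is where stray characters like Â or Ã come from. Switching the terminal to UTF-8 fixes the display without touching the file.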