
Malformed UTF-8 character error in regular expression in Perl

I get a 'Malformed UTF-8 character' error when I pass some scalar data to XML::Simple or Data::Dumper. The lines where the error occurs contain regular expressions.

Malformed UTF-8 character (fatal) at /usr/share/perl5/XML/Simple.pm line 1690.
Malformed UTF-8 character (fatal) at /usr/lib/perl/5.10/Data/Dumper.pm line 682.

So far I have not been able to reproduce the error with a small piece of code.

XML::Simple 2.18
Data::Dumper 2.124
perl v5.10.1

The problem turned out to be that somewhere deep in the application code, Encode::_utf8_on was being called on a scalar that did not hold a valid UTF-8 string.
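For anyone trying to reproduce this, here is a minimal sketch of how Encode::_utf8_on can leave a scalar in that state. The input bytes are a made-up example (Latin-1 "café"), not data from the original application. It uses only core functions: utf8::decode, utf8::is_utf8 and utf8::valid.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical input: the Latin-1 bytes for "café".
# A lone \xE9 byte is not a well-formed UTF-8 sequence.
my $bytes = "caf\xE9";

# utf8::decode validates its input: it returns false and leaves the
# scalar alone when the bytes are not well-formed UTF-8.
my $checked    = $bytes;
my $decoded_ok = utf8::decode($checked) ? 1 : 0;

# Encode::_utf8_on does no validation at all. It only switches the
# internal UTF-8 flag on, whatever the bytes happen to be.
my $forced = $bytes;
Encode::_utf8_on($forced);

my $flag_on  = utf8::is_utf8($forced) ? 1 : 0;  # the flag is set
my $is_valid = utf8::valid($forced)   ? 1 : 0;  # but the content is malformed

print "decoded_ok=$decoded_ok flag_on=$flag_on is_valid=$is_valid\n";
```

A scalar that is flagged but not valid is the kind of value that can later trigger 'Malformed UTF-8 character' inside a regular expression in unrelated code. Calling utf8::valid on suspicious scalars is one way to track down where the bad string enters.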

You could try piping your data through Encoding::FixLatin. If the 'binary' bytes you're encountering are actually Latin-1 characters, they'll get converted to valid UTF-8. If they really are random binary bytes, they should at least get converted to random (but valid) UTF-8 characters :-)

The core Encode module provides facilities for handling malformed data (see "Handling Malformed Data" in its documentation). I have never used them myself, though.
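As a rough illustration of those facilities, the sketch below decodes bytes with Encode::decode instead of forcing the flag on. The sample bytes and the helper name decode_text are made up for this example. The Latin-1 fallback is an assumption about what the data might be, not something stated in the question.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical input: Latin-1 bytes that are not valid UTF-8.
my $bytes = "caf\xE9";

# Strict mode: FB_CROAK makes decode() die on malformed input.
# decode() may modify its source when a CHECK value is given, so pass a copy.
my $copy   = $bytes;
my $strict = eval { decode('UTF-8', $copy, Encode::FB_CROAK) };
my $error  = $@;

# Default mode: each malformed byte is replaced with U+FFFD.
my $lenient = decode('UTF-8', $bytes);

# Hypothetical helper: try strict UTF-8 first, then fall back to Latin-1.
sub decode_text {
    my ($raw) = @_;
    my $tmp  = $raw;
    my $text = eval { decode('UTF-8', $tmp, Encode::FB_CROAK) };
    return defined $text ? $text : decode('ISO-8859-1', $raw);
}

my $fallback = decode_text($bytes);         # decoded as Latin-1
my $clean    = decode_text("caf\xC3\xA9");  # valid UTF-8, decoded as such
```

Either way the result is a well-formed character string, so later regular expressions, XML::Simple and Data::Dumper no longer see malformed data.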
