How to convert char string to hex in perl

I read this post, "How to convert hex to char string in perl", which covers converting hex to a char string.

How can I do the reverse operation? I need to convert a char string to hex in Perl. For example, I have the string "hello world!" and I must get:

00680065006C006C006F00200077006F0072006C00640021

Here's another approach. Do it all in one go with a regex.

my $string = 'hello world!';
$string =~ s/(.)/sprintf '%04X', ord $1/seg;  # %04X gives uppercase hex, matching the requested output

The existing answers provide the hex representation of the Unicode Code Points.

That format doesn't permit the input to include any characters above 0xFFFF. If it were to permit this, there would be no way to know if

20000200002000020000

means

2000 0200 0020 0002 0000

or

20000 20000 20000 20000
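The ambiguity can be reproduced with a minimal sketch. The variable names are made up for this illustration; it only assumes that each code point is formatted with `sprintf '%04X'`, as in the earlier answers.

```perl
use strict;
use warnings;

# Five code points that each fit in four hex digits.
my $five_chars = join '', map { sprintf '%04X', $_ }
    (0x2000, 0x0200, 0x0020, 0x0002, 0x0000);

# Four copies of U+20000, which needs five hex digits.
my $four_chars = join '', map { sprintf '%04X', $_ } (0x20000) x 4;

print "$five_chars\n";
print "$four_chars\n";
print $five_chars eq $four_chars
    ? "same hex string, so the input cannot be recovered\n"
    : "different hex strings\n";
```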

If that's fine because you'll never have characters above 0xFFFF, then I recommend the following:

my $text = 'hello world!';
my $hex = uc unpack 'H*', pack 'n*', unpack 'W*', $text;

It should be much faster than the existing solutions, and it handles characters above 0xFFFF better than the existing solutions (since it still provides only 4 hex digits for characters above 0xFFFF).
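To see what each stage of that one-liner contributes, here is the same pipeline split into steps. This is only an illustrative breakdown with invented variable names, using a shorter input string.

```perl
use strict;
use warnings;

my $text = 'hi!';

# unpack 'W*' returns one code point per character.
my @code_points = unpack 'W*', $text;        # (104, 105, 33)

# pack 'n*' stores each value as a 16-bit big-endian integer.
my $packed = pack 'n*', @code_points;        # bytes 00 68 00 69 00 21

# unpack 'H*' renders the bytes as lowercase hex; uc uppercases it.
my $hex = uc unpack 'H*', $packed;

print "@code_points\n";   # 104 105 33
print "$hex\n";           # 006800690021
```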


If, however, you want to handle all Unicode Code Points, the above solution and the solution provided by the earlier answers aren't adequate.

With that in mind, I suspect you actually want the hex representation of the UTF-16be encoding of the Unicode Code Points. At worst, having a character above 0xFFFF will still produce useful and lossless output.

Code Point    Perl string lit  JSON string lit  Hex of UCP  Hex of UTF-16be
------------  ---------------  ---------------  ----------  ---------------
h  (U+0068)   "\x{68}"         "\u0068"         0068        0068
é  (U+00E9)   "\x{E9}"         "\u00E9"         00E9        00E9
ጀ  (U+1300)   "\x{1300}"       "\u1300"         1300        1300
𠀀 (U+20000)  "\x{20000}"      "\uD840\uDC00"   20000       D840DC00

If that's the case, you want

use Encode qw( encode );

my $text = 'hello world!';
my $hex = uc unpack 'H*', encode 'UTF-16be', $text;
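As a quick check of the lossless claim, the sketch below wraps the conversion in two helpers and round-trips a string containing a character above 0xFFFF. The names `text_to_hex` and `hex_to_text` are hypothetical and exist only for this example; `encode` and `decode` come from the core Encode module.

```perl
use strict;
use warnings;
use Encode qw( encode decode );

# Hypothetical helper: characters -> uppercase hex of the UTF-16BE bytes.
sub text_to_hex {
    my ($text) = @_;
    return uc unpack 'H*', encode('UTF-16BE', $text);
}

# Hypothetical helper: hex of UTF-16BE bytes -> characters.
sub hex_to_text {
    my ($hex) = @_;
    return decode('UTF-16BE', pack('H*', $hex));
}

my $text = "h\x{E9}\x{1300}\x{20000}";
my $hex  = text_to_hex($text);
my $back = hex_to_text($hex);

print "$hex\n";   # 006800E91300D840DC00
print $back eq $text ? "round trip ok\n" : "round trip failed\n";
```

U+20000 comes out as the surrogate pair D840 DC00, matching the last row of the table above.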

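One algorithm you can use to do this is: split the string into characters, format each character's code point as four hex digits, and print the results back to back.

A possible implementation could be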

print map { sprintf '%04X', ord } split //, 'hello world!';

The output of this program is

00680065006C006C006F00200077006F0072006C00640021

That said, there is probably a pack implementation that I am not aware of.
