
Convert uint64_t to uint8_t[8]

How can I convert uint64_t to uint8_t[8] without losing information in C++?

I tried the following:

uint64_t number = 23425432542254234532;
uint8_t result[8];
for(int i = 0; i < 8; i++) {
    std::memcpy(result[i], number, 1);
}

You are almost there. First, the literal 23425432542254234532 is too big to fit in uint64_t (the largest representable value is 18446744073709551615).

Second, as the documentation shows, std::memcpy has the following declaration:

void * memcpy ( void * destination, const void * source, size_t num );

As you can see, it takes pointers (addresses) as arguments, not uint64_t or uint8_t values. You can get the address of the integer with the address-of operator (&).

Third, you are only copying the first byte of the integer into each array element. To copy byte by byte, you would need to advance the source pointer on every iteration. But the loop is unnecessary; you can copy all the bytes in one go like this:

std::memcpy(result, &number, sizeof number);

Note that the order of the bytes depends on the endianness of the CPU.
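As a minimal sketch of the single-call copy described above, the following wraps it in two small functions. The names to_native_bytes and from_native_bytes are hypothetical, chosen only for illustration; the byte order of the array is whatever the host uses.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

static_assert(sizeof(std::uint64_t) == 8, "uint64_t must occupy 8 bytes");

// Copy the object representation of a 64-bit value into a byte array.
// The order of the bytes is the host CPU's native order.
std::array<std::uint8_t, 8> to_native_bytes(std::uint64_t number)
{
    std::array<std::uint8_t, 8> result{};
    std::memcpy(result.data(), &number, sizeof number);
    return result;
}

// Inverse operation: rebuild the value from bytes in host order.
std::uint64_t from_native_bytes(const std::array<std::uint8_t, 8>& bytes)
{
    std::uint64_t number = 0;
    std::memcpy(&number, bytes.data(), sizeof number);
    return number;
}
```

Because both directions use the same native order, a round trip on one machine always restores the original value. The array contents only matter once the bytes leave that machine.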

First, do you want the conversion to be big-endian or little-endian? Most of the previous answers will start giving you the bytes in the opposite order, and break your program, as soon as you switch architectures.

If you need consistent results, first convert your 64-bit input to big-endian (network) byte order, or perhaps to little-endian. For example, in GLib the macro is GUINT64_TO_BE(), but most compilers have an equivalent built-in function.
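For readers who would rather not depend on GLib, here is a rough sketch of the same normalization in standard C++. All three function names are hypothetical, and the code assumes the host is either plainly little-endian or plainly big-endian. In practice a compiler built-in such as __builtin_bswap64 (GCC and Clang) or _byteswap_uint64 (MSVC) would usually replace the hand-written swap.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: true when the host stores the least significant byte first.
inline bool host_is_little_endian()
{
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Hypothetical helper: reverse the byte order of a 64-bit value using shifts.
inline std::uint64_t swap_bytes64(std::uint64_t v)
{
    // Swap adjacent bytes, then adjacent 16-bit pairs, then the 32-bit halves.
    v = ((v & 0x00FF00FF00FF00FFULL) << 8)  | ((v >> 8)  & 0x00FF00FF00FF00FFULL);
    v = ((v & 0x0000FFFF0000FFFFULL) << 16) | ((v >> 16) & 0x0000FFFF0000FFFFULL);
    return (v << 32) | (v >> 32);
}

// Return a value whose in-memory bytes are in big-endian (network) order.
inline std::uint64_t host_to_big_endian64(std::uint64_t v)
{
    return host_is_little_endian() ? swap_bytes64(v) : v;
}
```

Copying the result of host_to_big_endian64 with memcpy then puts the most significant byte first in the array, whichever kind of host runs the code.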

Having done that, there are several alternatives:

Copy with memcpy() or memmove()

This is the method that the language standard guarantees will work, although here I use one function from a third-party library (to convert the argument to big-endian byte order on all platforms). For example:

#include <stdint.h>
#include <string.h>  /* declares memcpy() */

#include <glib.h>

union eight_bytes {
  uint64_t u64;
  uint8_t b8[sizeof(uint64_t)];
};

eight_bytes u64_to_eight_bytes( const uint64_t input )
{
  eight_bytes result;
  const uint64_t big_endian = (uint64_t)GUINT64_TO_BE((guint64)input);

  memcpy( &result.b8, &big_endian, sizeof(big_endian) );
  return result;
}

On Linux x86_64 with clang++ -std=c++17 -O, this compiles to essentially these instructions:

bswapq  %rdi
movq    %rdi, %rax
retq

If you wanted the results in little-endian order on all platforms, you could replace GUINT64_TO_BE() with GUINT64_TO_LE(), which removes the first instruction, then declare the function inline to remove the third instruction. (Or, if you're certain that cross-platform compatibility does not matter, you might risk just omitting the normalization.)

So, with a modern compiler targeting a 64-bit platform, this code is just as efficient as anything else. On another target, it might not be.

Type-Punning

The common way to write this in C would be to declare the union as before, set its uint64_t member, and then read its uint8_t[8] member. This is legal in C.

I personally like it because it allows me to express the entire operation as static single assignments.

However, in C++, it is formally undefined behavior. In practice, all C++ compilers I'm aware of support it for Plain Old Data (the formal term in the language standard) of the same size, with no padding bits, but not for more complicated classes that have virtual function tables and the like. It seems more likely to me that a future version of the Standard will officially support type-punning on POD than that any important compiler will ever break it silently.

The C++ Guidelines Way

Bjarne Stroustrup recommended that, if you are going to type-pun instead of copying, you use reinterpret_cast, such as

uint8_t (&array_of_bytes)[sizeof(uint64_t)] =
      *reinterpret_cast<uint8_t(*)[sizeof(uint64_t)]>(
        &proper_endian_uint64);

His reasoning was that both an explicit cast and type-punning through a union are undefined behavior, but the cast makes it blatant and unmistakable that you are shooting yourself in the foot on purpose, whereas reading a different union member than the active one can be a very subtle bug.

If I understand correctly, you can do it this way, for instance:

uint64_t number = 0x2342543254225423; // the original literal does not fit in 64 bits
uint8_t *p = (uint8_t *)&number;      // view the integer as raw bytes (host byte order)
// if you need a copy
uint8_t result[8];
for(int i = 0; i < 8; i++) {
    result[i] = p[i];
}

When copying memory between incompatible types, the first thing to be aware of is strict aliasing: you don't want to alias pointers incorrectly. Alignment also needs to be considered.

You were almost there; the for loop is not needed.

uint64_t number = 0x2342543254225423; // trimmed to fit
uint8_t result[sizeof(number)];
std::memcpy(result, &number, sizeof(number));

Note: be aware of the endianness of the platform as well.

Either use a union, or do it with bitwise operations. memcpy is for blocks of memory and might not be the best option here.

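uint64_t number = 0x2342543254225423; // the original literal does not fit in 64 bits
uint8_t result[8];
// result[0] receives the most significant byte (big-endian order)
for(int i = 0; i < 8; i++) {
    result[i] = uint8_t((number >> 8*(7 - i)) & 0xFF);
}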
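Because the shifts work on the value and not on its memory layout, this version gives the same bytes on every platform. A small sketch of the split and its inverse, with the hypothetical names split_big_endian and join_big_endian, might look like this:

```cpp
#include <array>
#include <cstdint>

// Split a value into bytes, most significant byte first.
std::array<std::uint8_t, 8> split_big_endian(std::uint64_t number)
{
    std::array<std::uint8_t, 8> bytes{};
    for (int i = 0; i < 8; ++i) {
        bytes[i] = static_cast<std::uint8_t>((number >> (8 * (7 - i))) & 0xFF);
    }
    return bytes;
}

// Rebuild the value from bytes stored most significant byte first.
std::uint64_t join_big_endian(const std::array<std::uint8_t, 8>& bytes)
{
    std::uint64_t number = 0;
    for (std::uint8_t b : bytes) {
        number = (number << 8) | b;
    }
    return number;
}
```

For instance, splitting 0x0102030405060708 yields 0x01 in element 0 and 0x08 in element 7, and joining those bytes restores the original value.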

Or, although I'm told this breaks the rules, it works on my compiler:

union
{
    uint64_t a;
    uint8_t b[8];
};

a = 0x2342543254225423; // the original literal does not fit in 64 bits
// Can now read off the value of b
uint8_t copy[8];
for(int i = 0; i < 8; i++)
{
    copy[i]= b[i];
}

Packing and unpacking can be done with masks. One more thing to worry about is the byte order: packing and unpacking must use the same byte order. Beware: this is not a particularly efficient implementation, and it comes with no guarantees on small CPUs that are not natively 64-bit.

void unpack_uint64(uint64_t number, uint8_t *result) {

    result[0] = number & 0x00000000000000FF ; number = number >> 8 ;
    result[1] = number & 0x00000000000000FF ; number = number >> 8 ;
    result[2] = number & 0x00000000000000FF ; number = number >> 8 ;
    result[3] = number & 0x00000000000000FF ; number = number >> 8 ;
    result[4] = number & 0x00000000000000FF ; number = number >> 8 ;
    result[5] = number & 0x00000000000000FF ; number = number >> 8 ;
    result[6] = number & 0x00000000000000FF ; number = number >> 8 ;
    result[7] = number & 0x00000000000000FF ;

}



uint64_t  pack_uint64(uint8_t *buffer) {

    uint64_t value ;

    value = buffer[7] ;
    value = (value << 8 ) + buffer[6] ;
    value = (value << 8 ) + buffer[5] ;
    value = (value << 8 ) + buffer[4] ;
    value = (value << 8 ) + buffer[3] ;
    value = (value << 8 ) + buffer[2] ;
    value = (value << 8 ) + buffer[1] ;
    value = (value << 8 ) + buffer[0] ;

    return value ;

}

#include <cstdint>
#include <iostream>

struct ByteArray
{
    uint8_t b[8] = { 0, 0, 0, 0, 0, 0, 0, 0 };
};

ByteArray split(uint64_t x)
{
    ByteArray pack;
    const uint8_t MASK = 0xFF;
    // Least significant byte first; the loop must cover all 8 bytes.
    for (auto i = 0; i < 8; ++i)
    {
        pack.b[i] = x & MASK;
        x = x >> 8;
    }
    return pack;
}

int main()
{
    uint64_t val_64 = UINT64_MAX;
    auto pack = split(val_64);
    for (auto i = 0; i < 8; ++i)
    {
        // Cast so each byte prints as a number rather than a character.
        std::cout << (uint32_t)pack.b[i] << std::endl;
    }
    return 0;
}

That said, the union approach described by Straw1239 is better and cleaner. Do pay attention to how endianness differs between compilers and platforms.
