
How do I convert a value from host byte order to little endian?

I need to convert a short value from the host byte order to little endian. If the target were big endian, I could use the htons() function, but alas, it's not.

I guess I could do:

swap(htons(val))

But this could cause the bytes to be swapped twice. The result would still be correct, but the performance penalty is not acceptable in my case.

Here is an article from IBM about endianness and how to determine it:

Writing endian-independent code in C: Don't let endianness "byte" you

It includes an example of how to determine endianness at run time (which you would only need to do once):

#include <stdio.h>
#include <stdlib.h>

const int i = 1;
#define is_bigendian() ( (*(char*)&i) == 0 )

int main(void) {
    int val;
    unsigned char *ptr;
    ptr = (unsigned char*) &val;
    val = 0x12345678;
    /* print the bytes most significant first, whatever the host order */
    if (is_bigendian()) {
        printf("%X.%X.%X.%X\n", ptr[0], ptr[1], ptr[2], ptr[3]);
    } else {
        printf("%X.%X.%X.%X\n", ptr[3], ptr[2], ptr[1], ptr[0]);
    }
    exit(0);
}

The page also has a section on methods for reversing byte order:

short reverseShort (short s) {
    unsigned char c1, c2;

    if (is_bigendian()) {
        return s;
    } else {
        c1 = s & 255;
        c2 = (s >> 8) & 255;

        return (c1 << 8) + c2;
    }
}

Or, reading the two bytes through a pointer:

short reverseShort (char *c) {
    short s;
    char *p = (char *)&s;

    if (is_bigendian()) {
        p[0] = c[0];
        p[1] = c[1];
    } else {
        p[0] = c[1];
        p[1] = c[0];
    }

    return s;
}

Then you should know your endianness and call htons() conditionally. Actually, not even htons(): just swap the bytes conditionally. At compile time, of course.

Something like the following:

#include <arpa/inet.h>  /* htons(), used only by the fallback branch below */

unsigned short swaps( unsigned short val)
{
    return ((val & 0xff) << 8) | ((val & 0xff00) >> 8);
}

/* host to little endian */

/* example setting; normally this comes from the build configuration */
#define PLATFORM_IS_BIG_ENDIAN 1
#if PLATFORM_IS_LITTLE_ENDIAN
unsigned short htoles( unsigned short val)
{
    /* no-op on a little endian platform */
    return val;
}
#elif PLATFORM_IS_BIG_ENDIAN
unsigned short htoles( unsigned short val)
{
    /* need to swap bytes on a big endian platform */
    return swaps( val);
}
#else
unsigned short htoles( unsigned short val)
{
    /* the platform hasn't been properly configured for the */
    /* preprocessor to know if it's little or big endian    */

    /* use potentially less-performant, but always works option */

    return swaps( htons(val));
}
#endif

If you have a system that's properly configured (such that the preprocessor knows whether the target is little or big endian), you get an 'optimized' version of htoles(). Otherwise you get the potentially non-optimized version that depends on htons(). In any case, you get something that works.

Nothing too tricky, and more or less portable.

Of course, you can further improve the optimization possibilities by implementing this as inline functions or as macros, as you see fit.
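As a rough sketch of the inline variant, assuming the same PLATFORM_IS_LITTLE_ENDIAN / PLATFORM_IS_BIG_ENDIAN macros come from the build configuration. The names swap16 and htoles_inline are made up for this example. Note that the fallback branch here lays the bytes out explicitly instead of going through htons(), which is a swapped-in technique and not what the code above does:

```c
#include <stdint.h>
#include <string.h>

/* Unconditional 16-bit byte swap. */
static inline uint16_t swap16(uint16_t val)
{
    return (uint16_t)(((val & 0xffu) << 8) | ((val & 0xff00u) >> 8));
}

/* Host to little endian; the name htoles_inline is hypothetical. */
static inline uint16_t htoles_inline(uint16_t val)
{
#if defined(PLATFORM_IS_LITTLE_ENDIAN) && PLATFORM_IS_LITTLE_ENDIAN
    return val;                 /* already little endian */
#elif defined(PLATFORM_IS_BIG_ENDIAN) && PLATFORM_IS_BIG_ENDIAN
    return swap16(val);         /* swap on a big-endian host */
#else
    /* Byte order not configured: store the low byte first, then the
       high byte. This gives the little-endian layout on any host. */
    unsigned char bytes[2];
    uint16_t out;
    bytes[0] = (unsigned char)(val & 0xffu);
    bytes[1] = (unsigned char)(val >> 8);
    memcpy(&out, bytes, sizeof out);
    return out;
#endif
}
```

Whether the compiler actually reduces the fallback to a no-op or a single swap instruction is something to check in the generated assembly, not something to assume.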

You might want to look at something like the "Portable Open Source Harness (POSH)" for an actual implementation that defines the endianness for various compilers. Note that getting to the library requires going through a pseudo-authentication page (though you don't need to register or give any personal details): http://hookatooka.com/poshlib/

This trick should work: at startup, call ntohs() with a dummy value whose two bytes differ, then compare the result to the original value. If both values are the same, the machine is big endian; otherwise it is little endian.

Then use a ToLittleEndian method that either does nothing (little-endian host) or swaps the two bytes (big-endian host), depending on the result of the initial test. Note that ntohs() itself is a no-op on a big-endian host, so the swap there has to be done explicitly.

(Edited with the information provided in comments.)
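A minimal sketch of that startup test, assuming a POSIX system where ntohs() is declared in <arpa/inet.h>. The names detect_byte_order and to_little_endian are hypothetical. Since ntohs() returns its argument unchanged on a big-endian host, this sketch does the swap by hand in that case:

```c
#include <arpa/inet.h>   /* ntohs(), POSIX */
#include <stdint.h>

/* -1 = not tested yet, 0 = little-endian host, 1 = big-endian host */
static int host_is_big_endian = -1;

/* Meant to be run once at startup. */
void detect_byte_order(void)
{
    const uint16_t probe = 0x0102;   /* the two bytes must differ */
    host_is_big_endian = (ntohs(probe) == probe);
}

uint16_t to_little_endian(uint16_t val)
{
    if (host_is_big_endian < 0)
        detect_byte_order();         /* lazy fallback if the startup call was skipped */
    if (host_is_big_endian)
        return (uint16_t)((val << 8) | (val >> 8));
    return val;                      /* little-endian host: nothing to do */
}
```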

My rule-of-thumb performance guess is that it depends on whether you are little-endian-ising a big block of data in one go, or just one value:

If just one value, then the function call overhead is probably going to swamp the overhead of unnecessary byte swaps, and that's even if the compiler doesn't optimise away the unnecessary byte swaps. Then you're maybe going to write the value as the port number of a socket connection, and try to open or bind a socket, which takes an age compared with any sort of bit manipulation. So just don't worry about it.

If a large block, then you might worry the compiler won't handle it. So do something like this:

if (!is_little_endian()) {
    for (int i = 0; i < size; ++i) {
        vals[i] = swap_short(vals[i]);
    }
}

Or look into SIMD instructions on your architecture, which can do it considerably faster.

Write is_little_endian() using whatever trick you like. I think the one Robert S. Barnes provides is sound, but since you usually know for a given target whether it's going to be big- or little-endian, maybe you should have a platform-specific header file that defines it as a macro evaluating to either 1 or 0.
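To make the loop above concrete, here is one way the two helpers could be filled in. This is only a sketch: is_little_endian() is a run-time check on the first byte of a known value, and block_to_little_endian is a hypothetical wrapper name. In a real project the check could equally be a macro from a platform header, as described:

```c
#include <stddef.h>
#include <stdint.h>

/* Run-time check: look at the first byte of a known value. */
static int is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Unconditional 16-bit byte swap. */
static uint16_t swap_short(uint16_t val)
{
    return (uint16_t)((val << 8) | (val >> 8));
}

/* Convert a whole array in place; the endianness test runs once per block. */
void block_to_little_endian(uint16_t *vals, size_t size)
{
    if (!is_little_endian()) {
        for (size_t i = 0; i < size; ++i) {
            vals[i] = swap_short(vals[i]);
        }
    }
}
```

Keeping the test outside the loop means a little-endian host pays for one comparison per block and touches no data at all.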

As always, if you really care about performance, then look at the generated assembly to see whether pointless code has been removed or not, and time the various alternatives against each other to see what actually goes fastest.

Unfortunately, there's not really a cross-platform way to determine a system's byte order at compile time with standard C. I suggest adding a #define to your config.h (or whatever else you or your build system uses for build configuration).
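For illustration, a config.h fragment might look like the sketch below. The MYPROJ_* macro names and the byte_order_config_ok() helper are hypothetical. The __BYTE_ORDER__ family of macros is predefined by GCC and Clang but is not part of standard C, so on other compilers the setting would have to be supplied by hand or by the build system:

```c
/* config.h (sketch); the MYPROJ_* names are hypothetical */
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#  define MYPROJ_LITTLE_ENDIAN 1
#elif defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#  define MYPROJ_BIG_ENDIAN 1
#endif
/* Otherwise both stay undefined and one must be set manually. */

/* Returns 1 if the compile-time setting matches the running host,
   0 if it does not, and -1 if nothing was configured. */
int byte_order_config_ok(void)
{
    const unsigned short probe = 1;
    const int host_little = (*(const unsigned char *)&probe == 1);
#if defined(MYPROJ_LITTLE_ENDIAN)
    return host_little;
#elif defined(MYPROJ_BIG_ENDIAN)
    return !host_little;
#else
    (void)host_little;
    return -1;
#endif
}
```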

A unit test to check for the correct definition of LITTLE_ENDIAN or BIG_ENDIAN could look like this:

#include <assert.h>
#include <limits.h>
#include <stdint.h>

void check_bits_per_byte(void)
{ assert(CHAR_BIT == 8); }

void check_sizeof_uint32(void)
{ assert(sizeof (uint32_t) == 4); }

void check_byte_order(void)
{
    static const union { unsigned char bytes[4]; uint32_t value; } byte_order =
        { { 1, 2, 3, 4 } };

    static const uint32_t little_endian = 0x04030201ul;
    static const uint32_t big_endian = 0x01020304ul;

    #ifdef LITTLE_ENDIAN
    assert(byte_order.value == little_endian);
    #endif

    #ifdef BIG_ENDIAN
    assert(byte_order.value == big_endian);
    #endif

    #if !defined LITTLE_ENDIAN && !defined BIG_ENDIAN
    assert(!"byte order unknown or unsupported");
    #endif
}

int main(void)
{
    check_bits_per_byte();
    check_sizeof_uint32();
    check_byte_order();
}

On many Linux systems, there is an <endian.h> or <sys/endian.h> with conversion functions. See the man page for ENDIAN(3).
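A short sketch of what that looks like with glibc on Linux, where <endian.h> provides htole16() and le16toh(). The wrapper names host_to_le16 and le16_to_host are hypothetical, and the header name and feature-test macro are assumptions that differ on other systems (the BSDs use <sys/endian.h>):

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE      /* asks glibc to expose htole16() and friends */
#endif
#include <endian.h>          /* <sys/endian.h> on the BSDs */
#include <stdint.h>

/* Host order to little endian: a no-op on little-endian hosts. */
uint16_t host_to_le16(uint16_t val)
{
    return htole16(val);
}

/* Little endian back to host order. */
uint16_t le16_to_host(uint16_t val)
{
    return le16toh(val);
}
```

Where these functions are available they are probably the simplest answer to the original question, since the library already knows the host byte order at compile time.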
