What is the standard for handling endianness with Interop?

Question

I created a very simple DLL for Speck in (granted, probably inefficient) ASM. I connected to it in C# using InteropServices.

When I tested this crypto with the test vectors provided in the paper describing the algorithm, I found that the only way to get them to come out right was to "flip" the key and the plaintext, and then to "flip" the ciphertext at the end for a match. So it is an endianness issue, I guess. I have seen the same, for example, between a reference implementation of Serpent and TrueCrypt's version: they produce the same result only with the bytes in reverse order.

I will post my assembly code and my C# code for reference, though it may not be critical to see the code in order to understand my question. The C# code contains a click event handler that checks the DLL for consistency with the test vectors. As you can also see there, the program has to do a lot of array flipping in that handler to get the match.

So the question I have been working towards is this. Should I "flip" those arrays inside the DLL to account for endianness? Or should I leave it to the caller (also me, but on the C# side)? Or am I making mountains out of molehills, and should I just ignore endianness at this point? I am not planning to sell the silly thing, so there is no worry about compatibility issues, but I am a stickler for doing things right, so I am hoping you all can guide me on the best practice here if there is one.

ASM:

.code ; the beginning of the code
 ; section
WinMainCRTStartup proc h:DWORD, r:DWORD, u:DWORD ; the dll entry point
 mov rax, 1 ; if eax is 0, the dll won't
 ; start
 ret ; return
WinMainCRTStartup Endp ; end of the dll entry

_DllMainCRTStartup proc h:DWORD, r:DWORD, u:DWORD ; the dll entry point
 mov rax, 1 ; if eax is 0, the dll won't
 ; start
 ret ; return
_DllMainCRTStartup Endp                                 

SpeckEncrypt proc plainText:QWORD, cipherText:QWORD, Key:QWORD
; Pass in 3 addresses pointing to the base of the plainText, cipherText, and Key arrays
; These come in as RCX, RDX, and R8, respectively
; I will use these, RAX, and R9 through R15 for my working space.  Will do 128 bit block, 128 bit key sizes, but they will fit nicely in 64 bit registers

; simple prologue, pushing rbp and rbx and the R# registers, and moving the value of rsp into rbp for the duration of the proc
push rbp
mov rbp,rsp
push rbx
push R9
push R10
push R11
push R12
push R13
push R14
push R15

; Move data into the registers for processing
mov r9,[rcx] ; rcx holds the memory location of the first 64 bits of plainText.  Move this into R9.  This is plainText[0] 
mov r10,[rcx+8] ; put next 64 bits into R10.  This is plainText[1]
;NOTE that the address of the cipherText is in RDX but we will fill r11 and r12 with values pointed at by RCX.  This is per the algorithm.  We will use RDX to output the final bytes
mov r11,[rcx] ; cipherText[0] = plainText[0]
mov r12,[rcx+8] ; cipherText[1] = plainText[1] 
mov r13, [r8] ;First 64 bits of key.  This is Key[0]
mov r14, [r8+8] ; Next 64 bits of key.  This is Key[1]

push rcx ; I could get away without this and loop in another register, but I want to count my loop in rcx so I free it up for that
mov rcx, 0 ; going to count up from here to 32.  Would count down but the algorithm uses the counter value in one permutation, so going to count up

EncryptRoundFunction:
ror r12,8
add r12,r11
xor r12,r13
rol r11,3
xor r11,r12

ror r14,8
add r14,r13
xor r14,rcx
rol r13,3
xor r13,r14

inc rcx
cmp rcx, 32
jne EncryptRoundFunction

pop rcx
; Move cipherText into memory pointed at by RDX.  We won't bother copying the Key or plainText back out
mov [rdx],r11
mov [rdx+8],r12

; Now the epilogue, returning values from the stack into non-volatile registers.
pop R15
pop R14
pop R13
pop R12
pop R11
pop R10
pop R9    
pop rbx    
pop rbp
ret ; return to the caller (no return value is used)
SpeckEncrypt endp ; end of the function

SpeckDecrypt proc cipherText:QWORD, plainText:QWORD, Key:QWORD
; Pass in 3 addresses pointing to the base of the cipherText, plainText, and Key arrays
; These come in as RCX, RDX, and R8, respectively
; I will use these, RAX, and R9 through R15 for my working space.  Will do 128 bit block, 128 bit key sizes, but they will fit nicely in 64 bit registers

; simple prologue, pushing rbp and rbx and the R# registers, and moving the value of rsp into rbp for the duration of the proc
push rbp
mov rbp,rsp
push rbx
push R9
push R10
push R11
push R12
push R13
push R14
push R15

; Move data into the registers for processing
mov r9,[rcx] ; rcx holds the memory location of the first 64 bits of cipherText.  Move this into R9.  This is cipherText[0] 
mov r10,[rcx+8] ; put next 64 bits into R10.  This is cipherText[1]
;NOTE that the address of the plainText is in RDX but we will fill r11 and r12 with values pointed at by RCX.  This is per the algorithm.  We will use RDX to output the final bytes
mov r11,[rcx] ; plainText[0] = cipherText[0]
mov r12,[rcx+8] ; plainText[1] = cipherText[1] 
mov r13, [r8] ;First 64 bits of key.  This is Key[0]
mov r14, [r8+8] ; Next 64 bits of key.  This is Key[1]

push rcx ; I could get away without this and loop in another register, but I want to count my loop in rcx so I free it up for that
mov rcx, 0 ; We will count up while making the round keys

DecryptMakeRoundKeys:
; On encrypt we could make each key just as we needed it.  But here we need the keys in reverse order.  To undo round 31 of encryption, for example, we need round key 31.

; So we will make them all and push them on the stack, pop them off again as we need them in the main DecryptRoundFunction
; I should pull this off and call it for encrypt and decrypt to save space, but for now will have it separate

; push r13 at the beginning of the process because we need a "raw" key by the time we reach decrypt round 0
; We will not push r14 because that half of the key is only used here in the round key generation function.
; We don't need it in the decrypt rounds
push r13

ror r14,8
add r14,r13
xor r14,rcx
rol r13,3
xor r13,r14

inc rcx
cmp rcx, 32
jne DecryptMakeRoundKeys

mov rcx, 32
DecryptRoundFunction:
dec rcx
pop r13

xor r11,r12
ror r11,3
xor r12,r13
sub r12,r11
rol r12,8

cmp rcx, 0
jne DecryptRoundFunction


pop rcx
; Move plainText into memory pointed at by RDX.  We won't bother copying the Key or cipherText back out
mov [rdx],r11
mov [rdx+8],r12

; Now the epilogue, returning values from the stack into non-volatile registers.
pop R15
pop R14
pop R13
pop R12
pop R11
pop R10
pop R9    
pop rbx    
pop rbp
ret ; return to the caller (no return value is used)
SpeckDecrypt endp ; end of the function

End ; end of the dll

And the C#:

using System;
using System.Runtime.InteropServices;
using System.Text;
using System.Threading;
using System.Windows.Forms;

namespace SpeckDLLTest
{
    public partial class Form1 : Form
    {
        byte[] key = { 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a, 0x09, 0x08, 0x07, 0x06, 0x05, 0x04, 0x03, 0x02, 0x01, 0x00 };
        public Form1()
        {
            InitializeComponent();
            Array.Reverse(key);
        }

        private void richTextBox1_TextChanged(object sender, EventArgs e)
        {
            textBox1.Text = richTextBox1.Text.Length.ToString();
            if (richTextBox1.Text != "")
            {
                byte[] plainText = ASCIIEncoding.ASCII.GetBytes(richTextBox1.Text);
                byte[] cipherText = new byte[plainText.Length];

                Thread t = new Thread(() =>
                {
                    cipherText = Encrypt(plainText);
                    BeginInvoke(new Action(() => richTextBox2.Text = Convert.ToBase64String(cipherText)));
                });
                t.Start();
                t.Join(); // the thread has finished once Join returns, so no Abort is needed


                byte[] plainAgain = new byte[cipherText.Length];
                t = new Thread(() =>
                    {
                        plainAgain = Decrypt(cipherText);
                        BeginInvoke(new Action(() => richTextBox3.Text = ASCIIEncoding.ASCII.GetString(plainAgain)));
                    });
                t.Start();
                t.Join(); // the thread has finished once Join returns, so no Abort is needed
            }
            else
            {
                richTextBox2.Text = "";
                richTextBox3.Text = "";
            }
        }

        private byte[] Decrypt(byte[] cipherText)
        {
            int blockCount = cipherText.Length / 16;
            if (cipherText.Length % 16 != 0) blockCount++;
            Array.Resize(ref cipherText, blockCount * 16);
            byte[] plainText = new byte[cipherText.Length];
            unsafe
            {
                fixed (byte* plaintextPointer = plainText, ciphertextPointer = cipherText, keyPointer = key)
                {
                    for (int i = 0; i < blockCount; i++)
                    {
                        for (int j = 0; j < 1; j++)
                        {
                            UnsafeMethods.SpeckDecrypt(ciphertextPointer + i * 16, plaintextPointer + i * 16, keyPointer);
                        }
                    }
                }
            }
            return plainText;
        }

        private byte[] Encrypt(byte[] plainText)
        {
            int blockCount = plainText.Length / 16;
            if (plainText.Length % 16 != 0) blockCount++;
            Array.Resize(ref plainText, blockCount * 16);
            byte[] cipherText = new byte[plainText.Length];
            unsafe
            {
                fixed (byte* plaintextPointer = plainText, ciphertextPointer = cipherText, keyPointer = key)
                {
                    for (int i = 0; i < blockCount; i++)
                    {
                        for (int j = 0; j < 1; j++)
                        {
                            UnsafeMethods.SpeckEncrypt(plaintextPointer + i * 16, ciphertextPointer + i * 16, keyPointer);
                        }
                    }
                }
            }
            return cipherText;
        }

        private void button1_Click(object sender, EventArgs e)
        {
            byte[] plainText = { 0x6c, 0x61, 0x76, 0x69, 0x75, 0x71, 0x65, 0x20, 0x74, 0x69, 0x20, 0x65, 0x64, 0x61, 0x6d, 0x20 };
            byte[] key = { 0x0f, 0x0e, 0x0d, 0x0c, 0x0b, 0x0a, 0x09, 0x08, 0x07, 0x06, 0x05, 0x04, 0x03, 0x02, 0x01, 0x00 };
            byte[] testVector = { 0xa6, 0x5d, 0x98, 0x51, 0x79, 0x78, 0x32, 0x65, 0x78, 0x60, 0xfe, 0xdf, 0x5c, 0x57, 0x0d, 0x18 };

            Array.Reverse(key);
            Array.Reverse(plainText);

            byte[] cipherText = new byte[16];
            unsafe
            {
                fixed (byte* plaintextPointer = plainText, ciphertextPointer = cipherText, keyPointer = key)
                {
                    UnsafeMethods.SpeckEncrypt(plaintextPointer, ciphertextPointer, keyPointer);
                    Array.Reverse(cipherText);
                    bool testBool = true;
                    for (int i = 0; i < cipherText.Length; i++)
                    {
                        if (testVector[i] != cipherText[i]) testBool = false;
                    }
                    if (testBool == false) MessageBox.Show("Failed!");
                    else MessageBox.Show("Passed!");
                }
            }
        }
    }

    public static class UnsafeMethods
    {
        [DllImport("Speck.dll")]
        unsafe public extern static void SpeckEncrypt(byte* plainText, byte* cipherText, byte* Key);
        [DllImport("Speck.dll")]
        unsafe public extern static void SpeckDecrypt(byte* cipherText, byte* plainText, byte* Key);
    }
}

Answer 1

Like it or not, the de facto standard for byte order in networking and cryptography is big-endian (most significant byte first, the "natural" order). This applies not only to serialization of data for inter-system exchange, but also to intra-system APIs and to any other case where the caller is not supposed to be aware of the callee's internals. This convention has nothing to do with the endianness of any particular hardware or with how popular that hardware is. It just sets the default format for exchanged data, so that both lower-level and higher-level programs can pass data around regardless of how much they know about what the data contains and how it is processed.
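As a minimal sketch of what "big-endian at the boundary" could look like in C, the helpers below read and write a 64-bit word most significant byte first. They use only shifts, so they should behave the same on little-endian and big-endian hosts. The names load_be64 and store_be64 are hypothetical and not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read 8 bytes, most significant byte first. */
static uint64_t load_be64(const unsigned char *p)
{
    assert(p != NULL);
    uint64_t v = 0;
    for (size_t i = 0; i < 8; i++)
        v = (v << 8) | p[i];          /* earlier bytes end up in higher positions */
    return v;
}

/* Hypothetical helper: write a 64-bit word as 8 bytes, most significant byte first. */
static void store_be64(unsigned char *p, uint64_t v)
{
    assert(p != NULL);
    for (size_t i = 0; i < 8; i++)
        p[i] = (unsigned char)(v >> (56 - 8 * i));
}
```

A callee that exposes a byte-array API could call load_be64 on entry and store_be64 on exit, and keep whatever internal word layout it likes in between.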

However, if the caller is tightly coupled with the callee, it may be more convenient and faster to pass the data in a more preprocessed form, especially if some of that data remains constant across invocations. For example, in asymmetric cryptography it may be easier and faster to call the core functions with all data already converted to big integers, and for those we may prefer little-endian digit order (a "digit" or "limb" is usually half the size of the largest available register), even on hardware with big-endian byte order, simply because that digit order is more useful for an arbitrary-precision math library. But those details should not be visible to the outside world: for anyone else, we accept and return a big-endian byte stream.
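The limb layout described above can be sketched in C as follows. This is only an illustration under stated assumptions: 32-bit limbs (half of a 64-bit register), an input length that is a multiple of 4, and a hypothetical function name be_bytes_to_limbs. The external input stays a big-endian byte stream, while the internal array stores the least significant limb first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert a big-endian byte string into 32-bit limbs,
   least significant limb first. len must be a multiple of 4 and limbs must
   have room for len / 4 entries. */
static void be_bytes_to_limbs(const unsigned char *in, size_t len, uint32_t *limbs)
{
    assert(in != NULL && limbs != NULL);
    assert(len % 4 == 0);
    size_t n = len / 4;
    for (size_t i = 0; i < n; i++) {
        /* limb i comes from the i-th group of 4 bytes counted from the end */
        const unsigned char *p = in + len - 4 * (i + 1);
        limbs[i] = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
                 | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    }
}
```

Note that the bytes inside each limb are still interpreted most significant first; only the order of the limbs is reversed relative to the byte stream.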

Regarding your specific task:

As @RossRidge already pointed out, you are probably doing it wrong if you are simply flipping entire arrays: you should swap the bytes (BSWAP) within each piece being processed, rather than also inverting the order of those pieces.
Chances are high that you are greatly overestimating your ability to write efficient machine code: for example, you don't interleave instructions that use unrelated registers for better out-of-order execution, your loop is not aligned, and your counter increases to N instead of decreasing to zero. Of course, that code will probably still be around 10x faster than .NET anyway, but I strongly recommend that you write an implementation in C and benchmark it; you may be amazed at how good a compiler (MSVC, GCC) can be at optimizing even a straightforwardly written program (believe me, I once made the same mistake when trying to accomplish the same task). If performance is not a big issue, do not mess with unmanaged code at all, because it is just an external, non-portable dependency that raises the required trust level of your .NET application.
Use .NET functions that deal with bytes with caution, because they are very inconsistent with regard to endianness: BitConverter uses the host byte order, BinaryReader always sticks to little-endian, and String depends entirely on the encoding given (of all UTF encodings, only UTF-8 is endian-agnostic).
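The first two points can be sketched together as a small C version of Speck128/128 that could serve as a benchmark baseline. It mirrors the round and key-schedule operations in the question's assembly. The byte-level wrapper is an assumption, not an established convention: it reads each 64-bit word most significant byte first, in the order the test vectors are printed, so no whole-array reversal is needed. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint64_t ror64(uint64_t v, unsigned n) { return (v >> n) | (v << (64 - n)); }
static uint64_t rol64(uint64_t v, unsigned n) { return (v << n) | (v >> (64 - n)); }

/* Read 8 bytes, most significant byte first. */
static uint64_t load_be64(const unsigned char *p)
{
    uint64_t v = 0;
    for (size_t i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Write a 64-bit word as 8 bytes, most significant byte first. */
static void store_be64(unsigned char *p, uint64_t v)
{
    for (size_t i = 0; i < 8; i++)
        p[i] = (unsigned char)(v >> (56 - 8 * i));
}

/* Word-level Speck128/128: 32 rounds, round keys generated on the fly.
   (x, y) is the block and (l, k) is the key, as two 64-bit words each. */
static void speck128_128_encrypt_words(uint64_t *x, uint64_t *y, uint64_t l, uint64_t k)
{
    for (uint64_t i = 0; i < 32; i++) {
        *x = (ror64(*x, 8) + *y) ^ k;   /* round function with the current round key */
        *y = rol64(*y, 3) ^ *x;
        l = (ror64(l, 8) + k) ^ i;      /* key schedule uses the round counter */
        k = rol64(k, 3) ^ l;
    }
}

/* Byte-level wrapper (assumed layout: words in printed order, each big-endian). */
static void speck128_128_encrypt_bytes(const unsigned char pt[16],
                                       unsigned char ct[16],
                                       const unsigned char key[16])
{
    assert(pt != NULL && ct != NULL && key != NULL);
    uint64_t x = load_be64(pt), y = load_be64(pt + 8);
    uint64_t l = load_be64(key), k = load_be64(key + 8);
    speck128_128_encrypt_words(&x, &y, l, k);
    store_be64(ct, x);
    store_be64(ct + 8, y);
}
```

With this layout, the plainText, key, and testVector arrays from the question's button1_Click handler can be passed as written, without any Array.Reverse calls. Whether that layout or the fully reversed one is the right external format is a separate decision that the caller and callee have to agree on.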

Those are the issues I noticed at first glance. There may be more.

Solution 1
3, accepted, 2015-08-19 18:52:36
