
Adding zero padding to an array

I am implementing GHASH as part of an AES-GCM implementation.

GHASH

and I need to implement this:

[Image: equation]

where v is the bit length of the final block of A, u is the bit length of the final block of C, and || denotes concatenation of bit strings.
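For readers who cannot see the image: the sentence above matches the definition of GHASH in the GCM specification. The block below is reproduced from that specification rather than from the original post, so treat it as an assumption about what the image showed. Here A is split into m blocks and C into n blocks, and the starred blocks are the (possibly short) final ones.

```latex
X_i =
\begin{cases}
0 & \text{for } i = 0 \\
(X_{i-1} \oplus A_i) \cdot H & \text{for } i = 1, \ldots, m-1 \\
(X_{m-1} \oplus (A^*_m \,\|\, 0^{128-v})) \cdot H & \text{for } i = m \\
(X_{i-1} \oplus C_{i-m}) \cdot H & \text{for } i = m+1, \ldots, m+n-1 \\
(X_{m+n-1} \oplus (C^*_n \,\|\, 0^{128-u})) \cdot H & \text{for } i = m+n \\
(X_{m+n} \oplus (\operatorname{len}(A) \,\|\, \operatorname{len}(C))) \cdot H & \text{for } i = m+n+1
\end{cases}
```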

How can I concatenate the final block of A with zero padding from v up to 128 bits, given that I do not know the length of the whole of A? At the moment I just take the A block and XOR it with a 128-bit array of zeros.

void GHASH(uint8_t H[16], uint8_t len_A, uint8_t A_i[len_A], uint8_t len_C,
    uint8_t C_i[len_C], uint8_t X_i[16]) {

uint8_t m;
uint8_t n;
uint8_t i;
uint8_t j;
uint8_t zeros[16] = {0};

 if (i == m + n) {
        for(j=16; j>=0; j--){
        C_i[j] = C_i[j] ^ zeros[j]; //XOR with zero array to fill in 0 of length 128-u
        tmp[j] = X_i[j] ^ C_i[j]; // X[m+n+1] XOR C[i] left shift by (128bit-u) and store into tmp
        gmul(tmp, H, X_i); //Do Multiplication of tmp to H and store into X
        }
    }

I am pretty sure that this is not correct, but I have no idea how to do it.

If you don't care about every last bit of efficiency (I assume this is an experiment, and not for real use?), just reallocate and pad. In practice, you could round up and calloc when you first declare these buffers:

#include <stdint.h>
#include <stdlib.h>

size_t round16(size_t n) {
    // if n isn't a multiple of 16, round up to next multiple
    if (n % 16) return 16 * (1 + n / 16);
    return n;
}

size_t realloc16(uint8_t **data, size_t len) {
    // if len isn't a multiple of 16, extend with 0s to next multiple
    size_t n = round16(len);
    *data = realloc(*data, n);
    for (size_t i = len; i < n; ++i) (*data)[i] = 0;
    return n;
}

void xor16(uint8_t *result, uint8_t *a, uint8_t *b) {
    // 16 byte xor
    for (size_t i = 0; i < 16; ++i) result[i] = a[i] ^ b[i];
}

void xorandmult(uint8_t *x, uint8_t *data, size_t n, uint8_t *h) {
    // run along the length of the (extended) data, xoring and multiplying
    uint8_t tmp[16];
    for (size_t i = 0; i < n / 16; ++i) {
        xor16(tmp, x, data+i*16);
        multgcm(x, h, tmp);
    }
}

void ghash(uint8_t *x, uint8_t **a, size_t len_a, uint8_t **c, size_t len_c, uint8_t *h) {
    size_t m = realloc16(a, len_a);
    xorandmult(x, *a, m, h);
    size_t n = realloc16(c, len_c);
    xorandmult(x, *c, n, h);

    // then handle lengths
}

uint8_t x[16] = {0};
ghash(x, &a, len_a, &c, len_c, h);

Disclaimer: I'm no expert, I just skimmed the spec. The code is uncompiled, unchecked, and not intended for "real" use. Also, the spec supports arbitrary (bit) lengths, but I assume you're working in bytes.

Also, I am still not sure I am answering the right question.
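The "round up and calloc when you first declare these" idea mentioned earlier could look roughly like this. It is only a sketch under the same byte-oriented assumption; round_up16 and alloc_padded16 are names made up for illustration, not part of any library.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Round n up to the next multiple of 16 (n itself if it already is one).
 * Overflow for n close to SIZE_MAX is ignored in this sketch. */
static size_t round_up16(size_t n) {
    return (n + 15) / 16 * 16;
}

/* Hypothetical helper: copy len bytes of src into a fresh, zero-filled
 * buffer whose size is a multiple of 16. Returns NULL on allocation
 * failure; the caller owns the buffer and must free() it. */
static uint8_t *alloc_padded16(const uint8_t *src, size_t len, size_t *padded_len) {
    assert(padded_len != NULL);
    size_t n = round_up16(len);
    /* calloc zero-fills, so everything after the copied bytes is already
     * the padding. Ask for at least one byte so NULL always means failure. */
    uint8_t *buf = calloc(n ? n : 1, 1);
    if (buf == NULL) return NULL;
    if (len > 0) memcpy(buf, src, len);
    *padded_len = n;
    return buf;
}
```

Because the original data is copied rather than reallocated, this also works when A or C lives on the stack or in static storage, at the cost of one extra copy.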

It seems to me that you've got several issues here, and conflating them is a big part of the problem. It'll be much easier when you separate them.

  • First: declaring parameters as uint8_t len_A, uint8_t A_i[len_A] won't give you what you want. (C99 accepts the syntax, but the size in the brackets is not enforced.) You're actually getting uint8_t len_A, uint8_t * A_i, and the length of A_i is determined by how it was declared on the level above, not how you tried to pass it in. (Note that uint8_t * A and uint8_t A[] are functionally identical here; the difference is mostly syntactic sugar for the programmer.)

  • On the level above, since I don't know whether it was allocated with malloc() or declared on the stack, I'm not going to get fancy with memory management issues. I'm going to use local storage for my suggestion.

  • Unit clarity: You've got a bad case of unit confusion going on here: bit vs. byte vs. block length. Without knowing the core algorithm, it appears to me that the undeclared m & n are the block lengths of A & C; i.e., A is m blocks long, and C is n blocks long, and in both cases the last block is not required to be full length. You're passing in len_A & len_C without telling us (or using them in code so we can see) whether they're the bit lengths u/v, the byte lengths of A_i/C_i, or the total lengths of A/C, in bits or bytes or blocks. Based on the (misleading) declaration, I'm assuming they're the lengths of A_i/C_i in bytes, but it's not obvious... nor is it the obvious thing to pass. By the name, I would have guessed the length of A/C in bits. Hint: if your units are in the names, it becomes obvious when you try to add bitLenA to byteLenB.

  • Iteration control: You appear to be passing in 16-byte blocks for the i'th iteration, but not passing in i. Either pass in i, or pass in the full A & C instead of A_i & C_i. You're also using m & n without setting them or passing them in; the same issue applies. I'll just pretend they're all correct at the moment of use and let you fix that.

  • Finally, I don't understand the summation notation for the i=m+n+1 case, in particular how len(A) & len(C) are treated, but you're not asking about that case so I'll ignore it.

Given all that, let's look at your function:

void GHASH(uint8_t H[], uint8_t len_A, uint8_t A_i[], uint8_t len_C, uint8_t C_i[], uint8_t X_i[]) {

    // NOTE: i, m and n are still neither declared nor passed in (see "Iteration control" above).
    // len_A / len_C are assumed to be the byte lengths (<= 16) of the final blocks.
    uint8_t tmpAC[16] = {0};
    uint8_t tmp[16];
    uint8_t * pAC = tmpAC;
    int j;

    if (i == 0) {         // Initialization case
        for (j=0; j<16; ++j) {   // X is always a full 16-byte block
            X_i[j] = 0;
        }
        return;
    } else if (i < m) {   // Use the input memory for A
        pAC = A_i;
    } else if (i == m) {     // Use temp memory init'ed to 0; copy in A as far as it goes
        for (j=0; j<len_A; ++j) {
            pAC[j] = A_i[j];
        }
    } else if (i < m+n) {    // Use the input memory for C
        pAC = C_i;
    } else if (i == m+n) {   // Use temp memory init'ed to 0; copy in C as far as it goes
        for (j=0; j<len_C; ++j) {
            pAC[j] = C_i[j];
        }
    } else if (i == m+n+1) { // Do something unclear to me. Maybe this?
       // Use temp memory init'ed to 0; copy in len(A) & len(C)
       pAC[0] = len_A;  // in blocks?  bits?  bytes?
       pAC[1] = len_C;  // in blocks?  bits?  bytes?
    }

    for (j=0; j<16; ++j) {
        tmp[j] = X_i[j] ^ pAC[j]; // X[i-1] XOR (padded) A or C block, stored into tmp
    }
    gmul(tmp, H, X_i); // Multiply tmp by H once per block and store into X
}

We only copy memory in the last block of A or C, and use local memory for the copy. Most blocks are handled with a single pointer copy to point to the correct part of the input memory.
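The "point at the input unless it is the short last block" idea can also be pulled out into a small helper of its own. This is a sketch, assuming lengths are counted in bytes; block_ptr and its scratch parameter are hypothetical names introduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return a pointer to 16 readable bytes for the next
 * block. A full block is used in place; a short final block is copied into
 * the caller's scratch buffer and zero-padded there, so the input is never
 * modified or read past its end. */
static const uint8_t *block_ptr(const uint8_t *data, size_t remaining,
                                uint8_t scratch[16]) {
    assert(remaining > 0);
    if (remaining >= 16) {
        return data;                    /* full block: no copy needed */
    }
    memset(scratch, 0, 16);             /* zeros supply the padding */
    memcpy(scratch, data, remaining);   /* copy in as far as the data goes */
    return scratch;
}
```

A caller would XOR the 16 bytes it gets back into X and then call gmul once for that block, advancing through A and then C 16 bytes at a time.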


 