Meaning of "Length_Low" and "Length_High" in SHA256 (RFC 4634)?

I was studying SHA2 today and got to a place in the code which I do not understand.

RFC 4634 (where SHA2 is defined) defines two variables, Length_Low and Length_High, in the sha.h file:

[image: sha.h]

In the sha224-256.c file, the variable Length_Low is actively changed in two places. Here:

[image: Low_1]

and here:

[image: Low_2]

I understood RFC 4634 and the principles of SHA2. However, being a beginner in C, I can understand only 99% of the code from the paper. The pictures I attached belong to the remaining 1%.

Could you please explain to me what purpose the Length_Low and Length_High variables serve in the implementation of SHA2? What is their meaning at the code level?

Secondly, what happens in the latter two images? I can identify compound operators and shifts, but I am overwhelmed by the difficulty of the code, especially in the second picture, where dereferencing, increment, definition, etc. happen in the same line of code.

Meta: Stack Exchange, and especially Stack Overflow, policy is not to post images of 'code' (defined expansively to include things like config files and log or error messages), because they are very hard to read on mobile devices, impossible to read for visually impaired people, not cut-and-pastable, and not searchable. Plus yours, reformatted and colored I presume by your IDE, are to my taste exceptionally ugly. Fortunately all RFCs are published in text form, which I could easily substitute.

Also, an aside: the input block size for SHA-256 and SHA-224 (and SHA-1 and MD5 before them) is 64 octets or 512 bits, not 64 bits, and in any case has nothing to do with the length field. The length field is indeed 64 bits, and it is implemented in the code as two 32-bit variables for the high and low halves.
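To make the "two halves" idea concrete, here is a minimal sketch of how the two 32-bit variables together represent one 64-bit bit count. The helper name length_in_bits is made up for illustration and does not appear in RFC 4634, which never needs to join the halves.

```c
#include <stdint.h>

/* Hypothetical helper (not in RFC 4634): combine the two 32-bit halves
 * into the single 64-bit message length, in bits, that they represent. */
static uint64_t length_in_bits(uint32_t length_high, uint32_t length_low)
{
    /* The high half supplies bits 63..32, the low half bits 31..0. */
    return ((uint64_t)length_high << 32) | length_low;
}
```

For example, Length_High == 1 with Length_Low == 0 stands for a message of 2^32 bits.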

static uint32_t addTemp;
#define SHA224_256AddLength(context, length)               \
  (addTemp = (context)->Length_Low, (context)->Corrupted = \
    (((context)->Length_Low += (length)) < addTemp) &&     \
    (++(context)->Length_High == 0) ? 1 : 0)

is rather tricky code to add the length of a piece of input data (as actually used, always 8 for a full octet or 1-7 for leftover bits) to the 2x32-bit length field. First it saves the incoming Length_Low in addTemp, and then, from the inside out:

(context)->Length_Low += (length) // call this CODE1

adds the value of length to the Length_Low field in the structure; because this uses unsigned arithmetic in C, if the result (mathematically) overflows it is wrapped around (taken modulo 2^32). Thus the sum (the new value in Length_Low) is smaller than the original value in addTemp if and only if overflow/wraparound occurred. This is tested and in that case the Length_High field is incremented:

(CODE1 < addTemp) && (++(context)->Length_High == 0) // call this CODE2

If, after incrementing, the high half is zero, that means it also overflowed/wrapped around, which means the real message length is too big to fit in a 64-bit field, as the spec requires. This is therefore considered an error and stored in the Corrupted field, which will be tested later to report that the hashing operation failed:

(addTemp=..., (context)->Corrupted = CODE2 ? 1 : 0)
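The wraparound test used in CODE1 and CODE2 can be tried in isolation. The following is only a sketch; the function name add_wraps is invented here and is not part of the RFC code.

```c
#include <stdint.h>

/* Hypothetical helper: add n to *low using 32-bit unsigned arithmetic.
 * Returns 1 if the addition wrapped around modulo 2^32, otherwise 0. */
static int add_wraps(uint32_t *low, uint32_t n)
{
    uint32_t before = *low;   /* plays the role of addTemp */
    *low += n;                /* result is reduced modulo 2^32 */
    return *low < before;     /* smaller than before means it wrapped */
}
```

Starting from 0xFFFFFFFC and adding 8 leaves 4 in the low word and reports a wrap, which is exactly the situation in which the macro increments Length_High.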

It should be noted that the ? 1 : 0 is technically unnecessary; the && and || operators in C (and also the comparison/equality operators like < and ==) are already defined to return one for true and zero for false (although tests like if(x) and while(x) accept any nonzero value as true). However, some people feel writing this out is clearer, and that motivation is especially strong in an RFC, which is published to a wide audience including those (like you) with little knowledge of C.
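A tiny sketch shows that the two spellings agree. Both function names are made up for this illustration.

```c
/* Hypothetical pair of helpers: the first relies on && already
 * yielding 0 or 1, the second spells it out as the RFC macro does. */
static int both_plain(int a, int b)
{
    return a && b;
}

static int both_spelled_out(int a, int b)
{
    return (a && b) ? 1 : 0;
}
```

Even with operands such as 5 and -3, both versions yield exactly 1, because && normalizes any nonzero operands to a result of 1.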

In fact, it might have been better to write this as a (very small) function instead of a macro. That would allow the use of more obvious statements instead of complicated nested expressions, and any decent compiler in 2006 (let alone today) would inline and fold it to produce the same code as the macro. But the world isn't perfect.
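As a rough idea of what such a function could look like, here is a sketch. The struct LengthContext is a cut-down stand-in containing only the three fields the macro touches, and both it and add_length are names invented here, not taken from RFC 4634.

```c
#include <stdint.h>

/* Hypothetical, reduced stand-in for the RFC's context structure. */
typedef struct {
    uint32_t Length_Low;
    uint32_t Length_High;
    int Corrupted;
} LengthContext;

/* Sketch of the macro's logic written as ordinary statements. */
static void add_length(LengthContext *ctx, uint32_t length)
{
    uint32_t old_low = ctx->Length_Low;
    int too_long = 0;

    ctx->Length_Low += length;            /* wraps modulo 2^32 */
    if (ctx->Length_Low < old_low) {      /* low half wrapped: carry */
        ctx->Length_High++;
        too_long = (ctx->Length_High == 0);  /* high half wrapped too */
    }
    /* Like the macro, this overwrites Corrupted on every call. */
    ctx->Corrupted = too_long;
}
```

Each step of the nested expression becomes its own statement, which is the readability gain in question.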

Your third chunk, in comparison, is quite simple. It just takes the 2x32-bit length field and stores it as a big-endian sequence of eight 8-bit units in the last 8 elements of the current Message_Block.
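The storing step can be sketched as follows, assuming the 64-octet block size mentioned earlier, so that the last 8 elements are indexes 56 to 63. The function name store_length_be is hypothetical; the RFC does this inline on the context's own fields.

```c
#include <stdint.h>

/* Hypothetical helper: write the 64-bit length, held as two 32-bit
 * halves, into the last 8 octets of a 64-octet block, most
 * significant octet first (big-endian). */
static void store_length_be(uint8_t block[64], uint32_t high, uint32_t low)
{
    block[56] = (uint8_t)(high >> 24);
    block[57] = (uint8_t)(high >> 16);
    block[58] = (uint8_t)(high >> 8);
    block[59] = (uint8_t)(high);
    block[60] = (uint8_t)(low >> 24);
    block[61] = (uint8_t)(low >> 16);
    block[62] = (uint8_t)(low >> 8);
    block[63] = (uint8_t)(low);
}
```

Each shift moves the wanted octet into the lowest 8 bits, and the cast to uint8_t discards everything above it.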
