C++ TweetNaCl: hash a file without reading the whole file into memory

I'm using tweetnacl to generate SHA-512 hashes of strings and files. For strings it works quite well, but I have no idea how to do it with files.

The signature of the function is:

extern "C" int crypto_hash(u8 *out, const u8 *m, u64 n);

where u8 is unsigned char and u64 is unsigned long long. For a string I can use it like this:

string s("Hello");
unsigned char h[64];

crypto_hash(h, (unsigned char *)s.c_str(), s.size());

This works great for strings and small files, but for a big file it is not viable because it uses too much memory. I'm looking for a way to read the file byte by byte and pass it to that function as an unsigned char pointer. Does anyone have an idea how to achieve that?

PS: Sorry for the poor English. PPS: I use tweetnacl because of its small size, and I only need the hashing function.

Probably the easiest way is to use a memory-mapped file. This lets you open a file and map it into virtual memory; you can then treat the file on disk as if it were in memory, and the OS will load pages as required.

So in your case, open your file and use mmap() to map it into memory. Then you can pass the pointer to your crypto_hash() function and let the OS do the work.

Note that there are caveats about how large the file is relative to the available virtual address space: in a 32-bit process, for example, a file of several gigabytes cannot be mapped in one piece.

The mapping API differs by platform: mmap() on POSIX systems, CreateFileMapping() with MapViewOfFile() on Windows.
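A minimal POSIX-only sketch of the memory-mapped approach might look like the following. The helper name hash_mapped_file is hypothetical, and the hash function is passed in as a parameter so that the block compiles without tweetnacl; it is assumed to have the same shape as the crypto_hash() signature quoted in the question.

```cpp
#include <cstddef>
#include <fcntl.h>      // open
#include <sys/mman.h>   // mmap, munmap
#include <sys/stat.h>   // fstat
#include <unistd.h>     // close

// Hypothetical helper (not part of tweetnacl): maps `path` read-only and
// passes the mapped bytes to `hash`, which is assumed to look like
//   int hash(unsigned char *out, const unsigned char *m, unsigned long long n);
// Returns false if the file cannot be opened, inspected, or mapped.
template <typename HashFn>
bool hash_mapped_file(const char *path, unsigned char *out, HashFn hash) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return false;

    struct stat st;
    if (fstat(fd, &st) != 0) { close(fd); return false; }
    const std::size_t len = static_cast<std::size_t>(st.st_size);

    if (len == 0) {
        // mmap() rejects zero-length mappings, so hash an empty message directly.
        close(fd);
        hash(out, reinterpret_cast<const unsigned char *>(""), 0ULL);
        return true;
    }

    void *p = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (p == MAP_FAILED) return false;

    // The OS pages the file in on demand while the hash walks the buffer.
    hash(out, static_cast<const unsigned char *>(p),
         static_cast<unsigned long long>(len));
    munmap(p, len);
    return true;
}
```

With tweetnacl linked in, a call would presumably be `unsigned char h[64]; hash_mapped_file("big.bin", h, crypto_hash);`.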

I'd suggest using a different implementation, one that you can feed incrementally in chunks.

This one, for example. As the licence is BSD and the code is C with no dependencies, you can copy just the three functions you need without bringing a whole library (no matter how small) into your project.

The life cycle goes like this:

  • sha256_init(&ctx)

  • repeatedly read blocks from the file and feed them into sha256_update(&ctx, buff, buffLen)

  • at EOF, get your digest using sha256_final(&ctx, digestHere)
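The init/update/final loop could be sketched roughly as below. To keep the block self-contained, it uses a stand-in context (64-bit FNV-1a, which is not SHA-256 and not cryptographically secure); toy_ctx, toy_init, toy_update, toy_final and hash_file_in_chunks are all hypothetical names. In real code the three toy_* calls would be replaced by sha256_init, sha256_update and sha256_final from whichever implementation you copy.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <vector>

// Stand-in incremental hash (64-bit FNV-1a), NOT SHA-256. It only mirrors the
// init/update/final shape so that the read loop below compiles on its own.
struct toy_ctx { std::uint64_t state; };

void toy_init(toy_ctx *ctx) { ctx->state = 14695981039346656037ULL; }

void toy_update(toy_ctx *ctx, const unsigned char *data, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i) {
        ctx->state ^= data[i];
        ctx->state *= 1099511628211ULL;
    }
}

void toy_final(toy_ctx *ctx, std::uint64_t *digest) { *digest = ctx->state; }

// Hypothetical helper: hashes `path` while holding at most `chunk_size`
// bytes of the file in memory at any time.
bool hash_file_in_chunks(const char *path, std::size_t chunk_size,
                         std::uint64_t *digest) {
    std::ifstream in(path, std::ios::binary);
    if (!in || chunk_size == 0) return false;

    std::vector<unsigned char> buf(chunk_size);
    toy_ctx ctx;
    toy_init(&ctx);

    while (in) {
        in.read(reinterpret_cast<char *>(buf.data()),
                static_cast<std::streamsize>(buf.size()));
        const std::streamsize got = in.gcount();  // may be short on the last block
        if (got > 0) toy_update(&ctx, buf.data(), static_cast<std::size_t>(got));
    }
    if (in.bad()) return false;  // a real I/O error, not just end of file

    toy_final(&ctx, digest);
    return true;
}
```

The chunk size only affects memory use and I/O efficiency; the digest is the same whether the file is fed in 3-byte or 4096-byte blocks.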
