
Calculating the info-hash of a torrent file

I'm using C++ to compute the info-hash of a torrent file, and I can't get a hash value that matches the one reported by this site:

http://i-tools.org/torrent

I have constructed a very simple toy example just to make sure I have the basics right.

I opened a .torrent file in Sublime and stripped off everything except the info dictionary, so I have a file that looks like this:

d6:lengthi729067520e4:name31:ubuntu-12.04.1-desktop-i386.iso12:piece lengthi524288e6:pieces27820:¡´E¶ˆØËš3í   ..............(more unreadable stuff.....)..........

I read this file in and hash it with this code:

#include <string>
#include <sstream>
#include <iomanip>
#include <fstream>
#include <iostream>

#include <openssl/sha.h>


void printHexRep(const unsigned char * test_sha) {

    std::cout << "CALLED HEX REP...PREPPING TO PRINT!\n";
    std::ostringstream os;
    os.fill('0');
    os << std::hex;
    for (const unsigned char * ptr = test_sha; ptr < test_sha + 20; ptr++) {

        os << std::setw(2) << (unsigned int) *ptr;
    }
    std::cout << os.str() << std::endl << std::endl;
}


int main() {

    using namespace std;

    ifstream myFile ("INFO_HASH__ubuntu-12.04.1-desktop-i386.torrent", ifstream::binary);

    //Get file length
    myFile.seekg(0, myFile.end);
    int fileLength = myFile.tellg();
    myFile.seekg(0, myFile.beg);

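    // Use a std::string rather than a variable-length array, which is not standard C++.
    string buffer(fileLength, '\0');

    myFile.read(&buffer[0], fileLength);
    cout << "File length == " << fileLength << endl;
    // The data is not null-terminated, so write exactly fileLength bytes.
    cout.write(buffer.data(), fileLength) << endl << endl;

    unsigned char datSha[20];
    SHA1(reinterpret_cast<const unsigned char *>(buffer.data()), fileLength, datSha);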
    printHexRep(datSha);

    myFile.close();

    return 0;
}

I compile it like so:

g++ -o hashes info_hasher.cpp -lssl -lcrypto

And I get this output:

4d0ca7e1599fbb658d886bddf3436e6543f58a8b

When I am expecting this output:

14FFE5DD23188FD5CB53A1D47F1289DB70ABF31E

Does anybody know what I might be doing wrong here? Could the problem lie with the unreadable bytes at the end of the file? Do I need to parse this as hex first or something?

Make sure you don't have a newline at the end of the file. You may also want to make sure it ends with an 'e'.
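A quick way to test for that is sketched below. The helper names `stripEditorNewline` and `looksLikeBencodedDict` are made up for illustration. The sketch relies on the fact that a bencoded dictionary always ends with 'e', so any '\n' or '\r' after it must have been appended by the editor.

```cpp
#include <string>

// Hypothetical helper: drop any '\n' or '\r' bytes that a text editor
// appended after the final 'e' of a bencoded dictionary.
std::string stripEditorNewline(std::string data) {
    while (!data.empty() && (data.back() == '\n' || data.back() == '\r')) {
        data.pop_back();
    }
    return data;
}

// Hypothetical sanity check: a bencoded dictionary starts with 'd' and ends with 'e'.
bool looksLikeBencodedDict(const std::string& data) {
    return data.size() >= 2 && data.front() == 'd' && data.back() == 'e';
}
```

If `looksLikeBencodedDict` fails on the raw file contents but passes after `stripEditorNewline`, the editor most likely added a trailing newline, and that extra byte alone is enough to change the SHA-1.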

The info-hash of a torrent file is the SHA-1 hash of the info section (in bencoded form) from the .torrent file. Essentially, you need to decode the file (it's bencoded) and remember the byte offsets where the value associated with the "info" key begins and ends. That's the range of bytes you need to hash.

For example, if this is the torrent file:

d4:infod6:pieces20:....................4:name4:test12:piece lengthi1024ee8:announce27:http://tracker.com/announcee

You want to hash just this section:

d6:pieces20:....................4:name4:test12:piece lengthi1024ee

For more information on bencoding, see BEP3.
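As a rough sketch of that approach, the code below walks the top-level dictionary and returns the exact bytes of the value stored under "info". The function names (`readString`, `skipValue`, `extractInfoBytes`) are invented for this example. It assumes well-formed input held entirely in memory and does only minimal validation.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Parses a byte string "<len>:<bytes>" at `pos`, stores the payload in `out`,
// and returns the index one past it.
std::size_t readString(const std::string& s, std::size_t pos, std::string& out) {
    std::size_t colon = s.find(':', pos);
    if (colon == std::string::npos) throw std::runtime_error("bad string");
    std::size_t len = std::stoul(s.substr(pos, colon - pos));
    if (len > s.size() - colon - 1) throw std::runtime_error("truncated string");
    out = s.substr(colon + 1, len);
    return colon + 1 + len;
}

// Returns the index one past the bencoded value that starts at `pos`.
std::size_t skipValue(const std::string& s, std::size_t pos) {
    if (pos >= s.size()) throw std::runtime_error("unexpected end");
    char c = s[pos];
    if (c == 'i') {                        // integer: i<digits>e
        std::size_t end = s.find('e', pos);
        if (end == std::string::npos) throw std::runtime_error("bad integer");
        return end + 1;
    }
    if (c == 'l' || c == 'd') {            // list or dict: skip items until 'e'
        ++pos;
        while (pos < s.size() && s[pos] != 'e') pos = skipValue(s, pos);
        if (pos >= s.size()) throw std::runtime_error("unterminated container");
        return pos + 1;
    }
    std::string ignored;                   // anything else must be a byte string
    return readString(s, pos, ignored);
}

// Returns the exact bytes of the value stored under the top-level "info" key.
std::string extractInfoBytes(const std::string& torrent) {
    if (torrent.empty() || torrent[0] != 'd') throw std::runtime_error("not a dictionary");
    std::size_t pos = 1;
    while (pos < torrent.size() && torrent[pos] != 'e') {
        std::string key;
        pos = readString(torrent, pos, key);
        std::size_t end = skipValue(torrent, pos);
        if (key == "info") return torrent.substr(pos, end - pos);
        pos = end;
    }
    throw std::runtime_error("no info key");
}
```

The returned string can then be passed to `SHA1` as in the question, using its `data()` pointer and `size()`. Slicing the original bytes, rather than decoding and re-encoding, keeps the hashed range byte-for-byte identical to what is in the file.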

The SHA1 calculation is just as simple as what you've written, more or less. If you get the wrong answer from the library function, the error is probably in the data you're feeding it.

I can't speak to the torrent file prep work you've done, but I do see a few problems. If you revisit the SHA1 docs, notice that the SHA1 function never requires its own digest length as a parameter. Next, you'll want to be quite certain that the technique you're using to read the file's contents faithfully reads the exact bytes, with no translation.
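One way to read a file without any translation is sketched here. The name `readFileBytes` is made up for this example. Opening the stream with `std::ios::binary` turns off newline translation, and `std::istreambuf_iterator` copies the raw bytes, embedded zero bytes included.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads the whole file as raw bytes.
// std::ios::binary disables newline translation on platforms that perform it.
std::string readFileBytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Comparing the returned string's `size()` against the file size reported by the operating system is a cheap way to confirm that nothing was dropped or added.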

A less critical style suggestion: make use of the third parameter to SHA1. As a general rule, static storage in the library is best avoided; always prefer to supply your own buffer. Also, where you have a hard-coded 20 in your print function, that's a marvelous place for that digest length constant you've been flirting with.
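To illustrate the last point, here is a sketch of a hex formatter that takes the length as a parameter instead of hard-coding 20. The name `toHex` is made up for this example. With OpenSSL, the caller would pass `SHA_DIGEST_LENGTH` from `<openssl/sha.h>`; the sketch itself does not depend on OpenSSL.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Formats `len` bytes as lowercase hex, two digits per byte.
// With OpenSSL, `len` would be SHA_DIGEST_LENGTH rather than a literal 20.
std::string toHex(const unsigned char* bytes, std::size_t len) {
    std::ostringstream os;
    os << std::hex << std::setfill('0');
    for (std::size_t i = 0; i < len; ++i) {
        // Cast to unsigned int so the byte prints as a number, not a character.
        os << std::setw(2) << static_cast<unsigned int>(bytes[i]);
    }
    return os.str();
}
```

Returning a string rather than printing directly also makes it easy to compare the result against an expected hash in a test.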
