
Create torrent hash info

How do I generate the info hash for a torrent file?

I have been looking at this example, "How to calculate the hash value of a torrent using Java", and am trying to convert it to C++. This is the code I have so far:

void At::ReadTorrent::TorrentParser::create_hash(std::string torrentstub)
{
    std::string info;
    int counter = 0;

    while(info.find("4:info") == -1)
    {
        info.push_back(torrentstub[counter]);
        counter++;
    }

    unsigned char array[torrentstub.size()];
    int test = 0;

    for(int data; (data = torrentstub[counter]) > -1;)
    {
         array[test++] = data;
         counter++;
    }
    std::cout << array << std::endl;

    //SHA-1 some value here to generate the hash.
}

The torrentstub parameter is the torrent file represented as a string. As far as I understand, I have to get the information that comes after 4:info. I think this works; for example:

d6:lengthi2847431620e4:name8:filename12:piece lengthi1143252e6:pieces50264

After this there is only data that I can't read. I guess this is binary data?

So my question boils down to this: is the data that should be hashed everything that comes after 4:info, and where should I stop collecting data for the hash?

The sample code you based this on seems to assume the info key is the last thing in the torrent file (it may not be, so read the entire answer to get the whole story). As such, it covers the remainder of the file (minus 1 byte) starting at the byte following ":info". You would see something like "...:infod6:length...". The SHA1 starts with "d6:length..." and goes to the end of the file minus 1 byte (the last byte, usually 'e', is not included).

For example, if the torrent file is 43125 bytes and ":info" starts at offset 362, then the SHA data starts at offset 367 and continues through offset 43123 (that is, it's 42757 bytes).
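The offset arithmetic for this simple case can be sketched as a small helper. This is only a sketch under the assumption that the info key really is the last entry in the file; the names ByteRange and trailing_info_range are hypothetical and do not come from the code in the question.

```cpp
#include <cstddef>
#include <string>

// Half-open byte range [begin, end) of the data to feed to SHA-1.
struct ByteRange {
    std::size_t begin;
    std::size_t end;
};

// ASSUMPTION: the info dictionary is the last entry of the torrent, so the
// data to hash runs from just after "4:info" up to, but not including, the
// final 'e' that closes the whole file.
// Returns false if the key is missing or nothing follows it.
// Note: a plain substring search can in principle match "4:info" inside an
// earlier string value; only a real bencode parser rules that out.
bool trailing_info_range(const std::string& torrent, ByteRange& out)
{
    static const std::string key = "4:info";
    const std::size_t pos = torrent.find(key);
    if (pos == std::string::npos) return false;

    const std::size_t begin = pos + key.size();   // first byte after the key
    if (begin >= torrent.size()) return false;    // no value after the key

    out.begin = begin;
    out.end = torrent.size() - 1;                 // drop the file's last byte
    return true;
}
```

With a 43125-byte file whose ":info" starts at offset 362 (so "4:info" starts at 361), this yields begin 367 and end 43124, which is 42757 bytes.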

You may know that your torrent files do end with the info key. If you don't know, then your algorithm must be a little more sophisticated. A torrent file is bencoded, and the value of the info key is a bencode "dictionary" (search for bencode on Wikipedia and read the article; it's pretty simple to understand). The "d" following ":info" starts the dictionary, which ends with an "e". The length of the dictionary is not encoded, so the only way to know where it ends is to parse the contents until you find the "e" that ends it. If the file is correctly formatted, the contents of the dictionary consist of a series of well-formed bencoded elements (and further-nested elements). Eventually you will find an "e" following the end of an element (instead of another element). This "e" ends the dictionary.

The SHA1 is over the entire contents of this dictionary, including the opening "d" and the closing "e". Other bencoded elements may follow the dictionary. These are NOT included in the SHA1 calculation.
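The parsing described above can be sketched as a recursive "skip one element" routine. This is a minimal sketch rather than a complete bencode parser: it only measures elements, assumes the whole file is already in memory, and the function names skip_element and extract_info_dict are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns the offset one past the bencoded element that starts at `pos`.
// Throws std::runtime_error on malformed or truncated input.
std::size_t skip_element(const std::string& data, std::size_t pos)
{
    if (pos >= data.size()) throw std::runtime_error("unexpected end of data");
    const char c = data[pos];
    if (c == 'i') {                          // integer: i<digits>e
        const std::size_t end = data.find('e', pos + 1);
        if (end == std::string::npos) throw std::runtime_error("unterminated integer");
        return end + 1;
    }
    if (c == 'l' || c == 'd') {              // list or dictionary: elements until 'e'
        ++pos;
        while (pos < data.size() && data[pos] != 'e')
            pos = skip_element(data, pos);
        if (pos >= data.size()) throw std::runtime_error("unterminated container");
        return pos + 1;                      // step over the closing 'e'
    }
    if (std::isdigit(static_cast<unsigned char>(c))) {   // string: <length>:<bytes>
        std::size_t len = 0;
        while (pos < data.size() && std::isdigit(static_cast<unsigned char>(data[pos]))) {
            len = len * 10 + static_cast<std::size_t>(data[pos++] - '0');
            if (len > data.size()) throw std::runtime_error("string length too large");
        }
        if (pos >= data.size() || data[pos] != ':') throw std::runtime_error("bad string length");
        ++pos;                               // skip ':'
        if (len > data.size() - pos) throw std::runtime_error("string runs past end of data");
        return pos + len;                    // raw bytes are skipped, never scanned
    }
    throw std::runtime_error("unknown element type");
}

// Walks the top-level dictionary key by key and returns the raw bytes of the
// value stored under "info", from its opening 'd' through its closing 'e'.
std::string extract_info_dict(const std::string& torrent)
{
    if (torrent.empty() || torrent[0] != 'd') throw std::runtime_error("not a bencoded dictionary");
    std::size_t pos = 1;
    while (pos < torrent.size() && torrent[pos] != 'e') {
        const std::size_t key_end = skip_element(torrent, pos);        // key
        const std::size_t value_end = skip_element(torrent, key_end);  // value
        if (torrent.compare(pos, key_end - pos, "4:info") == 0)
            return torrent.substr(key_end, value_end - key_end);
        pos = value_end;
    }
    throw std::runtime_error("no info key found");
}
```

Because strings are skipped by their declared length, an 'e' byte inside the binary "pieces" data is never mistaken for the end of the dictionary, and any keys that follow info are left out of the returned bytes.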

Misc. notes:

Assuming the info key is the last thing in the file (again, it may not be), the single byte that is "left out" of the SHA1 in your algorithm is the final "e" for the entire torrent (which is just a single bencode dictionary; all torrent files begin with "d" and end with "e").

This is binary data, so you must read it as such when filling torrentstub[].
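One way to read the file as binary data is sketched below. It assumes the whole torrent fits comfortably in memory, and the function name read_file_binary is hypothetical.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Reads the whole file into a std::string without any text-mode translation.
// std::ios::binary matters on platforms that would otherwise rewrite line
// endings or stop at a Ctrl-Z byte. std::string stores embedded '\0' bytes
// without trouble, so it can hold the raw torrent.
std::string read_file_binary(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

The returned string's size() is then the file length, which is the value to use for loop bounds rather than any sentinel in the data.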

You cannot test for -1 to determine when to end, as you do in your example. The code it is based on checks the result of the read operation for -1 (EOF), not the data itself. You must use the length of the torrent file, minus the start of the data (after ":info"), minus 1 to get the right length.

The sample code you reference actually does read the last byte but excludes it when generating the SHA1. 您引用的示例代码实际上确实读取了最后一个字节,但在生成SHA1时将其排除在外。

Reading one byte, copying it to the string, and then re-scanning the string repeatedly is very inefficient. You already have the data in an array, so just use strstr (since the beginning is ASCII data) or scan it yourself (not hard to code, since it's a very short, fixed-length string).
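A single-pass search over the array might look like the sketch below. It swaps in std::search for strstr, because std::search takes an explicit length and therefore does not depend on a terminating NUL or stop early at a zero byte; the function name find_info_value is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Scans the buffer once for "4:info" and returns the offset of the first
// byte after it, or `size` if the key does not occur.
std::size_t find_info_value(const unsigned char* data, std::size_t size)
{
    static const char key[] = "4:info";
    const std::size_t key_len = std::strlen(key);

    const unsigned char* end = data + size;
    const unsigned char* hit = std::search(data, end, key, key + key_len);
    if (hit == end) return size;   // not found

    return static_cast<std::size_t>(hit - data) + key_len;
}
```

The returned offset is where the info dictionary's opening "d" should be; the bytes from there on can be hashed by length, with no sentinel test on the data.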

I assume you have code to do the actual SHA1. 我假设您有执行实际SHA1的代码。 What platform are you working on? 您在哪个平台上工作?

The .torrent spec is freely available and should help you understand the file format quite easily. All you need to do is SHA1 the contents of the info key to get the info hash.
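If no SHA1 routine is at hand, the standard algorithm is short enough to sketch directly. The version below is an illustrative, unoptimized implementation with a hypothetical name, sha1_hex; in production code a vetted crypto library is usually the better choice.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

namespace {
inline std::uint32_t rotl(std::uint32_t v, int n) { return (v << n) | (v >> (32 - n)); }
}

// Returns the SHA-1 digest of `data` as 40 lowercase hex characters.
std::string sha1_hex(const std::string& data)
{
    std::uint32_t h[5] = {0x67452301u, 0xEFCDAB89u, 0x98BADCFEu, 0x10325476u, 0xC3D2E1F0u};

    // Padding: a 0x80 byte, zeros until the length is 56 mod 64, then the
    // original length in bits as a 64-bit big-endian integer.
    std::string msg = data;
    const std::uint64_t bit_len = static_cast<std::uint64_t>(data.size()) * 8;
    msg.push_back(static_cast<char>(0x80));
    while (msg.size() % 64 != 56) msg.push_back('\0');
    for (int shift = 56; shift >= 0; shift -= 8)
        msg.push_back(static_cast<char>((bit_len >> shift) & 0xFF));

    for (std::size_t off = 0; off < msg.size(); off += 64) {
        // Expand the 64-byte block into eighty 32-bit words.
        std::uint32_t w[80];
        const unsigned char* p = reinterpret_cast<const unsigned char*>(msg.data()) + off;
        for (int i = 0; i < 16; ++i)
            w[i] = (std::uint32_t(p[4 * i]) << 24) | (std::uint32_t(p[4 * i + 1]) << 16) |
                   (std::uint32_t(p[4 * i + 2]) << 8) | std::uint32_t(p[4 * i + 3]);
        for (int i = 16; i < 80; ++i)
            w[i] = rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);

        std::uint32_t a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];
        for (int i = 0; i < 80; ++i) {
            std::uint32_t f, k;
            if (i < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999u; }
            else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1u; }
            else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDCu; }
            else             { f = b ^ c ^ d;                   k = 0xCA62C1D6u; }
            const std::uint32_t t = rotl(a, 5) + f + e + k + w[i];
            e = d; d = c; c = rotl(b, 30); b = a; a = t;
        }
        h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
    }

    // Format the five state words as big-endian hex.
    static const char digits[] = "0123456789abcdef";
    std::string hex;
    for (int i = 0; i < 5; ++i)
        for (int shift = 28; shift >= 0; shift -= 4)
            hex.push_back(digits[(h[i] >> shift) & 0xF]);
    return hex;
}
```

Passing the raw bytes of the info dictionary (opening "d" through closing "e") to this function gives the info hash in the hex form that torrent tools usually display.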
