How do I get the correct hash value when reading a .txt wordlist line by line?

Question

I'm trying to build a Python 3.x script that reads a .txt wordlist and converts the word on each line to its hashed equivalent. However, when I run the script, it produces the wrong hash.

I hope you can help me figure out what I'm doing wrong here.

Output

Arguments passed to the program:
Namespace(inputHashType=['md5'], verbose=True, 
    wordlist=<_io.TextIOWrapper name='C:\\Users\\Mikael\\Desktop\\wordlist.txt' mode='rt' encoding='utf-8'>)
Verbose is set to: True

correct hash:  b61a6d542f9036550ba9c401c80f00ef
Line 1:  PT: tests      As hash: a58c6e40436bbb090294218b7d758a15

Example of input file:

tests
tests1
tests2

Source Code

import argparse
import sys
from Crypto.Hash import MD5, SHA1, SHA224, SHA256, SHA384, SHA512


parser = argparse.ArgumentParser(description='Hash production')
parser.add_argument('-v', action='store_true', dest='verbose', default=False, help='Print attempts')
parser.add_argument('-t', nargs=1, dest='inputHashType', help='Hash type')
parser.add_argument('-d', nargs='?', dest='wordlist', type=argparse.FileType('rt', encoding='utf-8'), default=sys.stdin, help='Dictionary (as file)')
args =  parser.parse_args()

inputHashType = ''.join(map(str, args.inputHashType)) # Formats args list as string
inputHashType.lower()

if inputHashType == 'md5':
    htype = MD5.new()

try:
    if args.verbose:
        with args.wordlist as file:
            line = file.readline()
            cnt = 1
            while line:
                word = line.encode('utf-8').rstrip()
                hashed = htype.update(word)
                hashed = htype.hexdigest()
                print("Line {}:  PT: {}      As hash: {}".format(cnt, line.strip(), hashed))
                line = file.readline()
                cnt += 1
    else:
        break
except:
    print('Error')

Answer 1

The problem is that in the try block of your code, you reuse the same MD5 hash object for every line by calling its update() method. That call does not compute the hash of the current input string alone: it appends the input to everything already fed in, so hexdigest() returns the hash of all the data accumulated up to that point.

You can confirm that this is what is happening with md5sum:

$ echo -n 'tests' | md5sum
b61a6d542f9036550ba9c401c80f00ef  -    # Identical to your 1st output line
$ echo -n 'teststests' | md5sum         # This is what you're calculating
a58c6e40436bbb090294218b7d758a15  -    # Identical to your 2nd output line.
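
The same accumulation can be reproduced in Python. This is a minimal sketch that assumes the standard-library hashlib module in place of Crypto.Hash, since its hash objects follow the same update()/hexdigest() pattern; the variable names are illustrative only.

```python
import hashlib

# One hash object reused for two inputs: update() appends to its internal state.
reused = hashlib.md5()
reused.update(b"tests")
first = reused.hexdigest()    # hash of b"tests"
reused.update(b"tests")
second = reused.hexdigest()   # hash of b"teststests", not of b"tests"

print(first)
print(second)
# The second digest matches a one-shot hash of the concatenated input.
print(second == hashlib.md5(b"teststests").hexdigest())
```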

To compute the hash of each new input independently, create a fresh MD5 instance for every line by calling the new() method.
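
Applied to the loop from the question, that advice could look like the sketch below. It is not the original script: it assumes hashlib from the standard library rather than Crypto.Hash, the helper name hash_lines is hypothetical, and an in-memory io.StringIO stands in for the wordlist file.

```python
import hashlib
import io


def hash_lines(lines, algorithm="md5"):
    """Yield (line_number, word, hex_digest) for each non-empty line."""
    for number, line in enumerate(lines, start=1):
        word = line.strip()
        if not word:
            continue  # skip blank lines
        # A fresh hash object per word, so nothing carries over between lines.
        digest = hashlib.new(algorithm, word.encode("utf-8")).hexdigest()
        yield number, word, digest


# Stand-in for the wordlist file; any iterable of text lines works.
wordlist = io.StringIO("tests\ntests1\ntests2\n")
results = list(hash_lines(wordlist))
for number, word, digest in results:
    print("Line {}:  PT: {}      As hash: {}".format(number, word, digest))
```

With the Crypto.Hash module used in the question, the equivalent change is to call MD5.new() inside the loop, before update(), instead of once before the loop.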

Answer 1: score 4, accepted, 2018-02-16 10:28:37
