Python MD5: hashing the same content returns different hashes

Question

I am writing a Python program (because I am lazy) that checks a website for a job opening I have been told about and returns all the jobs on the company's web page.

Here is my code so far (yes, I know the code is janky, but I am just trying to get it working):

import requests
from bs4 import BeautifulSoup
import sys
import os
import hashlib

reload(sys)
sys.setdefaultencoding('utf8')

res = requests.get('WEBSITE URL', verify=False)
res.raise_for_status()

filename = "JobWebsite.txt"

def StartUp():
    if not os.path.isfile(filename):
        try:
            jobfile = open(filename, 'a')
            jobfile = open(filename, 'r+')
            print("[*] Succesfully Created output file")
            return jobfile
        except:
            print("[*] Error creating output file!")
            sys.exit(0)
    else:
        try:
            jobfile = open(filename, 'r+')
            print("[*] Succesfully Opened output file")
            return jobfile
        except:
            print("[*] Error opening output file!")
            sys.exit(0)

def AnyChange(htmlFile):
    fileCont = htmlFile.read()
    FileHash = hasher(fileCont, "File Code Hashed")
    WebHash = hasher(res.text, "Webpage Code Hashed")
    # !!!!! Here is the problem
    print ("[*] File hash is " + str(FileHash))
    print ("[*] Website hash is " + str(WebHash))
    if FileHash == WebHash:
        print ("[*] Jobs being read from file!")
        num_of_jobs(fileCont)
    else:
        print("[*] Jobs being read from website!")
        num_of_jobs(res.text)
        deleteContent(htmlFile)
        writeWebContent(htmlFile, res.text)

def hasher(content, message):
    content = hashlib.md5(content.encode('utf-8'))
    return content

def num_of_jobs(htmlFile):
    content = BeautifulSoup(htmlFile, "html.parser")
    elems = content.select('.search-result-inner')
    print("[*] There are " + str(len(elems)) + " jobs available!")

def deleteContent(htmlFile):
    print("[*] Deleting Contents of local file! ")
    htmlFile.seek(0)
    htmlFile.truncate()

def writeWebContent(htmlFile, content):
    htmlFile = open(filename, 'r+')
    print("[*] Writing Contents of website to file! ")
    htmlFile.write(content.encode('utf-8'))

jobfile = StartUp()
AnyChange(jobfile)

The problem I currently have is that I hash both the website's HTML and the file's HTML, but the two hashes never match. I am not sure why and can only guess that it has something to do with the contents being saved to a file. The hashes aren't too far apart, but the mismatch still causes the if statement to fail every time.

[Screenshot: breakpoint in the program showing the two hashes]

Answer 1

The screenshot you attached shows the memory locations of the two hash objects, FileHash and WebHash. They are separate objects, so they should be at different locations.
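To illustrate why the printed values differ, here is a minimal sketch (the sample bytes are made up for illustration). Two MD5 objects built from identical input are still distinct Python objects, and hash objects do not implement value equality, so `==` on them falls back to comparing identity:

```python
import hashlib

# Sample payload, made up for illustration.
data = b"<html>same content</html>"

first = hashlib.md5(data)
second = hashlib.md5(data)

# Comparing the hash objects themselves compares identity, not digest.
objects_equal = first == second

# Comparing the hex digests compares the actual MD5 values.
digests_equal = first.hexdigest() == second.hexdigest()

print(str(first))   # shows the object type and its memory address
print(str(second))  # a second object at a different address
print(objects_equal, digests_equal)
```

This is the same situation as in the question: printing `str(FileHash)` and `str(WebHash)` shows where the objects live, not what they hashed.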

What you really want to compare is the hexdigest() of the two hash objects. Change your if statement to:

if FileHash.hexdigest() == WebHash.hexdigest():
    print("[*] Jobs being read from file!")
    num_of_jobs(fileCont)
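An alternative that makes the comparison harder to get wrong is to have the helper return the hex digest string rather than the hash object. The following is only a sketch, assuming Python 3; `content_digest` is a hypothetical name that is not part of the original code, and the HTML strings are placeholders:

```python
import hashlib


def content_digest(content):
    """Return the MD5 hex digest of text or bytes (hypothetical helper)."""
    # Text must be encoded before hashing; bytes are hashed as they are.
    if isinstance(content, str):
        content = content.encode("utf-8")
    return hashlib.md5(content).hexdigest()


# Placeholder content standing in for the saved file and the web page.
file_digest = content_digest("<html>jobs</html>")
web_digest = content_digest("<html>jobs</html>")
changed_digest = content_digest("<html>jobs updated</html>")

print(file_digest == web_digest)      # identical content, digests match
print(file_digest == changed_digest)  # different content, digests differ
```

Because the helper returns plain strings, an ordinary `==` in the if statement compares the digests directly.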

Take a look at this other Stack Overflow answer for more how-to.

Solution 1
1 2017-10-11 22:46:14
