
How can I grab all files in a folder and get their MD5 hashes in Python?

I'm trying to write some code to get the MD5 hash of every .exe file in a folder.

My problem is that my code works only if the folder contains exactly one file, and I don't understand how to handle more than one. This is my code:

import glob
import hashlib
file = glob.glob("/root/PycharmProjects/untitled1/*.exe")

newf = str (file)
newf2 =  newf.strip( '[]' )
newf3 = newf2.strip("''")

with open(newf3,'rb') as getmd5:
    data = getmd5.read()
    gethash= hashlib.md5(data).hexdigest()
    print gethash

And I get the result:

a7f4518aae539254061e45424981e97c

I want to know how I can do the same for more than one file in the folder.

glob.glob returns a list of file paths. Just iterate over the list with a for loop:

import glob
import hashlib

filenames = glob.glob("/root/PycharmProjects/untitled1/*.exe")

for filename in filenames:
    with open(filename, 'rb') as inputfile:
        data = inputfile.read()
        print(filename, hashlib.md5(data).hexdigest())

Note that this can exhaust your memory if there is a large file in that directory, so it is better to read each file in smaller chunks (adapted here to use 1 MiB blocks):

def md5(fname):
    hash_md5 = hashlib.md5()
    with open(fname, "rb") as f:
        for chunk in iter(lambda: f.read(2 ** 20), b""):
            hash_md5.update(chunk)
    return hash_md5.hexdigest()

for filename in filenames:
    print(filename, md5(filename))
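As a quick sanity check, chunked hashing should give the same digest as hashing the whole content in one call. The sketch below assumes a throwaway temporary file; the names md5_chunked, payload, and tmp_path are made up for illustration and are not part of the answer above.

```python
import hashlib
import os
import tempfile


def md5_chunked(path, chunk_size=2 ** 20):
    """Hash a file in chunk_size-byte blocks (hypothetical helper name)."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# A payload slightly larger than 3 MiB, so the last chunk is a partial one.
payload = b"x" * (3 * 2 ** 20 + 123)

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    tmp_path = tmp.name

try:
    chunked = md5_chunked(tmp_path)
    whole = hashlib.md5(payload).hexdigest()
    print(chunked == whole)
finally:
    os.remove(tmp_path)
```

Feeding the data to update() piece by piece is equivalent to hashing the concatenation, so only one chunk needs to be held in memory at a time.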

I think that, in the end, you're trying to open a single file whose name does not exist. The reason is that you take the list returned by glob, convert it to its string representation, and remove the list markers (and only at both ends of the string, since you use strip). This gives you something like:

file1.exe', 'file2.exe', 'file3.exe

You then pass this string to open, which tries to open a file with exactly that name. In fact, I'm surprised it works at all (unless you have only one file)! You should get a FileNotFoundError (an IOError on Python 2).
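The effect of the str and strip chain can be checked directly in an interpreter. This is a small sketch with hypothetical file names standing in for whatever glob.glob would return on your machine:

```python
# Hypothetical file names; glob.glob() would return full paths instead.
files = ["file1.exe", "file2.exe", "file3.exe"]

as_text = str(files)               # "['file1.exe', 'file2.exe', 'file3.exe']"
no_brackets = as_text.strip("[]")  # strip() only trims characters at the two ends
mangled = no_brackets.strip("''")  # removes the outermost quotes, nothing in between

print(mangled)  # file1.exe', 'file2.exe', 'file3.exe

# With exactly one entry the result happens to be a valid file name,
# which is why the original code worked in that case.
single = str(["only.exe"]).strip("[]").strip("''")
print(single)  # only.exe
```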

What you want to do is iterate over all the files returned by glob.glob:

import glob
import hashlib
file = glob.glob("/root/PycharmProjects/untitled1/*.exe")

for f in file:
    with open(f, 'rb') as getmd5:
        data = getmd5.read()
        gethash = hashlib.md5(data).hexdigest()
        print(f + ": " + gethash)
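If you need the hashes afterwards rather than just printed, one option is to collect them in a dictionary. This is only a sketch: the function name md5_by_file and its pattern parameter are made up for illustration, and it reads each file whole, so it assumes the files are small enough to fit in memory.

```python
import glob
import hashlib
import os


def md5_by_file(folder, pattern="*.exe"):
    """Return {path: MD5 hex digest} for files in folder matching pattern."""
    results = {}
    for path in sorted(glob.glob(os.path.join(folder, pattern))):
        if not os.path.isfile(path):  # skip directories that match the pattern
            continue
        with open(path, "rb") as handle:
            results[path] = hashlib.md5(handle.read()).hexdigest()
    return results


if __name__ == "__main__":
    # Folder taken from the question; glob simply returns nothing if it is missing.
    for path, digest in md5_by_file("/root/PycharmProjects/untitled1").items():
        print(path + ": " + digest)
```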
