
python saving output from a for iteration and subprocess for checksum

The purpose of this script is to pull the md5 checksum of each file in a source directory, and then (I'm working on that too) run the script at the destination to verify everything copied correctly.

#!/usr/bin/env python

import os
from sys import *
import subprocess


script, path = argv

destination = "./new_directorio/"
archivo = "cksum.txt"


def checa_sum(x):
        ck = "md5 %s" % x
        p = subprocess.Popen(ck, stdout=subprocess.PIPE, shell=True)
        (output, err) = p.communicate()

        out = open(archivo,'w')
        out.write("%s" % (output))
        out.close()

files = [f for f in os.listdir(path) if os.path.isfile(f)]
for i in files:
        if not "~" in i:
                checa_sum(i)

What it gives me is a file called "cksum.txt", but with only one result inside:

bash-3.2$ more cksum.txt
MD5 (victor) = 4703ee63236a6975abab75664759dc29
bash-3.2$ 

Another attempt, instead of the "open", "write", "close" structure, uses the following:

def checa_sum(x):
    ck = "md5 %s" % x
    p = subprocess.Popen(ck, stdout=subprocess.PIPE, shell=True)
    (output, err) = p.communicate()

    with open(archivo,'w') as outfile:
        outfile.write(output)

Why does it only give me one result, when I expect the file to contain the following?:

MD5 (pysysinfo.py) = 61a532c898e6f461ef029cee9d1b63dd
MD5 (pysysinfo_func.py) = ac7a1c1c43b2c5e20ceced5ffdecee86
MD5 (pysysinfo_new.py) = 38b06bac21af3d08662d00fd30f6c329
MD5 (test) = b2b0c958ece30c119bd99837720ffde1
MD5 (test_2.py) = 694fb14d86c573fabda678b9d770e51a
MD5 (uno.txt) = 466c9f9d6a879873688b000f7cbe758d
MD5 (victor) = 4703ee63236a6975abab75664759dc29

Moreover, I don't know how to handle the blank line between each entry. I'm looking into that too.

After having this, I'm going to compare each item to verify the integrity once it is copied to the destination.

You keep opening with w, which overwrites the file; open with a to append.

The best way is to simply redirect stdout to a file object, something like:

from subprocess import check_call

def checa_sum(x):
    with open(archivo, 'a') as outfile:
        check_call(["md5", x], stdout=outfile)

Using check_call will raise a CalledProcessError for a non-zero exit status, which you should handle accordingly.

To catch the exception:

try:
    check_call(["md5sum", x], stdout=outfile)
except CalledProcessError as e:
    print("Exception for {}".format(e.cmd))

Use a generator expression to get the files, and if you want to ignore backup copies use not f.endswith("~"):

files = (f for f in os.listdir("/home/padraic") if os.path.isfile(f) and not f.endswith("~"))
for i in files:
    checa_sum(i)
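The question also mentions comparing source and destination afterwards. As a sketch of that next step (using hashlib.md5 instead of shelling out to the md5 command, so it is portable; the function and directory names here are hypothetical, not from the original post):

```python
import hashlib
import os


def md5_of(path):
    """Return the hex md5 digest of a file, read in binary mode."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


def dir_checksums(directory):
    """Map filename -> md5 for every regular file in a directory."""
    return {
        name: md5_of(os.path.join(directory, name))
        for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name)) and not name.endswith("~")
    }


def verify_copy(src, dst):
    """Return the source filenames whose checksum is missing or different at dst."""
    src_sums, dst_sums = dir_checksums(src), dir_checksums(dst)
    return [name for name in src_sums if dst_sums.get(name) != src_sums[name]]
```

An empty list from verify_copy means every file in the source has an identical copy at the destination.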

Ah, someone asked for alternatives; of course there are :)

import logging
import hashlib
import os

outfile = "hash.log"
indir = "/Users/daniel/Sites/work"
logging.basicConfig(filename=outfile, filemode="w", format='%(message)s', level=logging.DEBUG)
for filename in (f for f in os.listdir(indir) if os.path.isfile(os.path.join(indir, f)) and not f.endswith("~")):
    # open in binary mode: hashlib works on bytes
    with open(os.path.join(indir, filename), "rb") as checkfile:
        logging.info(hashlib.md5(checkfile.read()).hexdigest())

I've been using something like this before.

What I like is using the logging module, because it makes things scalable: I don't have to keep a file open, or keep reopening it. The logger is highly configurable, but for just generating something like what's needed here, the simple setup is a one-liner.

Here I'm not doing any console parsing, because I'm using Python's hashlib to generate the file md5. One could argue that doing this might slow things down, but at least for the file sizes I usually encounter, I've had no problems so far.

It would be interesting to test on larger files; otherwise the logging mechanism could also be used in your case. I only preferred hashlib back then because I didn't fancy parsing console output.
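On the large-file point: reading the whole file into memory with checkfile.read() can be avoided by feeding hashlib in fixed-size chunks. A minimal sketch (the function name and chunk size are arbitrary choices, not from the answer above):

```python
import hashlib


def md5_chunked(path, chunk_size=65536):
    """Compute the md5 of a file without loading it fully into memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read(chunk_size)
        # until it returns b"" at end of file
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

This gives the same digest as hashing the whole file at once, with memory use bounded by chunk_size.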
