Computing a file content hash with Scala

Question

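In our application we need to compute a hash of a file so that we can later check whether the file has been updated. At the moment I do it with this small method: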

import java.io.File
import com.google.common.hash.Hashing
import com.google.common.io.Files

protected[services] def computeMigrationHash(toVersion: Int): String = {
  // Locate the compiled .class file of the migration on the classpath
  val migrationClassName = MigrationClassNameFormat.format(toVersion, toVersion)
  val migrationClass = Class.forName(migrationClassName)
  val fileName = migrationClass.getName.replace('.', '/') + ".class"
  val resource = getClass.getClassLoader.getResource(fileName)

  logger.debug("Migration file - " + resource.getFile)

  // Hash the file with Guava (MD5)
  val file = new File(resource.getFile)
  val hc = Files.hash(file, Hashing.md5())

  logger.debug("Calculated migration file hash - " + hc.toString)

  hc.toString
}

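This all works fine until the code is deployed to different environments, where the files sit at different absolute paths. I suspect the hash takes the path into account as well. What is the best way to compute a reliable hash of a file's content, one that gives the same result as long as the content stays the same?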

Thanks,

Answer 1

A careful read of the source code at https://github.com/google/guava/blob/master/guava/src/com/google/common/io/Files.java shows that only the file content is hashed; the path does not come into play.

  public static HashCode hash(File file, HashFunction hashFunction) throws IOException {
    return asByteSource(file).hash(hashFunction);
  }
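One way to check this locally is sketched below. It is only an illustration, not the asker's code: it uses the JDK's `MessageDigest` instead of Guava so that it runs without extra dependencies, and the object name `ContentHashSketch`, the helper names and the temporary directory prefixes are all made up for the example. It writes the same bytes under two different directories and compares the resulting hashes.

```scala
import java.nio.file.{Files, Path}
import java.security.MessageDigest

// Hypothetical helper object, for illustration only.
object ContentHashSketch {

  // Hex-encoded MD5 of the file's bytes. The path is used only to
  // read the file and is never fed into the digest.
  def md5Hex(path: Path): String = {
    val digest = MessageDigest.getInstance("MD5").digest(Files.readAllBytes(path))
    digest.map(b => "%02x".format(b & 0xff)).mkString
  }

  // Writes the same bytes under two different temporary directories
  // and reports whether the two hashes agree.
  def sameHashInDifferentDirs(content: Array[Byte]): Boolean = {
    val a = Files.createTempDirectory("env-a").resolve("Migration.class")
    val b = Files.createTempDirectory("env-b").resolve("Migration.class")
    Files.write(a, content)
    Files.write(b, content)
    md5Hex(a) == md5Hex(b)
  }
}
```

Calling `ContentHashSketch.sameHashInDifferentDirs(bytes)` with any byte array should return `true`, since both files hold identical content even though their absolute paths differ.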

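So you do not need to worry about where the file is located. The real question is why you get different hashes on different file systems. Perhaps you should compare the size and content of the files to make sure that, for example, no compound EOLs (such as CRLF) were introduced.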
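The size and content comparison suggested above can be sketched as follows. This is a hedged example with made-up names (`EolCheckSketch`, `compare`, `normalizeEol`); it again uses the JDK's `MessageDigest` rather than Guava, and it assumes you can get hold of the bytes of both copies of the file. The EOL normalization only makes sense for text files, not for compiled `.class` files.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.security.MessageDigest
import java.util.Arrays

// Hypothetical helper object, for illustration only.
object EolCheckSketch {

  // Hex-encoded MD5 of a byte array.
  def md5Hex(bytes: Array[Byte]): String =
    MessageDigest.getInstance("MD5").digest(bytes).map(b => "%02x".format(b & 0xff)).mkString

  // Describes how two byte arrays differ: by size, by content only, or not at all.
  def compare(a: Array[Byte], b: Array[Byte]): String =
    if (a.length != b.length) s"sizes differ: ${a.length} vs ${b.length}"
    else if (!Arrays.equals(a, b)) "same size, different content"
    else "identical"

  // Collapses CRLF line endings to LF (text files only).
  def normalizeEol(text: String): String = text.replace("\r\n", "\n")

  def main(args: Array[String]): Unit = {
    val lf = "a\nb\n".getBytes(UTF_8)
    val crlf = "a\r\nb\r\n".getBytes(UTF_8)
    println(compare(lf, crlf))            // the CRLF copy is two bytes longer
    println(md5Hex(lf) == md5Hex(crlf))   // the hashes no longer match
  }
}
```

If the sizes differ by exactly the number of lines in the file, converted line endings are a likely cause; hashing the text after `normalizeEol` would then give matching values again.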

Solution 1
0 2016-05-18 19:35:01
