
Comparing the content of files using MD5 in Java

I am currently comparing a list of files using MD5 sums. How can I group similar files into a folder using these hash values? Will the difference between the hashes of two similar files be small?

For example, one file contains the name "HELLO" and another PDF file contains "hello"; the two are more or less the same, so these files need to be grouped together. Will my idea of comparing the hash difference help?

Or is there another approach? Please help me sort this out.

No. The hashes will be completely different, and there will be no correlation between them. You can use hashes if you want to distribute files uniformly across different buckets, but that does not work for grouping similar files.
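To make the point concrete, here is a minimal sketch using the standard `java.security.MessageDigest` API. The class name `Md5GroupingSketch`, the helpers `md5Hex` and `groupByDigest`, and the file names are hypothetical, chosen only for illustration; the file contents are passed in as in-memory byte arrays rather than read from disk. It shows that "HELLO" and "hello" produce unrelated digests, and that grouping by digest only puts byte-identical contents together.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of the original question or answer.
final class Md5GroupingSketch {

    // Returns the MD5 digest of the given bytes as a lowercase hex string.
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Groups names by the digest of their content.
    // Only byte-identical contents end up in the same group.
    static Map<String, List<String>> groupByDigest(Map<String, byte[]> contents) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> e : contents.entrySet()) {
            groups.computeIfAbsent(md5Hex(e.getValue()), k -> new ArrayList<>())
                  .add(e.getKey());
        }
        return groups;
    }

    public static void main(String[] args) {
        // Assumed sample contents standing in for real files.
        Map<String, byte[]> contents = new LinkedHashMap<>();
        contents.put("a.txt", "HELLO".getBytes(StandardCharsets.UTF_8));
        contents.put("b.txt", "hello".getBytes(StandardCharsets.UTF_8));
        contents.put("c.txt", "hello".getBytes(StandardCharsets.UTF_8));

        // a.txt lands in its own group; b.txt and c.txt share one.
        groupByDigest(contents).forEach((digest, names) ->
                System.out.println(digest + " -> " + names));
    }
}
```

Because the two digests share no usable structure, a "distance" between them says nothing about how close the contents are. Digests are suited to detecting exact duplicates or spreading items across buckets, as the answer notes.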
