
Best way to tell if two files are the same?

If I transfer a file over the internet from computer A to computer B in C#, using one of the many ways to do file transfers in .NET, what is the best way to tell whether the file on computer A and the copy on computer B are the same?

I am thinking that MD5 hashes would be a good way to tell; that seems to be a widely accepted approach. However, I am double-checking that there is not a better way hidden somewhere in the .NET Framework.

Thank you, Tony

MD5 is the way to go.
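A minimal sketch of that approach, assuming the sender computes the digest on computer A and passes it along with the file as a hex string. The class and method names (`FileHashCheck`, `Md5Hex`, `MatchesSenderDigest`) are made up for illustration; only `MD5.Create()` and `ComputeHash(Stream)` come from .NET.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class FileHashCheck
{
    // Returns the MD5 digest of a file as a lowercase hex string.
    public static string Md5Hex(string path)
    {
        using (var md5 = MD5.Create())
        using (var stream = File.OpenRead(path))
        {
            // ComputeHash(Stream) reads the stream in chunks, so the
            // whole file is never loaded into memory at once.
            byte[] hash = md5.ComputeHash(stream);
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }

    // True when the digest computed on this machine equals the one
    // reported by the sender (hypothetical hex string sent with the file).
    public static bool MatchesSenderDigest(string localPath, string senderHex)
    {
        return string.Equals(Md5Hex(localPath), senderHex,
                             StringComparison.OrdinalIgnoreCase);
    }
}
```

Computer B would call `FileHashCheck.MatchesSenderDigest(receivedPath, digestFromA)` after the transfer completes and request the file again if it returns false.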

CRC32 or Adler32, which are a lot faster than MD5. You should use MD5 if you need to check whether the file was manipulated with malicious intent. If there is no need for that, then it's overkill.
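For reference, a rough sketch of a table-driven CRC32 (the reflected polynomial 0xEDB88320 used by zip and gzip), assuming no library CRC32 class is at hand. The `Crc32` class here is written for illustration and is not a .NET type.

```csharp
using System;
using System.IO;

public static class Crc32
{
    // Lookup table for the reflected CRC-32 polynomial 0xEDB88320.
    private static readonly uint[] Table = BuildTable();

    private static uint[] BuildTable()
    {
        var table = new uint[256];
        for (uint i = 0; i < 256; i++)
        {
            uint c = i;
            for (int k = 0; k < 8; k++)
                c = (c & 1) != 0 ? 0xEDB88320u ^ (c >> 1) : c >> 1;
            table[i] = c;
        }
        return table;
    }

    // Computes the CRC-32 of everything remaining in the stream.
    public static uint Compute(Stream stream)
    {
        uint crc = 0xFFFFFFFFu;
        var buffer = new byte[81920];
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
                crc = Table[(crc ^ buffer[i]) & 0xFF] ^ (crc >> 8);
        }
        return crc ^ 0xFFFFFFFFu;
    }
}
```

Running `Crc32.Compute(File.OpenRead(path))` on both machines and comparing the two 32-bit values catches accidental corruption; as noted above, it offers no protection against deliberate tampering.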

SHA2 (SHA256, SHA512) algorithms are better than MD5 for several reasons:

  • They are more resistant to collisions, an important concern for large files. While MD5 can detect changes to the content of a file, it is more likely that two large files end up with the same MD5.
  • They are FASTER to compute. This may seem strange, but SHA algorithms are accelerated both by chipsets and by OS implementations. The algorithms themselves are easier to parallelize as well. As a result, the native implementations of the SHA or SHA2 algorithms in Vista+ are much faster than the equivalent MD5 algorithm.
  • They use a larger block size, which means they can work on larger blocks of a file at a time. I/O time can add up when processing large files.

The native implementations in .NET are SHA256Cng, SHA384Cng and SHA512Cng. Instead of instantiating them explicitly, you can define them as the default algorithm to use when hashing, using the <cryptoClass> element in your configuration file.

After you do that, you can just write HashAlgorithm.Create() or SHA256.Create() to create the native instance.
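As a sketch of what the calling code could look like, assuming both files (or their digests) are available to the code doing the comparison. `Sha256FileCompare` and its methods are illustrative names; which concrete implementation `SHA256.Create()` returns depends on the runtime and on the configuration described above.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class Sha256FileCompare
{
    // Hashes a file with whatever SHA256 implementation the runtime provides.
    public static byte[] HashFile(string path)
    {
        using (var sha = SHA256.Create())
        using (var stream = File.OpenRead(path))
        {
            return sha.ComputeHash(stream);
        }
    }

    // Compares two digests byte by byte.
    public static bool SameDigest(byte[] a, byte[] b)
    {
        if (a == null || b == null || a.Length != b.Length) return false;
        for (int i = 0; i < a.Length; i++)
        {
            if (a[i] != b[i]) return false;
        }
        return true;
    }

    // Formats a digest as lowercase hex, e.g. for sending alongside the file.
    public static string ToHex(byte[] digest)
    {
        return BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant();
    }
}
```

In the transfer scenario from the question, computer A would send `ToHex(HashFile(path))` with the file, and computer B would recompute the digest on the received copy and compare the two.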
