Is it possible to compare a binary file in C#?

I want to replace a binary file if its contents are different.

So I need to be able to compare the binary file (without having to deserialize it).

Is this possible?

I used BinaryFormatter to save the file.

Yes, it is possible.

You need to read the files in order to compare them, if that is what you are asking.

The pseudo-code would be:

  • Open file1 and file2 as streams.
  • Start by comparing the lengths; if they are not equal, the files are not equal.
  • Read a chunk of each file into a buffer, and compare the buffers. Repeat until you encounter a difference or reach the end of the file.
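
The steps above could be sketched roughly as follows. This is only one possible reading of the pseudo-code: the class name FileComparer, the helper ReadFully, and the 64 KB default buffer size are assumptions made for illustration, not part of the original answer. The helper is there because a single Stream.Read call may return fewer bytes than requested.

```csharp
using System;
using System.IO;

public static class FileComparer
{
    // Hypothetical helper: reads until the buffer is full or the stream ends,
    // and returns the number of bytes placed in the buffer.
    private static int ReadFully(Stream stream, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int read = stream.Read(buffer, total, buffer.Length - total);
            if (read == 0) break; // end of stream
            total += read;
        }
        return total;
    }

    public static bool FilesAreEqual(string path1, string path2, int bufferSize = 64 * 1024)
    {
        using (var fs1 = new FileStream(path1, FileMode.Open, FileAccess.Read))
        using (var fs2 = new FileStream(path2, FileMode.Open, FileAccess.Read))
        {
            // Step 1: files of different length cannot be equal.
            if (fs1.Length != fs2.Length) return false;

            var buffer1 = new byte[bufferSize];
            var buffer2 = new byte[bufferSize];

            // Step 2: compare chunk by chunk, stopping at the first difference.
            while (true)
            {
                int count1 = ReadFully(fs1, buffer1);
                int count2 = ReadFully(fs2, buffer2);
                if (count1 != count2) return false;
                if (count1 == 0) return true; // both streams are exhausted

                for (int i = 0; i < count1; i++)
                {
                    if (buffer1[i] != buffer2[i]) return false;
                }
            }
        }
    }
}
```

Calling FileComparer.FilesAreEqual(oldPath, newPath) before overwriting a file would then tell you whether the replacement is needed at all.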

If you need to compare the same file to a bunch of other files, it can be useful to calculate the hash of the first file once. Then just calculate the hash of each of the other files, and compare the hashes.
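
As a rough illustration of the one-against-many case, the sketch below hashes the reference file once and then hashes each candidate. The names HashComparer, ComputeFileHash, and FindMatches are made up for this example, and SHA-256 is an assumption (the answer does not name an algorithm). Keep in mind that equal hashes make equal contents extremely likely, not certain.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

public static class HashComparer
{
    // Hashes a file by streaming it, so the whole file is never held in memory.
    public static byte[] ComputeFileHash(string path)
    {
        using (var sha = SHA256.Create())
        using (var stream = File.OpenRead(path))
        {
            return sha.ComputeHash(stream);
        }
    }

    // Returns the candidate paths whose hash equals the reference file's hash.
    public static List<string> FindMatches(string referencePath, IEnumerable<string> candidatePaths)
    {
        byte[] referenceHash = ComputeFileHash(referencePath); // computed only once

        return candidatePaths
            .Where(p => ComputeFileHash(p).SequenceEqual(referenceHash))
            .ToList();
    }
}
```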

Yes, you can generate the MD5 or SHA1 hash for each set of file data and then compare them.

Sample code (error checking removed for clarity):

public bool CompareFiles(string filePath1, string filePath2)
{
  // Requires: using System.IO; using System.Linq; using System.Security.Cryptography;

  // File.ReadAllBytes reads each file completely. A single FileStream.Read
  // call may return fewer bytes than requested, and its count argument is
  // an int, so FileInfo.Length (a long) cannot be passed to it directly.
  byte[] data1 = File.ReadAllBytes(filePath1);
  byte[] data2 = File.ReadAllBytes(filePath2);

  using (SHA1 sha = SHA1.Create())
  {
    byte[] hash1 = sha.ComputeHash(data1);
    byte[] hash2 = sha.ComputeHash(data2);

    // C# 2 or earlier: you need to compare the hash bytes yourself

    // .NET 3.5 or later: SequenceEqual from System.Linq
    bool result = hash1.SequenceEqual(hash2);

    return result;
  }
}

byte[] myFile = File.ReadAllBytes(pathToFile);

Then loop through it. This might be slow if the file is large.
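
A minimal sketch of that idea, assuming both files fit comfortably in memory, could look like the following. The class name InMemoryComparer and the method name are invented for illustration.

```csharp
using System.IO;

public static class InMemoryComparer
{
    public static bool FilesAreEqual(string path1, string path2)
    {
        // Load both files entirely into memory.
        byte[] data1 = File.ReadAllBytes(path1);
        byte[] data2 = File.ReadAllBytes(path2);

        // Arrays of different length cannot hold the same content.
        if (data1.Length != data2.Length) return false;

        // Loop through the bytes and stop at the first mismatch.
        for (int i = 0; i < data1.Length; i++)
        {
            if (data1[i] != data2[i]) return false;
        }
        return true;
    }
}
```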

Perhaps you should look for a file MD5 hash algorithm.

You can read the binary content of the files and compare the bytes you get. To read a file you can either use ReadAllBytes (if the file is reasonably sized and will fit into memory comfortably) or use FileStream and read chunks of data from both files.

The structure of the approach using buffers might look like this:

byte[] buffer1 = new byte[1024], buffer2 = new byte[1024];
using (var fs1 = new FileStream(firstFile, FileMode.Open, FileAccess.Read))
using (var fs2 = new FileStream(secondFile, FileMode.Open, FileAccess.Read))
{
  // Use fs1.Read(buffer1, 0, 1024) and fs2.Read(buffer2, 0, 1024) to repeatedly
  // read up to 1 KB of data from each file, then compare the content of
  // buffer1 and buffer2. Read returns the number of bytes actually read.
}

Some people recommended using hashes, but that's not a good idea: if the files are the same, you'll need to read all the data from both files anyway, so calculating hashes isn't more efficient than simply reading and comparing all the data. However, if the files differ in the first few bytes, you'll need to read only the first few bytes (if comparing byte by byte)!

Hashes would be useful if you wanted to compare multiple files (e.g. each with each).
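
To make the each-with-each case concrete, here is a hedged sketch that hashes every file once and groups the paths by hash, instead of comparing every pair directly. DuplicateFinder and GroupByContentHash are hypothetical names, and SHA-256 is an assumed choice of algorithm. Files that land in the same group are very likely identical; a byte-by-byte check can confirm it if certainty matters.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

public static class DuplicateFinder
{
    // Returns groups (of two or more paths) whose contents hash to the same value.
    public static List<List<string>> GroupByContentHash(IEnumerable<string> paths)
    {
        var groups = new Dictionary<string, List<string>>();

        using (var sha = SHA256.Create())
        {
            foreach (string path in paths)
            {
                // Each file is read and hashed exactly once.
                string key;
                using (var stream = File.OpenRead(path))
                {
                    key = Convert.ToBase64String(sha.ComputeHash(stream));
                }

                List<string> group;
                if (!groups.TryGetValue(key, out group))
                {
                    group = new List<string>();
                    groups[key] = group;
                }
                group.Add(path);
            }
        }

        // Keep only hashes shared by more than one file.
        return groups.Values.Where(g => g.Count > 1).ToList();
    }
}
```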

Here is a function to do it, unless someone else can provide a better way to compare byte arrays.

private static bool CompareFiles(string file1, string file2)
{
    // The using blocks close both streams and release the hash object.
    using (var fsFile1 = new System.IO.FileStream(file1, System.IO.FileMode.Open, System.IO.FileAccess.Read))
    using (var fsFile2 = new System.IO.FileStream(file2, System.IO.FileMode.Open, System.IO.FileAccess.Read))
    using (var md5 = System.Security.Cryptography.MD5.Create())
    {
        var md5File1 = md5.ComputeHash(fsFile1);
        var md5File2 = md5.ComputeHash(fsFile2);
        // Compare the two hashes byte by byte.
        for (int i = 0; i < md5File1.Length; ++i)
        {
            if (md5File1[i] != md5File2[i])
                return false;
        }
        return true;
    }
}
