Using MD5 in SQL Server 2005 to checksum a file stored in a varbinary

I'm trying to do an MD5 check on a file uploaded to a varbinary field in MSSQL 2005.

I uploaded the file, and using

SELECT DATALENGTH(thefile) FROM table

I get the same number of bytes that the file has.

But using an MD5 calculator (from Bullzip) I get this MD5:

20cb960d7b191d0c8bc390d135f63624

and using SQL I get this MD5:

44c29edb103a2872f519ad0c9a0fdaaa

Why are they different if the field has the same length and so, I assume, the same bytes?

My SQL code to do that was:

DECLARE @HashThis varbinary;
DECLARE @md5text varchar(250);
SELECT  @HashThis = thefile FROM CFile WHERE id=1;

SET @md5text = SUBSTRING(sys.fn_sqlvarbasetostr(HASHBYTES('MD5',@HashThis)),3,32)
PRINT @md5text;

Maybe the data type conversion?

Any tip will be helpful.

Thanks a lot :)

Two options:

  1. A VARBINARY variable declared without a size modifier is a VARBINARY(1), so you are hashing only the very first byte of the file. SELECT DATALENGTH(@HashThis) after the assignment will return 1.
  2. If you use varbinary(MAX) instead, keep in mind that HASHBYTES is limited to 8000 bytes of input, so a larger file will not be hashed in full.
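
The first point can be checked without touching the table. The following is only a minimal sketch: the variable names and the five-byte sample value are made up for illustration, and it assumes nothing beyond the built-in DATALENGTH and HASHBYTES functions already used in the question.

```sql
-- A varbinary variable declared without a length is varbinary(1);
-- assigning a longer value silently truncates it to one byte.
DECLARE @NoSize varbinary;        -- same as varbinary(1)
DECLARE @Sized  varbinary(8000);  -- explicit length

SET @NoSize = 0x0102030405;       -- illustrative sample bytes
SET @Sized  = 0x0102030405;

SELECT DATALENGTH(@NoSize) AS NoSizeBytes,  -- 1
       DATALENGTH(@Sized)  AS SizedBytes;   -- 5

-- Only the first byte survives in @NoSize, so the two hashes differ.
SELECT HASHBYTES('MD5', @NoSize) AS HashOfFirstByte,
       HASHBYTES('MD5', @Sized)  AS HashOfAllBytes;
```

Running the same DATALENGTH check on @HashThis from the question should show whether the variable, rather than the stored column, is where the bytes are lost.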

If you want to hash more than 8000 bytes, write your own CLR hash function. For example, the file below is from my SQL Server project; it gives the same results as other hash functions outside of SQL Server:

using System;
using System.Data.SqlTypes;
using System.IO;
using System.Security.Cryptography;

namespace ClrHelpers
{
    public partial class UserDefinedFunctions {
        // Computes the MD5 of the whole input by streaming it in 8 KB chunks,
        // so the 8000-byte input limit of HASHBYTES does not apply.
        [Microsoft.SqlServer.Server.SqlFunction]
        public static Guid HashMD5(SqlBytes data) {
            MD5CryptoServiceProvider md5 = new MD5CryptoServiceProvider();
            md5.Initialize();
            byte[] b = new byte[8192];
            Stream s = data.Stream;
            int len;
            // Feed each chunk into the running hash until the stream is exhausted.
            while ((len = s.Read(b, 0, b.Length)) > 0) {
                md5.TransformBlock(b, 0, len, b, 0);
            }
            // Finish the hash with an empty final block.
            md5.TransformFinalBlock(b, 0, 0);
            // The 16 hash bytes are returned packed into a Guid.
            return new Guid(md5.Hash);
        }
    }
}
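
Since the function returns its 16 hash bytes wrapped in a Guid, the value needs to be turned back into binary before it can be compared with a hex string from an external tool. A possible way to call it is sketched below. It assumes the assembly has already been registered and the function exposed as dbo.HashMD5 (a hypothetical name, not given in the answer), and that casting the uniqueidentifier to binary(16) gives back the original hash bytes in order; worth verifying against a file with a known MD5.

```sql
-- dbo.HashMD5 is an assumed name for the registered CLR function above.
-- CAST(... AS binary(16)) unwraps the uniqueidentifier into the raw hash bytes;
-- the SUBSTRING drops the leading '0x', as in the question's own query.
SELECT SUBSTRING(
           sys.fn_sqlvarbasetostr(CAST(dbo.HashMD5(thefile) AS binary(16))),
           3, 32) AS md5text
FROM CFile
WHERE id = 1;
```

Printing the Guid directly is likely to give a different-looking string, because the textual form of a Guid shows its first three groups in reversed byte order.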

It may be that the MD5 calculator is hashing the file content plus other properties (e.g. author, last processed date, etc.). You could try altering these properties and making another hash to see whether the result stays the same (comparing before and after, using only the MD5 calculator).

Another possibility concerns what you are really saving in SQL Server..

So it's quite clear that the MD5 calculator and SQL Server are hashing different things. Which things? Congrats to whoever answers that :)
