
How to calculate the MD5 hash of a file using JavaScript

Is there a way to calculate the MD5 hash of a file with JavaScript before uploading it to the server?

While there are JS implementations of the MD5 algorithm, older browsers are generally unable to read files from the local filesystem.

I wrote that in 2009. So what about new browsers?

With a browser that supports the FileAPI, you can read the contents of a file - the user has to have selected it, either with an <input> element or drag-and-drop. As of Jan 2013, here's how the major browsers stack up:

How?

See the answer below by Benny Neugebauer, which uses the MD5 function of CryptoJS.

I've made a library that implements incremental MD5 in order to hash large files efficiently. Basically, you read a file in chunks (to keep memory usage low) and hash it incrementally. The readme covers basic usage and examples.
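The chunked reading described above can be sketched roughly as follows. This is a minimal illustration, not code from the library: `chunkRanges` and `feedInChunks` are hypothetical helper names, the 2 MB default is an arbitrary choice, and `hasher` is assumed to be any object with an `append(ArrayBuffer)` method (a `SparkMD5.ArrayBuffer` instance would fit). It also assumes a browser recent enough to provide `Blob.prototype.arrayBuffer()`.

```javascript
// Yields [start, end) byte offsets that together cover `size` bytes.
// `chunkSize` is an illustrative parameter, not a value taken from any library.
function* chunkRanges(size, chunkSize) {
  if (!(chunkSize > 0)) {
    throw new RangeError('chunkSize must be a positive number');
  }
  for (let start = 0; start < size; start += chunkSize) {
    yield [start, Math.min(start + chunkSize, size)];
  }
}

// Hypothetical driver: reads one slice at a time and hands it to the hasher,
// so only a single chunk is held in memory at any moment.
async function feedInChunks(blob, hasher, chunkSize = 2 * 1024 * 1024) {
  for (const [start, end] of chunkRanges(blob.size, chunkSize)) {
    hasher.append(await blob.slice(start, end).arrayBuffer());
  }
  return hasher;
}
```

With SparkMD5 loaded, something like `(await feedInChunks(file, new SparkMD5.ArrayBuffer())).end()` should produce the hex digest.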

Be aware that you need the HTML5 FileAPI, so be sure to check for it. There is a full example in the test folder.

https://github.com/satazor/SparkMD5
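One way to check for the FileAPI before hashing might look like the sketch below. The function name `supportsFileApi` is made up for illustration, and the set of features it tests is an assumption about what chunked hashing needs (`File`, `FileReader`, and `Blob.prototype.slice`); adjust it to whatever your code actually calls.

```javascript
// Returns true when the given global-like object exposes the parts of the
// File API that chunked hashing relies on. The scope is passed in explicitly
// so the check can also be exercised outside a browser.
function supportsFileApi(scope) {
  return Boolean(
    scope &&
    typeof scope.File === 'function' &&
    typeof scope.FileReader === 'function' &&
    typeof scope.Blob === 'function' &&
    typeof scope.Blob.prototype.slice === 'function'
  );
}

// In a page you would pass `window`:
// if (!supportsFileApi(window)) { /* fall back to hashing on the server */ }
```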

It is pretty easy to calculate the MD5 hash using the MD5 function of CryptoJS and the HTML5 FileReader API. The following code snippet shows how you can read the binary data and calculate the MD5 hash from an image that has been dragged into your browser:

var holder = document.getElementById('holder');

holder.ondragover = function() {
  return false;
};

holder.ondragend = function() {
  return false;
};

holder.ondrop = function(event) {
  event.preventDefault();

  var file = event.dataTransfer.files[0];
  var reader = new FileReader();

  reader.onload = function(event) {
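    var binary = event.target.result;
    // Parse the binary string as Latin1 so each character maps to one byte;
    // a plain string argument would be interpreted as UTF-8 text.
    var md5 = CryptoJS.MD5(CryptoJS.enc.Latin1.parse(binary)).toString();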
    console.log(md5);
  };

  reader.readAsBinaryString(file);
};

I recommend adding some CSS to make the drag-and-drop area visible:

#holder {
  border: 10px dashed #ccc;
  width: 300px;
  height: 300px;
}

#holder.hover {
  border: 10px dashed #333;
}

More about the drag-and-drop functionality can be found here: File API & FileReader

I tested the sample in Google Chrome Version 32.

HTML5 + spark-md5 and Q

Assuming you're using a modern browser (one that supports the HTML5 File API), here's how you calculate the MD5 hash of a large file (it computes the hash over chunks of a configurable size):

function calculateMD5Hash(file, bufferSize) {
  var def = Q.defer();

  var fileReader = new FileReader();
  var fileSlicer = File.prototype.slice || File.prototype.mozSlice || File.prototype.webkitSlice;
  var hashAlgorithm = new SparkMD5();
  var totalParts = Math.ceil(file.size / bufferSize);
  var currentPart = 0;
  var startTime = new Date().getTime();

  fileReader.onload = function(e) {
    currentPart += 1;

    def.notify({
      currentPart: currentPart,
      totalParts: totalParts
    });

    var buffer = e.target.result;
    hashAlgorithm.appendBinary(buffer);

    if (currentPart < totalParts) {
      processNextPart();
      return;
    }

    def.resolve({
      hashResult: hashAlgorithm.end(),
      duration: new Date().getTime() - startTime
    });
  };

  fileReader.onerror = function(e) {
    def.reject(e);
  };

  function processNextPart() {
    var start = currentPart * bufferSize;
    var end = Math.min(start + bufferSize, file.size);
    fileReader.readAsBinaryString(fileSlicer.call(file, start, end));
  }

  processNextPart();
  return def.promise;
}

function calculate() {
  var input = document.getElementById('file');
  if (!input.files.length) {
    return;
  }

  var file = input.files[0];
  var bufferSize = Math.pow(1024, 2) * 10; // 10MB

  calculateMD5Hash(file, bufferSize).then(
    function(result) {
      // Success
      console.log(result);
    },
    function(err) {
      // There was an error,
    },
    function(progress) {
      // We get notified of the progress as it is executed
      console.log(progress.currentPart, 'of', progress.totalParts, 'Total bytes:', progress.currentPart * bufferSize, 'of', progress.totalParts * bufferSize);
    });
}

<script src="https://cdnjs.cloudflare.com/ajax/libs/q.js/1.4.1/q.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/spark-md5/2.0.2/spark-md5.min.js"></script>

<div>
  <input type="file" id="file"/>
  <input type="button" onclick="calculate();" value="Calculate" class="btn primary" />
</div>

You need to use the FileAPI. It is available in the latest FF & Chrome, but not IE9. Grab any MD5 JS implementation suggested above. I've tried this and abandoned it because JS was too slow (minutes on large image files). I might revisit it if someone rewrites MD5 using typed arrays.

Code would look something like this:

HTML:     
<input type="file" id="file-dialog" multiple="true" accept="image/*">

JS (with jQuery):

$("#file-dialog").change(function() {
  handleFiles(this.files);
});

function handleFiles(files) {
    for (var i = 0; i < files.length; i++) {
        var reader = new FileReader();
        reader.onload = function(event) {
            // Use event.target rather than the shared `reader` variable: with `var`,
            // every callback would otherwise see the reader created last.
            var data = event.target.result;
            var md5 = binl_md5(data, data.length);
            console.log("MD5 is " + md5);
        };
        reader.onerror = function() {
            console.error("Could not read the file");
        };
        reader.readAsBinaryString(files.item(i));
    }
}

Apart from the impossibility of getting file system access in JS, I would not put any trust at all in a client-generated checksum. So generating the checksum on the server is mandatory in any case. – Tomalak Apr 20 '09 at 14:05

Which is useless in most cases. You want the MD5 computed on the client side so that you can compare it with the value recomputed on the server side and conclude that the upload went wrong if they differ. I have needed to do that in applications working with large files of scientific data, where receiving uncorrupted files was key. My case was simple, because users already had the MD5 computed by their data analysis tools, so I just needed to ask them for it with a text field.
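The server-side half of that comparison could look something like the following sketch, assuming a Node.js server that already holds the uploaded bytes in a `Buffer`. The names `md5Hex` and `uploadIsIntact` are hypothetical, and the 32-hex-digit format check is an added assumption about how the client submits the digest.

```javascript
const { createHash } = require('crypto');

// Hex MD5 of the bytes the server actually received.
function md5Hex(buffer) {
  return createHash('md5').update(buffer).digest('hex');
}

// Compares the digest supplied by the client (typed into a text field or
// computed in the browser) with the digest recomputed on the server.
function uploadIsIntact(receivedBytes, claimedMd5) {
  const claimed = String(claimedMd5).trim().toLowerCase();
  return /^[0-9a-f]{32}$/.test(claimed) && md5Hex(receivedBytes) === claimed;
}
```

A mismatch only tells you the bytes differ from what the client hashed; it is a corruption check, not a security guarantee, since MD5 is not collision resistant.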

The following snippet shows an example that can achieve a throughput of 400 MB/s while reading and hashing the file.

It uses a library called hash-wasm, which is based on WebAssembly and calculates the hash faster than JS-only libraries. As of 2020, all modern browsers support WebAssembly.

const chunkSize = 64 * 1024 * 1024;
const fileReader = new FileReader();
let hasher = null;

function hashChunk(chunk) {
  return new Promise((resolve, reject) => {
    fileReader.onload = async(e) => {
      const view = new Uint8Array(e.target.result);
      hasher.update(view);
      resolve();
    };

    fileReader.readAsArrayBuffer(chunk);
  });
}

const readFile = async(file) => {
  if (hasher) {
    hasher.init();
  } else {
    hasher = await hashwasm.createMD5();
  }

  const chunkNumber = Math.floor(file.size / chunkSize);

  for (let i = 0; i <= chunkNumber; i++) {
    const chunk = file.slice(
      chunkSize * i,
      Math.min(chunkSize * (i + 1), file.size)
    );
    await hashChunk(chunk);
  }

  const hash = hasher.digest();
  return Promise.resolve(hash);
};

const fileSelector = document.getElementById("file-input");
const resultElement = document.getElementById("result");

fileSelector.addEventListener("change", async(event) => {
  const file = event.target.files[0];

  resultElement.innerHTML = "Loading...";
  const start = Date.now();
  const hash = await readFile(file);
  const end = Date.now();
  const duration = end - start;
  const fileSizeMB = file.size / 1024 / 1024;
  const throughput = fileSizeMB / (duration / 1000);
  resultElement.innerHTML = `
    Hash: ${hash}<br>
    Duration: ${duration} ms<br>
    Throughput: ${throughput.toFixed(2)} MB/s
  `;
});

<script src="https://cdn.jsdelivr.net/npm/hash-wasm"></script>
<!-- defines the global `hashwasm` variable -->

<input type="file" id="file-input">
<div id="result"></div>

There are a lot of options for getting the hash of a file. The usual problem is that hashing big files is really slow.

I created a little library that hashes a file using only the first 64 KB and the last 64 KB of it.

Live example: http://marcu87.github.com/hashme/ and library: https://github.com/marcu87/hashme
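The head-and-tail idea can be sketched as below. This is not the hashme source: `edgeRanges` and `sampleEdges` are hypothetical names, and hashing short files whole (anything up to 128 KB) is an assumption made so that the two ranges never overlap.

```javascript
const EDGE_BYTES = 64 * 1024;

// Returns the [start, end) ranges to hash: the first and last `edgeBytes`
// of the file. Assumption: files no longer than two edges are hashed whole.
function edgeRanges(size, edgeBytes = EDGE_BYTES) {
  if (size <= 2 * edgeBytes) {
    return [[0, size]];
  }
  return [[0, edgeBytes], [size - edgeBytes, size]];
}

// Builds a Blob holding only the sampled bytes, ready to hand to any hasher.
function sampleEdges(blob, edgeBytes = EDGE_BYTES) {
  const parts = edgeRanges(blob.size, edgeBytes)
    .map(([start, end]) => blob.slice(start, end));
  return new Blob(parts);
}
```

The trade-off is that two files differing only in the middle produce the same value, so this works as a quick fingerprint rather than as an integrity check.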

There are a couple of scripts out there on the internet for creating an MD5 hash.

The one from webtoolkit is good: http://www.webtoolkit.info/javascript-md5.html

However, I don't believe it will have access to the local filesystem, as that access is limited.

Hope you have found a good solution by now. If not, the solution below is an ES6 promise implementation based on js-spark-md5:

import SparkMD5 from 'spark-md5';

// Read in chunks of 2MB
const CHUNK_SIZE = 2097152;

/**
 * Incrementally calculate checksum of a given file based on MD5 algorithm
 */
export const checksum = (file) =>
  new Promise((resolve, reject) => {
    let currentChunk = 0;
    const chunks = Math.ceil(file.size / CHUNK_SIZE);
    const blobSlice =
      File.prototype.slice ||
      File.prototype.mozSlice ||
      File.prototype.webkitSlice;
    const spark = new SparkMD5.ArrayBuffer();
    const fileReader = new FileReader();

    const loadNext = () => {
      const start = currentChunk * CHUNK_SIZE;
      const end =
        start + CHUNK_SIZE >= file.size ? file.size : start + CHUNK_SIZE;

      // Selectively read the file and only store part of it in memory.
      // This allows client-side applications to process huge files without the need for huge memory
      fileReader.readAsArrayBuffer(blobSlice.call(file, start, end));
    };

    fileReader.onload = e => {
      spark.append(e.target.result);
      currentChunk++;

      if (currentChunk < chunks) loadNext();
      else resolve(spark.end());
    };

    fileReader.onerror = () => {
      return reject('Calculating file checksum failed');
    };

    loadNext();
  });

With current HTML5 it should be possible to calculate the MD5 hash of a binary file, but I think the step before that would be to convert the binary data (a BlobBuilder) to a string. I am trying to do this step, but have not been successful.

Here is the code I tried: Converting a BlobBuilder to string, in HTML5 Javascript
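For the conversion step itself, a minimal sketch might look like this, assuming the bytes are already available as a `Uint8Array` (for example from an `ArrayBuffer` read out of a Blob). The helper names are hypothetical, and the output is a "binary string" in which each character code is one byte, the same shape `readAsBinaryString` produces.

```javascript
// Converts raw bytes to a binary string (one character per byte, codes 0-255).
function bytesToBinaryString(bytes) {
  const STEP = 0x8000; // convert in slices to stay under argument-count limits
  let out = '';
  for (let i = 0; i < bytes.length; i += STEP) {
    out += String.fromCharCode.apply(null, bytes.subarray(i, i + STEP));
  }
  return out;
}

// The reverse direction, useful when a hasher expects bytes instead.
function binaryStringToBytes(str) {
  const bytes = new Uint8Array(str.length);
  for (let i = 0; i < str.length; i++) {
    bytes[i] = str.charCodeAt(i) & 0xff;
  }
  return bytes;
}
```

Hashers that accept an `ArrayBuffer` directly avoid this step altogether, which is usually cheaper for large files.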

I don't believe there is a way in JavaScript to access the contents of a file upload, so you cannot look at the file contents to generate an MD5 sum.

You can, however, send the file to the server, which can then send an MD5 sum back or send the file contents back, but that's a lot of work and probably not worthwhile for your purposes.
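If you do go the server route, the hashing side is fairly small. The sketch below assumes a Node.js server and hashes a readable stream (such as an incoming request body) as it arrives, so the file never has to sit in memory in full; `md5OfStream` is a hypothetical helper name.

```javascript
const { createHash } = require('crypto');

// Resolves with the hex MD5 of everything read from `readable`,
// for example an HTTP request body or fs.createReadStream(path).
function md5OfStream(readable) {
  return new Promise((resolve, reject) => {
    const hash = createHash('md5');
    readable.on('data', (chunk) => hash.update(chunk));
    readable.on('end', () => resolve(hash.digest('hex')));
    readable.on('error', reject);
  });
}
```

The resolved string can then be returned to the browser in the upload response.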
