
Using scalaz-stream to calculate a digest

So I was wondering how I might use scalaz-stream to generate the digest of a file using java.security.MessageDigest?

I would like to do this using a constant memory buffer size (for example 4 KB). I think I understand how to start reading the file, but I am struggling to understand how to:

1) call digest.update(buf) for each 4 KB chunk, which is effectively a side effect on the Java MessageDigest instance, and which I guess should happen inside the scalaz-stream framework;

2) finally call digest.digest() to somehow receive the calculated digest back from within the scalaz-stream framework?
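In plain imperative terms, the two steps above amount to the following loop; this is a stdlib-only sketch (the helper name and file path are hypothetical) of exactly what the stream version needs to express:

```scala
import java.io.{BufferedInputStream, FileInputStream}
import java.security.MessageDigest

// Hypothetical helper: the imperative equivalent of the streaming pipeline.
def digestFile(path: String, bufSize: Int = 4096): Array[Byte] = {
  val md = MessageDigest.getInstance("SHA-256")
  val in = new BufferedInputStream(new FileInputStream(path))
  try {
    val buf = new Array[Byte](bufSize)
    var n = in.read(buf)
    while (n != -1) {
      md.update(buf, 0, n) // step 1: side-effecting update per chunk
      n = in.read(buf)
    }
  } finally in.close()
  md.digest()              // step 2: final digest after the last chunk
}
```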

I think I kind of understand how to start:

import scalaz.stream._
import java.security.MessageDigest

val f = "/a/b/myfile.bin"
val bufSize = 4096

val digest = MessageDigest.getInstance("SHA-256")

Process.constant(bufSize).toSource
  .through(io.fileChunkR(f, bufSize))

But then I am stuck! Any hints, please? I guess it must also be possible to wrap the creation, update, retrieval (of the actual digest calculation) and destruction of the digest object in a scalaz-stream Sink or something, and then call .to() passing in that Sink? Sorry if I am using the wrong terminology; I am completely new to scalaz-stream. I have been through a few of the examples but am still struggling.

Since version 0.4, scalaz-stream contains processes to calculate digests. They are available in the hash module and use java.security.MessageDigest under the hood. Here is a minimal example of how you could use them:

import scalaz.concurrent.Task
import scalaz.stream._

object Sha1Sum extends App {
  val fileName = "testdata/celsius.txt"
  val bufferSize = 4096

  val sha1sum: Task[Option[String]] =
    Process.constant(bufferSize)
      .toSource
      .through(io.fileChunkR(fileName, bufferSize))
      .pipe(hash.sha1)
      .map(sum => s"${sum.toHex}  $fileName")
      .runLast

  sha1sum.run.foreach(println)
}

The update() and digest() calls are all contained inside the hash.sha1 Process1.
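For reference, the sum.toHex call above is just the usual bytes-to-lowercase-hex rendering. A stdlib-only sketch of the same formatting (useful for comparing the output against the sha1sum command-line tool; the helper name is an assumption, not part of any library):

```scala
import java.security.MessageDigest

// Render a digest as lowercase hex, like scodec's ByteVector.toHex.
// Java's Formatter treats a negative Byte as unsigned for the 'x' conversion.
def toHex(bytes: Array[Byte]): String =
  bytes.map(b => f"$b%02x").mkString

// SHA-1 of empty input is the well-known da39a3ee5e6b4b0d3255bfef95601890afd80709
val sha1OfEmpty = MessageDigest.getInstance("SHA-1").digest(Array.emptyByteArray)
```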

So I have something working, but it could probably be improved:

import java.security.MessageDigest
import scodec.bits.ByteVector
import scalaz.concurrent.Task
import scalaz.stream._
import scalaz.stream.io._

val f = "/a/b/myfile.bin"
val bufSize = 4096

val md = MessageDigest.getInstance("SHA-256")

// A Sink that pushes every chunk into the given MessageDigest as a side effect.
def _digestResource(md: => MessageDigest): Sink[Task, ByteVector] =
  resource(Task.delay(md))(_ => Task.delay(()))(
    md => Task.now((bytes: ByteVector) => Task.delay(md.update(bytes.toArray))))

Process.constant(bufSize).toSource
  .through(fileChunkR(f, bufSize)) // f is already a path String
  .to(_digestResource(md))
  .run
  .run

md.digest()

However, it seems to me that there should be a cleaner way to do this, by moving the creation of the MessageDigest inside the scalaz-stream machinery and having the final .run yield the md.digest().
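That encapsulation is essentially what the hash module added in 0.4 provides. As a stdlib-only illustration of the idea (the function name is hypothetical, and an Iterator stands in for the stream of chunks), the point is that creation, updates, and the final read are all local to one function, so no mutable MessageDigest leaks into the surrounding scope:

```scala
import java.security.MessageDigest

// Sketch: fold a stream of chunks into one digest value, keeping the
// mutable MessageDigest private so nothing outside can observe or reuse it.
def digestChunks(algorithm: String)(chunks: Iterator[Array[Byte]]): Array[Byte] = {
  val md = MessageDigest.getInstance(algorithm) // created inside, like hash.sha1
  chunks.foreach(c => md.update(c))             // one update per chunk
  md.digest()                                   // the "final .run yields the digest" step
}
```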

Better answers welcome...
