
Using scalaz-stream to calculate a digest

So I was wondering how I might use scalaz-stream to generate the digest of a file using java.security.MessageDigest?

I would like to do this using a constant memory buffer size (for example 4 KB). I think I understand how to start reading the file, but I am struggling to understand how to:

1) call digest.update(buf) for each 4 KB chunk, which is effectively a side effect on the Java MessageDigest instance, and which I guess should happen inside the scalaz-stream framework;

2) finally call digest.digest() to somehow receive the calculated digest back from within the scalaz-stream framework?
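In plain imperative terms, the two steps above amount to the following loop; this is a stdlib-only sketch (the helper name and file path are hypothetical) of exactly what the stream version needs to express:

```scala
import java.io.{BufferedInputStream, FileInputStream}
import java.security.MessageDigest

// Hypothetical helper: the imperative equivalent of the streaming pipeline.
def digestFile(path: String, bufSize: Int = 4096): Array[Byte] = {
  val md = MessageDigest.getInstance("SHA-256")
  val in = new BufferedInputStream(new FileInputStream(path))
  try {
    val buf = new Array[Byte](bufSize)
    var n = in.read(buf)
    while (n != -1) {
      md.update(buf, 0, n) // step 1: side-effecting update per chunk
      n = in.read(buf)
    }
  } finally in.close()
  md.digest()              // step 2: final digest after the last chunk
}
```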

I think I kind of understand how to start:

import scalaz.stream._
import java.security.MessageDigest

val f = "/a/b/myfile.bin"
val bufSize = 4096

val digest = MessageDigest.getInstance("SHA-256")

Process.constant(bufSize).toSource
  .through(io.fileChunkR(f, bufSize))

But then I am stuck! Any hints, please? I guess it must also be possible to wrap the creation, update, retrieval (of the actual digest calculation) and destruction of the digest object in a scalaz-stream Sink or something, and then call .to() passing in that Sink? Sorry if I am using the wrong terminology; I am completely new to scalaz-stream. I have been through a few of the examples but am still struggling.

Since version 0.4, scalaz-stream contains processes to calculate digests. They are available in the hash module and use java.security.MessageDigest under the hood. Here is a minimal example of how you could use them:

import scalaz.concurrent.Task
import scalaz.stream._

object Sha1Sum extends App {
  val fileName = "testdata/celsius.txt"
  val bufferSize = 4096

  val sha1sum: Task[Option[String]] =
    Process.constant(bufferSize)
      .toSource
      .through(io.fileChunkR(fileName, bufferSize))
      .pipe(hash.sha1)
      .map(sum => s"${sum.toHex}  $fileName")
      .runLast

  sha1sum.run.foreach(println)
}

The update() and digest() calls are all contained inside the hash.sha1 Process1.
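For reference, the sum.toHex call above is just the usual bytes-to-lowercase-hex rendering. A stdlib-only sketch of the same formatting (useful for comparing the output against the sha1sum command-line tool; the helper name is an assumption, not part of any library):

```scala
import java.security.MessageDigest

// Render a digest as lowercase hex, like scodec's ByteVector.toHex.
// Java's Formatter treats a negative Byte as unsigned for the 'x' conversion.
def toHex(bytes: Array[Byte]): String =
  bytes.map(b => f"$b%02x").mkString

// SHA-1 of empty input is the well-known da39a3ee5e6b4b0d3255bfef95601890afd80709
val sha1OfEmpty = MessageDigest.getInstance("SHA-1").digest(Array.emptyByteArray)
```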

So I have something working, but it could probably be improved:

import java.security.MessageDigest
import scodec.bits.ByteVector
import scalaz.concurrent.Task
import scalaz.stream._
import scalaz.stream.io._

val f = "/a/b/myfile.bin"
val bufSize = 4096

val md = MessageDigest.getInstance("SHA-256")

// A Sink that pushes every chunk into the given MessageDigest as a side effect.
def _digestResource(md: => MessageDigest): Sink[Task, ByteVector] =
  resource(Task.delay(md))(_ => Task.delay(()))(
    md => Task.now((bytes: ByteVector) => Task.delay(md.update(bytes.toArray))))

Process.constant(bufSize).toSource
  .through(fileChunkR(f, bufSize)) // f is already a path String
  .to(_digestResource(md))
  .run
  .run

md.digest()

However, it seems to me that there should be a cleaner way to do this, by moving the creation of the MessageDigest inside the scalaz-stream machinery and having the final .run yield the md.digest().
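That encapsulation is essentially what the hash module added in 0.4 provides. As a stdlib-only illustration of the idea (the function name is hypothetical, and an Iterator stands in for the stream of chunks), the point is that creation, updates, and the final read are all local to one function, so no mutable MessageDigest leaks into the surrounding scope:

```scala
import java.security.MessageDigest

// Sketch: fold a stream of chunks into one digest value, keeping the
// mutable MessageDigest private so nothing outside can observe or reuse it.
def digestChunks(algorithm: String)(chunks: Iterator[Array[Byte]]): Array[Byte] = {
  val md = MessageDigest.getInstance(algorithm) // created inside, like hash.sha1
  chunks.foreach(c => md.update(c))             // one update per chunk
  md.digest()                                   // the "final .run yields the digest" step
}
```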

Better answers welcome...
