
How to implement receiveAvailable transducer in scalaz-stream

Short Version:

I would like to implement a function that returns a transducer that waits for a block of values to be "emitted".

The function I have in mind would have the following signature:

/**
 * The `Process1` which awaits the next "effect" to occur and passes all values emitted by
 * this effect to `rcv` to determine the next state.
 */
def receiveBlock[I, O](rcv: Vector[I] => Process1[I,O]): Process1[I,O] = ???

Details:

My understanding is that I could then use this function to implement the following function, which I think would be quite useful:

/**
 * Groups inputs into chunks of dynamic size based on the various effects
 * that back emitted values.
 *
 * @example {{{
 * val numberTask = Task.delay(1)
 * val listOfNumbersTask = Task.delay(List(5,6,7))
 * val sample = Process.eval(numberTask) ++ Process(2,3,4) ++ Process.await(listOfNumbersTask)(xs => Process.emitAll(xs))
 * sample.chunkByEffect.runLog.run should be List(Vector(1), Vector(2,3,4), Vector(5,6,7))
 * }}}
 */
def chunkByEffect[I]: Process1[I, Vector[I]] = {
  receiveBlock(vec => emit(vec) ++ chunkByEffect)
}

[Update] More Details

My ultimate objective (slightly simplified) is to implement the following function:

/**
 * Transforms a stream of audio into a stream of text.
 */
def voiceRecognition(audio: Process[Task, Byte]): Process[Task, String]

The function makes an external call to a voice recognition service, so it is unreasonable to make a network call for every single Byte in the stream. I need to chunk bytes together before making a network call. I could make audio a Process[Task, ByteVector], but that would require the testing code to know the maximum chunk size the function supports; I would rather that be managed by the function itself. Also, when this service is used inside another service, that service will itself be receiving network calls with a given size of audio, and I would like the chunkXXX function to be smart about chunking so that it does not hold onto data that is already available.

Basically, the stream of audio coming from the network will have the form Process[Task, ByteVector] and will be translated into a Process[Task, Byte] by flatMap(Process.emitAll(_)). However, the test code will directly produce a Process[Task, Byte] and feed that into voiceRecognition. In theory, I believe it should be possible, given the appropriate combinator, to provide an implementation of voiceRecognition that does the right thing with both of these streams, and I think the chunkByEffect function described above is the key to that. I realize now that I would need the chunkByEffect function to take min and max parameters that specify the minimum and maximum chunk size, irrespective of the underlying Task producing the bytes. A sketch of both pieces follows below.
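To make the two entry points concrete, here is a minimal sketch of the flattening described above together with a hypothetical signature for the min/max variant of chunkByEffect. The names network and bytes, the toIndexedSeq conversion, and the min/max parameters are illustrative assumptions, not existing API:

import scalaz.concurrent.Task
import scalaz.stream.{Process, Process1}
import scodec.bits.ByteVector

// Illustrative only: flatten the network stream of ByteVector into raw bytes.
val network: Process[Task, ByteVector] = ???
val bytes: Process[Task, Byte] = network.flatMap(bv => Process.emitAll(bv.toIndexedSeq))

// Hypothetical variant of chunkByEffect that also bounds chunk sizes,
// regardless of how the underlying Task delivers the bytes.
def chunkByEffect[I](min: Int, max: Int): Process1[I, Vector[I]] = ???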

You need to have your bytes separated somehow. I suggest working with some higher-level abstraction over the stream of bytes, i.e. ByteVector.

Then you will perhaps have to write a manual process1, implemented similarly to process1.chunkBy except that it operates on ByteVector, i.e.:

def chunkBy(separator:ByteVector): Process1[ByteVector, ByteVector] = {
  def go(acc: ByteVector): Process1[ByteVector, ByteVector] =
    receive1Or[ByteVector,ByteVector](emit(acc)) { i =>
       // implement searching of separator in accumulated + new bytes
       ???
    }
  go(ByteVector.empty)
}
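One possible way to fill in the search step is sketched below. It assumes scodec-bits' ByteVector.indexOfSlice, which returns a negative index when the separator is not present; treat it as a sketch under those assumptions rather than a tested implementation:

import scalaz.stream.Process1
import scalaz.stream.Process.{emit, emitAll}
import scalaz.stream.process1.receive1Or
import scodec.bits.ByteVector

// Sketch only: split the accumulated bytes on every full occurrence of
// `separator`, emit the complete chunks, and carry the remainder forward.
def chunkBy(separator: ByteVector): Process1[ByteVector, ByteVector] = {
  def go(acc: ByteVector): Process1[ByteVector, ByteVector] =
    receive1Or[ByteVector, ByteVector](emit(acc)) { i =>   // at end of input, emit whatever is left
      @annotation.tailrec
      def split(buf: ByteVector, out: Vector[ByteVector]): (Vector[ByteVector], ByteVector) = {
        val idx = buf.indexOfSlice(separator)              // assumed scodec-bits API; -1 if absent
        if (idx < 0) (out, buf)                            // no separator yet: keep accumulating
        else split(buf.drop(idx + separator.size), out :+ buf.take(idx))
      }
      val (chunks, rest) = split(acc ++ i, Vector.empty)
      emitAll(chunks) ++ go(rest)
    }
  go(ByteVector.empty)
}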

Then this will hook everything up together:

val speech: Process[Task,ByteVector] = ???
def chunkByWhatever: Process1[ByteVector,ByteVector] = ??? 
val recognizer: Channel[Task,ByteVector,String] = ???

//this shall do the trick
speech.pipe(chunkByWhatever).through(recognizer)
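For completeness, the recognizer channel can be lifted from an ordinary Task-returning function via scalaz-stream's channel.lift; the recognize function below is a hypothetical stand-in for the actual network call:

import scalaz.concurrent.Task
import scalaz.stream.{Channel, channel}
import scodec.bits.ByteVector

// Hypothetical: wraps the real voice-recognition network request.
def recognize(chunk: ByteVector): Task[String] = ???

// Lift the Task-returning function into a Channel usable with `through`.
val recognizer: Channel[Task, ByteVector, String] = channel.lift(recognize)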

I guess the answer at this point is that this is really hard or impossible to accomplish in scalaz-stream proper. The new version of this library is called fs2, and it has first-class support for "chunking", which is basically what I was looking for here.
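For reference, a minimal sketch of what this looks like in recent fs2 versions, where chunk boundaries are first-class and .chunks exposes roughly the grouping that chunkByEffect was meant to recover (this mirrors the example from the question, but with a pure stream):

import fs2.Stream

// Chunk boundaries reflect how the stream was assembled, so `.chunks`
// recovers the per-"block" grouping directly.
val s = Stream(1) ++ Stream(2, 3, 4) ++ Stream.emits(List(5, 6, 7))
val grouped: List[Vector[Int]] = s.chunks.map(_.toVector).toList
// grouped == List(Vector(1), Vector(2, 3, 4), Vector(5, 6, 7))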
