How to implement receiveAvailable transducer in scalaz-stream
I would like to implement a function that returns a transducer that waits for a block of values to be "emitted". The function I have in mind would have the following signature:
/**
* The `Process1` which awaits the next "effect" to occur and passes all values emitted by
* this effect to `rcv` to determine the next state.
*/
def receiveBlock[I, O](rcv: Vector[I] => Process1[I,O]): Process1[I,O] = ???
My understanding is that I could then use this function to implement the following function, which I think would be quite useful:
/**
* Groups inputs into chunks of dynamic size based on the various effects
* that back emitted values.
*
* @example {{{
* val numberTask = Task.delay(1)
* val listOfNumbersTask = Task.delay(List(5,6,7))
* val sample = Process.eval(numberTask) ++ Process(2,3,4) ++ Process.await(listOfNumbersTask)(xs => Process.emitAll(xs))
* sample.chunkByEffect.runLog.run should be List(Vector(1), Vector(2,3,4), Vector(5,6,7))
* }}}
*/
def chunkByEffect[I]: Process1[I, Vector[I]] =
  receiveBlock(vec => emit(vec) ++ chunkByEffect)
My ultimate objective (slightly simplified) is to implement the following function:
/**
* Transforms a stream of audio into a stream of text.
*/
def voiceRecognition(audio: Process[Task, Byte]): Process[Task, String]
The function makes an external call to a voice recognition service, so it is unreasonable to make a network call for every single Byte in the stream; I need to chunk bytes together before making a network call. I could make audio a Process[Task, ByteVector], but that would require the testing code to know the maximum chunk size that the function supports; I would rather that be managed by the function itself. Also, when this function is used inside a service, that service will itself be receiving network calls with a given size of audio, and I would like the chunkXXX function to be smart about chunking so that it does not hold onto data that is already available.
Basically, the stream of audio coming from the network will have the form Process[Task, ByteVector] and will be translated into a Process[Task, Byte] by flatMap(Process.emitAll(_)). However, the test code will directly produce a Process[Task, Byte] and feed that into voiceRecognition. In theory, I believe it should be possible, given the appropriate combinator, to provide an implementation of voiceRecognition that does the right thing with both of these streams, and I think the chunkByEffect function described above is the key to that. I realize now that I would need the chunkByEffect function to take min and max parameters specifying the minimum and maximum chunk size, irrespective of the underlying Task producing the bytes.
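For comparison, the closest off-the-shelf combinator I am aware of is scalaz-stream's process1.chunk, which groups by a fixed size and ignores effect boundaries entirely; a minimal sketch of its behavior (assuming a recent scalaz-stream where chunk emits any trailing partial group):

```scala
import scalaz.stream.{Process, process1}

// Fixed-size grouping: each emitted Vector holds at most 2 elements,
// regardless of which effect originally produced the values. This is
// exactly the limitation chunkByEffect is meant to address.
val fixed: List[Vector[Int]] =
  Process(1, 2, 3, 4, 5).pipe(process1.chunk(2)).toList
```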
You need to have your bytes separated somehow. I suggest working with a higher-level abstraction over the stream of Bytes, i.e. ByteVector. Then you will perhaps have to write a manual process1, implemented similarly to process1.chunkBy, only operating on ByteVector, i.e.:
def chunkBy(separator: ByteVector): Process1[ByteVector, ByteVector] = {
  def go(acc: ByteVector): Process1[ByteVector, ByteVector] =
    receive1Or[ByteVector, ByteVector](emit(acc)) { i =>
      // implement searching of separator in accumulated + new bytes
      ???
    }
  go(ByteVector.empty)
}
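As a sketch of what the ??? above would need, the separator search can be written as a pure helper over the buffered bytes (assuming scodec-bits' ByteVector.indexOfSlice, which returns a negative index when the slice is absent; splitOnSeparator is a name I made up here):

```scala
import scodec.bits.ByteVector

// Given the bytes accumulated so far plus new input, yield every
// complete separator-delimited chunk and the leftover bytes to carry
// into the next receive step.
def splitOnSeparator(
    buffered: ByteVector,
    separator: ByteVector
): (Vector[ByteVector], ByteVector) = {
  def go(rest: ByteVector, acc: Vector[ByteVector]): (Vector[ByteVector], ByteVector) = {
    val i = rest.indexOfSlice(separator)
    if (i < 0) (acc, rest) // no separator left: keep as leftover
    else {
      val (chunk, tail) = rest.splitAt(i)
      go(tail.drop(separator.length), acc :+ chunk)
    }
  }
  go(buffered, Vector.empty)
}
```

The receive1Or step would then emit the completed chunks and recurse with the leftover as the new accumulator.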
Then this will hook everything up together:
val speech: Process[Task,ByteVector] = ???
def chunkByWhatever: Process1[ByteVector,ByteVector] = ???
val recognizer: Channel[Task,ByteVector,String] = ???
//this shall do the trick
speech.pipe(chunkByWhatever).through(recognizer)
I guess the answer at this point is that this is really hard or impossible to accomplish in scalaz-stream proper. The new version of this library is called fs2, and it has first-class support for "chunking", which is basically what I was looking for here.
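A rough illustration of that support (a sketch, assuming a recent fs2 where Stream.chunk, chunks, and Pure are available):

```scala
import fs2.{Chunk, Pure, Stream}

// A pure stream assembled from three underlying chunks, mirroring the
// `sample` stream in the chunkByEffect scaladoc above.
val sample: Stream[Pure, Int] =
  Stream.chunk(Chunk(1)) ++
    Stream.chunk(Chunk(2, 3, 4)) ++
    Stream.chunk(Chunk(5, 6, 7))

// `chunks` surfaces the underlying chunk structure directly, which is
// what chunkByEffect was trying to recover in scalaz-stream.
val grouped: List[List[Int]] = sample.chunks.map(_.toList).toList
// → List(List(1), List(2, 3, 4), List(5, 6, 7))
```

fs2 also ships size-bound regrouping combinators (e.g. chunkLimit, and if I recall correctly chunkMin), which cover the min/max requirement from the question.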