Bucketed Sink in scalaz-stream
I am trying to make a sink that writes a stream to bucketed files: when a particular condition (time, size of file, etc.) is reached, the current output stream is closed and a new one is opened to a new bucket file.
I checked how the different sinks were created in the `io` object, but there aren't many examples. So I tried to follow how `resource` and `chunkW` were written. I ended up with the following bit of code, where for simplicity, buckets are just represented by an `Int` for now, but would eventually be some type of output stream.
val buckets: Channel[Task, String, Int] = {
  // recursion to step through the stream
  def go(step: Task[String => Task[Int]]): Process[Task, String => Task[Int]] = {
    // emit the value and repeat
    def next(msg: String => Task[Int]) =
      Process.emit(msg) ++ go(step)
    Process.await[Task, String => Task[Int], String => Task[Int]](step)(
      next,
      Process.halt,  // TODO ???
      Process.halt)  // TODO ???
  }
  // starting bucket
  val acquire: Task[Int] = Task.delay {
    val startBuck = nextBucket(0)
    println(s"opening bucket $startBuck")
    startBuck
  }
  // the write step
  def step(os: Int): Task[String => Task[Int]] =
    Task.now((msg: String) => Task.delay {
      write(os, msg)
      val newBuck = nextBucket(os)
      if (newBuck != os) {
        println(s"closing bucket $os")
        println(s"opening bucket $newBuck")
      }
      newBuck
    })
  // start the Channel
  Process.await(acquire)(
    buck => go(step(buck)),
    Process.halt,
    Process.halt)
}
def write(bucket: Int, msg: String): Unit = println(s"$bucket\t$msg")
def nextBucket(b: Int): Int = b + 1
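As an aside, the state-threading this channel needs can be sketched without scalaz-stream as a plain fold over the messages: each write returns the bucket for the next write, instead of reusing the one captured at the start. `BucketSketch` and `runBuckets` below are hypothetical names for illustration, not part of the question's code:

```scala
// Minimal, library-free sketch of threading the bucket through each write.
// `write` and `nextBucket` mirror the helpers above; `runBuckets` is a
// hypothetical stand-in for the recursive `go`.
object BucketSketch {
  def write(bucket: Int, msg: String): Unit = println(s"$bucket\t$msg")
  def nextBucket(b: Int): Int = b + 1

  // Fold over the messages, carrying the current bucket as the accumulator.
  def runBuckets(msgs: List[String], start: Int): Int =
    msgs.foldLeft(start) { (bucket, msg) =>
      write(bucket, msg)
      nextBucket(bucket) // the next write sees the new bucket
    }
}
```

The fold makes the missing piece explicit: the bucket is part of the accumulator, not a value fixed once before the recursion starts.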
There are a number of issues in this:

`step` is passed the bucket once at the start, and this never changes during the recursion. I am not sure how, in the recursive `go`, to create a new `step` task that will use the bucket (`Int`) from the previous task, as I have to provide a `String` to get to that task.

The `fallback` and `cleanup` arguments of the `await` calls do not receive the result of `rcv` (if there is one). In the `io.resource` function this works fine, as the resource is fixed; in my case, however, the resource might change at any step. How would I pass the reference to the current open bucket to these callbacks?

Well, one of the options (i.e. time) may be to use a simple `go` on the sink. This one is time-based, essentially reopening the file every single hour:
val metronome = Process.awakeEvery(1.hour).map(_ => true)
def writeFileSink(file: String): Sink[Task, ByteVector] = ???

def timeBasedSink(prefix: String) = {
  def go(index: Int): Sink[Task, ByteVector] =
    metronome.wye(writeFileSink(prefix + "_" + index))(wye.interrupt) ++ go(index + 1)
  go(0)
}
For the other options (i.e. bytes written) you can use a similar technique: just keep a signal of the bytes written and combine it with the Sink.