How to iterate over DataStream

I am a newbie to Scala. I have a custom class, Analytics (defined in Analytics.scala), which has a few variables (var a, var b, var c). In my test case I get a DataStream of type Analytics, and I want to set var c to 0 for every object.

I've tried using the map function on the DataStream, but it didn't help. I also tried converting the stream to a list and then iterating over that list, but that didn't work either.

stream is of type DataStream[Analytics]. This is what I have tried:

stream.map(x => x.c=0)
val a = DataStreamUtils.collect(stream.javaStream).asScala.toArray.iterator
a.foreach(x => x.c=0)

The value of var c doesn't change to 0 in my test case.

In general, a Flink DataStream isn't a finite collection you can iterate over once and be done with; it's a potentially unbounded stream that keeps producing more data.

Using a map is the right way to go. But when you apply a map to a stream, as in

stream.map(x => x.c=0)

you are describing a stream transformation, not modifying the stream itself. You should instead try

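val streamWhereCisZero = stream.map { x => x.c = 0; x }

This creates a new stream where every element has c set to zero. Note that the function passed to map has to return the element: an assignment such as x.c = 0 evaluates to Unit, so on its own it would produce a DataStream[Unit] rather than a DataStream[Analytics].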
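For context, here is a minimal sketch of how that transformation might sit inside a complete job. It assumes the Flink Scala DataStream API is on the classpath; the `Analytics` definition, its `Int` field types, the `ResetCJob` object, and the `resetC` helper are hypothetical stand-ins, not the asker's actual code.

```scala
import org.apache.flink.streaming.api.scala._

// Hypothetical stand-in for the asker's Analytics class (field types assumed).
class Analytics(var a: Int, var b: Int, var c: Int) extends Serializable {
  override def toString: String = s"Analytics($a, $b, $c)"
}

object ResetCJob {
  // Sets c to 0 and returns the same element, so map yields Analytics, not Unit.
  def resetC(x: Analytics): Analytics = {
    x.c = 0
    x
  }

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Small bounded input, just for illustration.
    val stream: DataStream[Analytics] =
      env.fromElements(new Analytics(1, 2, 3), new Analytics(4, 5, 6))

    // map describes a new stream; downstream code has to use this new value.
    val streamWhereCisZero: DataStream[Analytics] = stream.map(x => resetC(x))

    streamWhereCisZero.print()
    env.execute("reset-c")
  }
}
```

Any later operator or sink needs to be attached to `streamWhereCisZero`; anything still attached to `stream` would keep seeing the untransformed elements.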

This is how I iterated. I'm not sure whether this is the best solution.

val collection = DataStreamUtils.collect(stream.javaStream)
val results: Seq[Analytics] = collection.asScala.toSeq
for (result <- results){
    result.c=0
}
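If the goal is only to check values in a test, one alternative is to apply the map inside the pipeline and collect the already-transformed stream, rather than mutating the collected objects afterwards. The sketch below is an illustration under assumptions: the import path for `DataStreamUtils` differs between Flink versions (older releases shipped it in flink-contrib), and the `Analytics` definition, `CollectAfterMap`, and `resetC` are hypothetical.

```scala
import org.apache.flink.streaming.api.scala._
// Assumed location; adjust to wherever DataStreamUtils lives in your Flink version.
import org.apache.flink.streaming.api.datastream.DataStreamUtils
import scala.collection.JavaConverters._

// Hypothetical stand-in for the asker's Analytics class (field types assumed).
class Analytics(var a: Int, var b: Int, var c: Int) extends Serializable

object CollectAfterMap {
  // Returns the element after resetting c, so the mapped stream stays typed as Analytics.
  def resetC(x: Analytics): Analytics = {
    x.c = 0
    x
  }

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val stream: DataStream[Analytics] =
      env.fromElements(new Analytics(1, 2, 3), new Analytics(4, 5, 6))

    // Transform first, then collect the transformed stream.
    val mapped: DataStream[Analytics] = stream.map(x => resetC(x))
    val results: Seq[Analytics] =
      DataStreamUtils.collect(mapped.javaStream).asScala.toList

    // Every collected element should already have c == 0.
    results.foreach(r => assert(r.c == 0))
  }
}
```

Collecting only makes sense for a bounded stream such as this test input; the collected objects are local copies, so changing them afterwards has no effect on the stream itself.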
