How to perform an average operation on a DataStream in Apache Flink using Java

I am trying to calculate the average of an input data stream in Flink, without using windows.

I have used a mapper to change the stream from (key, value) to (key, value, 1).

Now I need to sum the 2nd and 3rd fields per key and divide the first sum by the second.

The input data stream comes from a socket connection, with each line in the form 'KEY VALUE', for example 'X 5'.

public class AvgViews {

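    // Fragment from the job setup; dataStream is the socket input.
    DataStream<Tuple2<String, Double>> AvgViewStream = dataStream
            .map(new AvgViews.RowSplitter())
            .keyBy(0)
            //.??? what goes here to compute the average per key?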

    public static class RowSplitter implements
            MapFunction<String, Tuple3<String, Double, Integer>> {

        public Tuple3<String, Double, Integer> map(String row)
                throws Exception {
            String[] fields = row.split(" ");
            if (fields.length == 2) {
                return new Tuple3<String, Double, Integer>(
                        fields[0],
                        Double.parseDouble(fields[1]),
                        1);
            }
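            // Malformed row. Returning null from a MapFunction is likely to
            // fail downstream in Flink; a FlatMapFunction that emits nothing
            // for such rows would be safer.
            return null;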
        }
    }
}

You can use a RichMapFunction (or RichFlatMapFunction) that keeps a Tuple2 of running sum and count in keyed state. For each incoming record, add it to the state and emit the current average as the output.

The CountWindowAverage example in the Flink docs does something similar, though it is a bit more complex.
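To make the sum-and-count idea concrete, here is a minimal sketch in plain Java with no Flink dependency. It is an illustration of the update logic only, not the Flink API: the class name `RunningAverage`, the method `add`, and the sample rows are all hypothetical, and a `HashMap` stands in for the keyed state that a RichMapFunction applied after `keyBy` would hold.

```java
import java.util.HashMap;
import java.util.Map;

// Flink-free sketch (hypothetical names): a HashMap stands in for keyed state.
class RunningAverage {

    // Per-key accumulator: element 0 is the running sum, element 1 the count.
    // In a Flink job this pair would live in keyed state, not in a local map.
    private final Map<String, double[]> sumAndCount = new HashMap<>();

    // Folds one (key, value) record into the state and returns the
    // updated average for that key.
    public double add(String key, double value) {
        double[] acc = sumAndCount.computeIfAbsent(key, k -> new double[2]);
        acc[0] += value; // sum of the 2nd field of (key, value, 1)
        acc[1] += 1;     // sum of the 3rd field of (key, value, 1)
        return acc[0] / acc[1];
    }

    public static void main(String[] args) {
        RunningAverage avg = new RunningAverage();
        // Sample rows in the 'KEY VALUE' format from the question (made up).
        String[] rows = {"X 5", "X 7", "Y 10", "X 9"};
        for (String row : rows) {
            String[] fields = row.split(" ");
            if (fields.length != 2) {
                continue; // skip malformed rows instead of emitting null
            }
            double current = avg.add(fields[0], Double.parseDouble(fields[1]));
            System.out.println(fields[0] + " " + current);
        }
        // Prints:
        // X 5.0
        // X 6.0
        // Y 10.0
        // X 7.0
    }
}
```

Every record produces an output, so the stream of results is a running average per key rather than a single final value, which is what an unwindowed, unbounded stream can offer.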
