Read CSV file with 3 columns into a DataStream (Java, Apache Flink)

I've been struggling for a while to set up a Flink application that creates a DataStream<Tuple3<Integer, java.sql.Time, Double>> from a CSV file. The columns in this file (ID, dateTime and Result) are all Strings, but they should be converted to Integer, java.sql.Time and Double. I also want to create tumbling windows of one day each and average the values of the Result column within each window. The problem is that I don't know the exact syntax for this. See my attempt below. For the last step I have sum(2), but I want the average per window instead. I did not find a function for this in the documentation. Do I need to write a method for it myself?


DataStream<Tuple3<String, java.sql.Time>> dataStream = env
                .readfile(path)
                .map()
                .keyBy(0)
                .timeWindow(Time.days(1));

You can read the CSV with your own logic or with a library such as univocity_parsers. Then, instead of reading the file through env (the readfile call in your code), you can build the stream with env.fromCollection(list).

Here is the link to the library, in case you want it: https://www.univocity.com/pages/univocity_parsers_tutorial#using-annotations-to-map-your-java-beans

You can supply your own converter with the annotation @Convert(conversionClass = YourDateTimeConverter.class).
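If you go the "own logic" route rather than the library, the parsing step might look roughly like the sketch below. It is only an illustration under several assumptions: the Reading and CsvReadings classes are hypothetical names, the dateTime column is assumed to look like "yyyy-MM-dd HH:mm:ss" (the question does not say), and fields are assumed to contain no quotes or embedded commas. It uses LocalDateTime rather than java.sql.Time so that the date survives, since the daily windows need it.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical row type for the three columns described in the question.
final class Reading {
    final int id;
    final LocalDateTime dateTime;
    final double result;

    Reading(int id, LocalDateTime dateTime, double result) {
        this.id = id;
        this.dateTime = dateTime;
        this.result = result;
    }
}

// Hypothetical helper; plain Java, no Flink or CSV library required.
final class CsvReadings {
    // ASSUMPTION: timestamp layout of the dateTime column; adjust to the real file.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Parses one "ID,dateTime,Result" line (no quoting or escaped commas supported).
    static Reading parseLine(String line) {
        String[] cols = line.split(",", -1);
        if (cols.length != 3) {
            throw new IllegalArgumentException("Expected 3 columns: " + line);
        }
        return new Reading(
                Integer.parseInt(cols[0].trim()),
                LocalDateTime.parse(cols[1].trim(), FORMAT),
                Double.parseDouble(cols[2].trim()));
    }

    // Parses all lines, skipping blank lines and, optionally, a header row.
    static List<Reading> parseAll(List<String> lines, boolean hasHeader) {
        List<Reading> out = new ArrayList<>();
        for (int i = hasHeader ? 1 : 0; i < lines.size(); i++) {
            String line = lines.get(i);
            if (line.trim().isEmpty()) {
                continue;
            }
            out.add(parseLine(line));
        }
        return out;
    }
}
```

The input lines could come from java.nio.file.Files.readAllLines, and the resulting list is what you would hand to env.fromCollection(list). Note that loading the whole file into memory first only suits files that fit comfortably on the client.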

For the average, refer to the following Flink documentation, which includes an example:

https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/windows.html#aggregatefunction
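To give an idea of what such an average aggregate involves, here is a minimal sketch of the accumulator logic. Flink's AggregateFunction interface declares four methods (createAccumulator, add, getResult, merge). The sketch mirrors them as static methods on plain Java classes so that it runs without a Flink dependency. AverageAccumulator and AverageLogic are hypothetical names, and returning 0.0 for an empty accumulator is an arbitrary choice.

```java
// Hypothetical accumulator: running sum and count of the Result values in one window.
final class AverageAccumulator {
    long count;
    double sum;
}

// Plain-Java mirror of the four AggregateFunction methods (no Flink import needed).
final class AverageLogic {
    // Starts an empty accumulator for a new window.
    static AverageAccumulator createAccumulator() {
        return new AverageAccumulator();
    }

    // Folds one Result value into the accumulator.
    static AverageAccumulator add(double value, AverageAccumulator acc) {
        acc.sum += value;
        acc.count++;
        return acc;
    }

    // Produces the average; 0.0 for an empty accumulator is an assumption of this sketch.
    static double getResult(AverageAccumulator acc) {
        return acc.count == 0 ? 0.0 : acc.sum / acc.count;
    }

    // Combines two partial accumulators into one.
    static AverageAccumulator merge(AverageAccumulator a, AverageAccumulator b) {
        a.sum += b.sum;
        a.count += b.count;
        return a;
    }
}
```

In an actual job these would presumably become instance methods of a class implementing AggregateFunction<Tuple3<Integer, java.sql.Time, Double>, AverageAccumulator, Double>, with add reading the third tuple field (value.f2), and an instance of that class would be passed to aggregate(...) on the windowed stream in place of sum(2).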
