
Sink flink DataStream using jdbc connector to mysql sink with overwrite

My use case is:

  1. Get data from an AWS Kinesis data stream and filter/map it using the Flink DataStream API
  2. Use the StreamTableEnvironment to group and aggregate the data
  3. Write the result to MySQL via SQL using the JDBC connector

I am able to write my DataStream results into the MySQL table, but because the job is streaming, each new row gets appended, while I want it to overwrite existing rows.

    // Kinesis consumer configuration (Properties, as expected by FlinkKinesisConsumer)
    Properties consumerConfig = new Properties();
    consumerConfig.put(AWSConfigConstants.AWS_REGION, "eu-central-1");
    consumerConfig.put(ConsumerConfigConstants.STREAM_INITIAL_POSITION, "LATEST");


    // Streaming environment with checkpointing enabled (every 5 s),
    // which the Kinesis source needs for exactly-once recovery
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.enableCheckpointing(5000);

    // Blink planner table environment on top of the DataStream environment
    EnvironmentSettings bsSettings = EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build();
    StreamTableEnvironment tEnv = StreamTableEnvironment.create(env, bsSettings);

    // Parse Message
    DataStream<Event> events = env.addSource(
            new FlinkKinesisConsumer<>(
                    Config.INPUT_STREAM,
                    new KinesisEventDeserializationSchema(),
                    consumerConfig
            )
    )
            .uid("kinesisEventSource");
      ....    
      ....
      ....

      SingleOutputStreamOperator<ArticleView> filteredDetailsViewEvents = articleViews
            .filter(new FilterFunction<ArticleView>() {
                @Override
                public boolean filter(ArticleView event) throws Exception {
                    return StringUtils.isNotBlank(event.getArticleNumber());
                }
            })
            .uid("filteredDetailsViewFilter");
    
   
    // Convert the filtered stream into a Table; concatenating the Table
    // into a SQL string below registers it under a generated name
    Table t = tEnv.fromDataStream(filteredDetailsViewEvents);

    // JDBC sink table (legacy 'connector.type' property style); the
    // PRIMARY KEY declaration is what enables upsert mode
    tEnv.executeSql("CREATE TABLE eventsSlider1 (\n" +
            "  articleNumber STRING,\n" +
            "  mandant STRING,\n" +
            "  category STRING,\n" +
            "  cnt BIGINT NOT NULL,\n" +
            "  CONSTRAINT pk_event PRIMARY KEY (articleNumber, mandant, category) NOT ENFORCED\n" +
            ") WITH (\n" +
            "   'connector.type' = 'jdbc',\n" +
            "   'connector.url' = 'jdbc:mysql://localhost:3306/events',\n" +
            "   'connector.table' = 'categorySliderItems',\n" +
            "   'connector.username' = 'root',\n" +
            "   'connector.password' = '123456'\n" +
            ")");

    // Keep only the top 3 articles per (mandant, category) and write them as upserts
    tEnv.executeSql("INSERT INTO eventsSlider1 (SELECT articleNumber, mandant, category, cnt " +
            "FROM (" +
            " SELECT articleNumber, mandant, category, COUNT(articleNumber) AS cnt," +
            " ROW_NUMBER() OVER (PARTITION BY mandant, category ORDER BY COUNT(articleNumber) DESC) AS row_num" +
            " FROM " + t + " GROUP BY articleNumber, category, mandant)" +
            " WHERE row_num <= 3)");

The problem was that I had not set the proper primary key on the table. The primary key is the only thing Flink can check in upsert operations to decide between an update and an insert.
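For the overwrite to work end to end, the MySQL table itself also needs a matching primary key: with the MySQL dialect, Flink's JDBC sink implements upserts as INSERT ... ON DUPLICATE KEY UPDATE, which only replaces rows that collide on a key. A hypothetical DDL for the sink table (column types and sizes are assumptions):

    -- Hypothetical MySQL DDL; the composite primary key matching the
    -- Flink table's PRIMARY KEY declaration is the important part
    CREATE TABLE categorySliderItems (
      articleNumber VARCHAR(64) NOT NULL,
      mandant       VARCHAR(64) NOT NULL,
      category      VARCHAR(64) NOT NULL,
      cnt           BIGINT      NOT NULL,
      PRIMARY KEY (articleNumber, mandant, category)
    );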
