Spark Streaming: How to efficiently save foreachRDD data into a MySQL database?

Question

We are going to build a real-time computation system and also want to save the processed data into a MySQL database. Here is the code:

splitWordInfo.foreachRDD(new Function<JavaRDD<String>, Void>() {
        private static final long serialVersionUID = 1L;

        @Override
        public Void call(JavaRDD<String> rdd) throws Exception {
            rdd.foreachPartition(new VoidFunction<Iterator<String>>() {
                // Default Serial ID
                private static final long serialVersionUID = 1L;
                @Override
                public void call(Iterator<String> eachline) throws Exception {
                    // The column list must match the number of "?" placeholders.
                    String sql = "insert into test_mm(name) values(?)";
                    Connection conn = DriverManager.getConnection("jdbc:mysql://xx.xx.xx.xx:3306/dbname", "user", "pass");
                    PreparedStatement stat = conn.prepareStatement(sql); 
                    while(eachline.hasNext()){
                        stat.setString(1, eachline.next());
                        stat.executeUpdate();
                    }
                    stat.close();
                    conn.close();
                }

            });
            return null;
        }
    });

Will it open/close the MySQL connection for each RDD, or for each partition?

And how can I efficiently save foreachRDD data into a MySQL database? Could anyone help me out?

Answer 1

Each RDD partition is processed as a separate task, so your program gets a connection for each partition (of every RDD). It is a good idea to use a connection pool library such as HikariCP or the Tomcat JDBC pool. But even with a connection pool there is still a cost for communicating with the database, and you cannot avoid that in this model.
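
One way to reduce the number of round trips within a partition is JDBC batching. The following is only a minimal sketch, not code from the question: the class name `PartitionWriter`, the method `writePartition`, the batch size, and the single-column insert (each line going into `name`) are all assumptions made for illustration. It uses only `java.sql`, so the connection can come from `DriverManager` or from whichever pool you pick.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Iterator;

// Hypothetical helper (not part of Spark or of the original post).
final class PartitionWriter {

    // Assumption: every line of the partition is stored in the "name" column.
    static final String SQL = "insert into test_mm(name) values(?)";

    /**
     * Writes all rows of one partition over a single connection.
     * Rows are queued with addBatch() and sent every batchSize rows
     * (batchSize is expected to be positive); the whole partition is
     * committed once at the end. Returns the number of rows queued.
     */
    static int writePartition(Connection conn, Iterator<String> rows, int batchSize)
            throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // one transaction per partition
        int total = 0;
        try (PreparedStatement stat = conn.prepareStatement(SQL)) {
            int pending = 0;
            while (rows.hasNext()) {
                stat.setString(1, rows.next());
                stat.addBatch();
                pending++;
                total++;
                if (pending == batchSize) {
                    stat.executeBatch(); // send a full batch
                    pending = 0;
                }
            }
            if (pending > 0) {
                stat.executeBatch(); // send the remaining rows
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback(); // undo the partially written partition
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit); // leave the connection as we found it
        }
        return total;
    }
}
```

Inside the `call(Iterator<String> eachline)` method of `foreachPartition`, this could be used as `PartitionWriter.writePartition(conn, eachline, 500)`, where `conn` is obtained once per partition and closed (or returned to the pool) afterwards. With MySQL Connector/J it may also be worth checking the `rewriteBatchedStatements=true` URL property, which lets the driver combine batched inserts; whether it helps depends on your driver version and workload.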

0 Accepted 2016-09-23 08:51:12
