
Spark JavaRDD output to database

What is the best way to save the output of a Spark JavaRDD to a database?

Should I write Spark Java code to save the RDD into the database? What would be the drawbacks of this approach?

Or should I use Sqoop to load the output files into the database?

Is there any other way to do this?

Thanks

I used a DataFrame and saved the data into SQL Server:

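import java.util.Properties;
import org.apache.spark.sql.DataFrame;   // Spark 1.x API; in Spark 2.x and later this is Dataset<Row>
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.SaveMode;

// Convert the RDD of WebHttpOutPutVO beans to a DataFrame, then append it to the table over JDBC.
SQLContext sqlContext = new SQLContext(context);
DataFrame outDataFrame = sqlContext.createDataFrame(finalOutPutRDD, WebHttpOutPutVO.class);
Properties prop = new Properties();
prop.setProperty("database", "Web_Session");
prop.setProperty("user", "user");
prop.setProperty("password", "pwd@123");
prop.setProperty("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver");
outDataFrame.write().mode(SaveMode.Append).jdbc("jdbc:sqlserver://<Host>:1433", "test_table", prop);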

There are two approaches you can use to write your results back to the database.

  1. Use something like DBOutputFormat and configure it.

  2. Use foreachPartition on the RDD you want to save and pass in a function that opens a connection to MySQL and writes the results back.
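
As a rough sketch of the second approach, the function passed to foreachPartition could open one JDBC connection per partition and hand the partition's iterator to a helper like the one below. PartitionWriter, writePartition, the String[] row type, and the batch size are illustrative assumptions, not part of the original answer; the commented wiring at the bottom assumes a JavaRDD<String[]> named rdd and placeholder url, user, and password values.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Iterator;

// Hypothetical helper: writes the rows of one partition over a single connection.
final class PartitionWriter {

    // Binds each row's values to the INSERT statement as strings, sends them
    // in batches of batchSize, and returns the number of rows sent.
    static int writePartition(Iterator<String[]> rows, Connection conn,
                              String insertSql, int batchSize) throws SQLException {
        int written = 0;
        int pending = 0;
        try (PreparedStatement stmt = conn.prepareStatement(insertSql)) {
            while (rows.hasNext()) {
                String[] row = rows.next();
                for (int i = 0; i < row.length; i++) {
                    stmt.setString(i + 1, row[i]); // JDBC parameters are 1-based
                }
                stmt.addBatch();
                pending++;
                if (pending == batchSize) {
                    stmt.executeBatch();
                    written += pending;
                    pending = 0;
                }
            }
            if (pending > 0) { // flush the last, partially filled batch
                stmt.executeBatch();
                written += pending;
            }
        }
        return written;
    }
}

// Possible wiring in the Spark job (assumed names; needs java.sql.DriverManager):
//
// rdd.foreachPartition(rows -> {
//     try (Connection conn = DriverManager.getConnection(url, user, password)) {
//         PartitionWriter.writePartition(rows, conn,
//             "INSERT INTO test_table (col_a, col_b) VALUES (?, ?)", 500);
//     }
// });
```

The connection is opened inside the function because JDBC connections are not serializable and have to be created on the executors. Using foreachPartition rather than foreach means one connection per partition instead of one per record, and batching cuts down the number of round trips to the database.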
