Writing DataFrame to MemSQL Table in Spark
I'm trying to load a .parquet file into a MemSQL database with Spark and the MemSQL Connector.
package com.memsql.spark

import com.memsql.spark.context._
import org.apache.spark._
import org.apache.spark.sql._
import org.apache.spark.sql.types._
import com.memsql.spark.connector._
import com.mysql.jdbc._

object readParquet {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("ReadParquet")
    val sc = new SparkContext(conf)
    sc.addJar("/data/applications/spark-1.5.1-bin-hadoop2.6/lib/mysql-connector-java-5.1.37-bin.jar")
    sc.addJar("/data/applications/spark-1.5.1-bin-hadoop2.6/lib/memsql-connector_2.10-1.1.0.jar")
    Class.forName("com.mysql.jdbc.Driver")

    val host = "xxxx"
    val port = 3306
    val dbName = "WP1"
    val user = "root"
    val password = ""
    val tableName = "rt_acc"

    val memsqlContext = new com.memsql.spark.context.MemSQLContext(sc, host, port, user, password)
    val rt_acc = memsqlContext.read.parquet("tachyon://localhost:19998/rt_acc.parquet")
    val func_rt_acc = new com.memsql.spark.connector.DataFrameFunctions(rt_acc)
    func_rt_acc.saveToMemSQL(dbName, tableName, host, port, user, password)
  }
}
I'm fairly certain that Tachyon is not causing the problem, as the same exceptions occur when the file is loaded from disk, and I can run SQL queries on the DataFrame. I've seen people suggest df.saveToMemSQL(..), but it seems this method is in DataFrameFunctions now.
Also, the table doesn't exist yet, but saveToMemSQL should do a CREATE TABLE, as far as I can tell from the documentation and source code.
Edit: OK, I guess I misread something. saveToMemSQL doesn't create the table. Thanks.
Try using createMemSQLTableAs instead of saveToMemSQL. saveToMemSQL loads a DataFrame into an existing table, whereas createMemSQLTableAs creates the table and then loads it. It also returns a handy DataFrame wrapping that MemSQL table :).
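To make the suggestion concrete, here is a minimal sketch of the question's program with the final call swapped. It assumes that createMemSQLTableAs accepts the same leading arguments as the saveToMemSQL call in the question (database, table, host, port, user, password); that argument list is an assumption, so check it against the connector version you are using. The object name WriteParquetToMemSQL and the final show call are only there for illustration, and running it needs a reachable MemSQL cluster plus the connector jars on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import com.memsql.spark.context.MemSQLContext
import com.memsql.spark.connector.DataFrameFunctions

// Hypothetical object name; connection values are the placeholders from the question.
object WriteParquetToMemSQL {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ReadParquet"))

    val host = "xxxx"
    val port = 3306
    val dbName = "WP1"
    val user = "root"
    val password = ""
    val tableName = "rt_acc"

    val memsqlContext = new MemSQLContext(sc, host, port, user, password)
    val rtAcc = memsqlContext.read.parquet("tachyon://localhost:19998/rt_acc.parquet")

    // Assumption: same leading arguments as saveToMemSQL in the question.
    // Per the answer, this creates the table, loads the data, and returns
    // a DataFrame backed by the new MemSQL table.
    val stored = new DataFrameFunctions(rtAcc)
      .createMemSQLTableAs(dbName, tableName, host, port, user, password)

    // Quick sanity check that rows are readable from the MemSQL-backed DataFrame.
    stored.show(5)
  }
}
```

If the table already exists from an earlier run, saveToMemSQL as written in the question is the call to use instead, since it loads into an existing table.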