I need to write a method that takes a list of column names and a list of JDBC column types and returns a StructType, which will be used to create a DataFrame.
I know I can write a method with a bunch of case statements converting each JDBC column type to the appropriate DataType (StringType, IntegerType, etc.), but I'm wondering if such a method already exists.
There's a DataType.fromJson method, but I don't know or understand the structure of the JSON I need to pass to it.
Example input:
List of column names: UserName, Age, Salary
List of column types: java.lang.String, java.lang.Long, java.lang.Double
If you have access to a JDBC source with a table having the given schema, you can simply copy the schema from there:
val jdbcOptions: Map[String, String] = ???
val jdbcSchema = sqlContext.load("jdbc", jdbcOptions).schema
The JSON representation is quite simple. Each StructField is represented as a document with four fields: metadata, name, nullable and type.
{"metadata":{},"name":"f","nullable":true,"type":"string"}
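A complete schema is a document with type "struct" whose fields entry is an array of such field documents. For the three example columns it would look like this (nullability is up to you):

```json
{"type":"struct","fields":[
  {"metadata":{},"name":"UserName","nullable":true,"type":"string"},
  {"metadata":{},"name":"Age","nullable":true,"type":"long"},
  {"metadata":{},"name":"Salary","nullable":true,"type":"double"}
]}
```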
For most applications you can ignore metadata and focus on the remaining three. The tricky part is mapping from a Java class to type, but a naive solution can look like this:
import net.liftweb.json.JsonDSL._
import net.liftweb.json.{compact, render}
import org.apache.spark.sql.types.{DataType, StructType}

// Strip the package prefix and lowercase: java.lang.String -> "string"
val columns = Seq(
  ("UserName", "java.lang.String"),
  ("Age", "java.lang.Long"),
  ("Salary", "java.lang.Double")
).map { case (n, t) => (n, t.split("\\.").last.toLowerCase) }

// Build one {"metadata":..., "name":..., "nullable":..., "type":...}
// document per column
val fields = columns.map { case (n, t) =>
  ("metadata" -> Map.empty[String, String]) ~
  ("name" -> n) ~
  ("nullable" -> false) ~
  ("type" -> t)
}

val schemaJSON = compact(render(("fields" -> fields) ~ ("type" -> "struct")))
val schema = DataType.fromJson(schemaJSON).asInstanceOf[StructType]
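The split/lowercase trick happens to work for String, Long and Double, but an explicit match is safer once types like java.sql.Timestamp enter the picture. A minimal sketch (the supported set and the decimal precision here are assumptions, not a complete JDBC mapping):

```scala
// Hypothetical explicit mapping from Java class names to Spark SQL
// type-name strings, usable as the "type" field in the JSON above.
def sparkTypeName(javaClass: String): String = javaClass match {
  case "java.lang.String"     => "string"
  case "java.lang.Integer"    => "integer"
  case "java.lang.Long"       => "long"
  case "java.lang.Double"     => "double"
  case "java.lang.Float"      => "float"
  case "java.lang.Boolean"    => "boolean"
  case "java.sql.Timestamp"   => "timestamp"
  case "java.sql.Date"        => "date"
  case "java.math.BigDecimal" => "decimal(38,18)" // assumed precision/scale
  case other => sys.error(s"Unsupported column class: $other")
}
```

Failing fast on an unknown class is deliberate: silently lowercasing an unrecognized name would produce JSON that DataType.fromJson rejects with a less helpful error.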