How to parse YAML with Spark/Scala
I have a YAML file with the following contents.
File name: config.yml
- firstName: "James"
  lastName: "Bond"
  age: 30
- firstName: "Super"
  lastName: "Man"
  age: 25
From this I need to get a Spark DataFrame, using Spark with Scala:
+---+---------+--------+
|age|firstName|lastName|
+---+---------+--------+
|30 |James |Bond |
|25 |Super |Man |
+---+---------+--------+
I have tried converting the YAML to JSON and then to a DataFrame, but I am not able to pass it in as a Dataset sequence.
One solution is to convert your YAML to JSON and then read the JSON as a DataFrame.
You need to add two dependencies (jackson-databind and jackson-dataformat-yaml) and import these classes:
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.dataformat.yaml.YAMLFactory
import org.apache.spark.sql.SparkSession

class ScalaYamltoDataFrame {
  // Sample YAML; nested keys must be indented so they align under "firstName".
  val yamlExample = "- firstName: \"James\"\n  lastName: \"Bond\"\n  age: 30\n\n- firstName: \"Super\"\n  lastName: \"Man\"\n  age: 25"

  // Parse the YAML into plain Java objects, then serialize them as JSON.
  def convertYamlToJson(yaml: String): String = {
    val yamlReader = new ObjectMapper(new YAMLFactory)
    val obj = yamlReader.readValue(yaml, classOf[Any])
    val jsonWriter = new ObjectMapper
    jsonWriter.writeValueAsString(obj)
  }

  println(convertYamlToJson(yamlExample))

  def yamlToDF(): Unit = {
    val sparkSession = SparkSession.builder
      .master("local")
      .appName("Convert Yaml to Dataframe")
      .getOrCreate()
    import sparkSession.implicits._

    // Wrap the JSON string in a Dataset[String] so that read.json can parse it.
    val ds = sparkSession.read
      .option("multiline", true)
      .json(Seq(convertYamlToJson(yamlExample)).toDS)
    ds.show(false)
    ds.printSchema()
  }
}
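For reference, the two Jackson dependencies could be declared in sbt roughly as follows. This is only a sketch: the version number is an assumption and should be changed to match the Jackson version bundled with your Spark distribution, since mixing Jackson versions on the classpath tends to cause runtime errors.

```scala
// build.sbt fragment (sketch).
// ASSUMPTION: 2.15.2 is an illustrative version; use the Jackson version your Spark build ships with.
val jacksonVersion = "2.15.2"

libraryDependencies ++= Seq(
  "com.fasterxml.jackson.core"       % "jackson-databind"        % jacksonVersion,
  "com.fasterxml.jackson.dataformat" % "jackson-dataformat-yaml" % jacksonVersion
)
```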
//println(convertYamlToJson(yamlExample))
[{"firstName":"James","lastName":"Bond","age":30},{"firstName":"Super","lastName":"Man","age":25}]
//ds.show(false)
+---+---------+--------+
|age|firstName|lastName|
+---+---------+--------+
|30 |James |Bond |
|25 |Super |Man |
+---+---------+--------+
//ds.printSchema()
root
|-- age: long (nullable = true)
|-- firstName: string (nullable = true)
|-- lastName: string (nullable = true)
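The example above uses an inline string, while the question starts from a file (config.yml). A minimal sketch of the same approach reading the YAML from disk might look like this. The object name `YamlFileToDataFrame` and the helpers `yamlToJson` and `readYaml` are hypothetical names chosen for illustration, and the sketch assumes the file is small and sits on the driver's local filesystem (not HDFS or S3).

```scala
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.dataformat.yaml.YAMLFactory
import org.apache.spark.sql.{DataFrame, SparkSession}
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

// Hypothetical helper object; the names are illustrative, not from any library.
object YamlFileToDataFrame {

  // Parse YAML text into a Jackson tree and re-serialize it as a JSON string.
  def yamlToJson(yaml: String): String = {
    val tree = new ObjectMapper(new YAMLFactory).readTree(yaml)
    new ObjectMapper().writeValueAsString(tree)
  }

  // Read a YAML file from the driver's local filesystem (assumption) and
  // hand the converted JSON to Spark as a one-element Dataset[String].
  def readYaml(spark: SparkSession, path: String): DataFrame = {
    import spark.implicits._
    val yaml = new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8)
    spark.read.json(Seq(yamlToJson(yaml)).toDS)
  }
}
```

It could then be called as `YamlFileToDataFrame.readYaml(sparkSession, "config.yml")`. Because the whole file is read into driver memory, this only suits small configuration-style files.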
Hope this helps!