
Read multiple JSON schemas with Spark

Software configuration:

Hadoop distribution: Amazon 2.8.3
Applications: Hive 2.3.2, Pig 0.17.0, Hue 4.1.0, Spark 2.3.0

I tried to read JSON files that have multiple schemas:

val df = spark.read.option("mergeSchema", "true").json("s3a://s3bucket/2018/01/01/*")

This throws an error:

org.apache.spark.sql.AnalysisException: Unable to infer schema for JSON. It must be specified manually.;
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$9.apply(DataSource.scala:207)
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$9.apply(DataSource.scala:207)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.sql.execution.datasources.DataSource.getOrInferFileFormatSchema(DataSource.scala:206)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:392)
  at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
  at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:397)
  at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:340)

How can I read JSON with multiple schemas in Spark?

This sometimes happens when you point to the wrong path, that is, when no data exists at that location.
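One way to rule this out is to check what the glob actually matches before handing it to the reader. The following is only a minimal sketch, not a definitive fix: the object name `ReadJsonChecked` and the helpers `globHasData` and `readJson` are hypothetical names made up for illustration, and it assumes the Hadoop `FileSystem` for the path's scheme (here `s3a`) is configured on the cluster.

```scala
import org.apache.hadoop.fs.Path
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper object, for illustration only.
object ReadJsonChecked {

  // True if the glob matches at least one directory or non-empty file.
  // A matched directory may still be empty, so this is a sanity check,
  // not a guarantee that schema inference will succeed.
  def globHasData(spark: SparkSession, pattern: String): Boolean = {
    val path = new Path(pattern)
    val fs = path.getFileSystem(spark.sparkContext.hadoopConfiguration)
    // globStatus can return null when nothing matches, so wrap it in Option.
    Option(fs.globStatus(path))
      .exists(_.exists(s => s.isDirectory || s.getLen > 0))
  }

  // Reads the JSON only when the check passes; otherwise returns None
  // instead of letting Spark fail during schema inference.
  def readJson(spark: SparkSession, pattern: String): Option[DataFrame] =
    if (globHasData(spark, pattern)) Some(spark.read.json(pattern)) else None
}
```

With the path from the question, `ReadJsonChecked.readJson(spark, "s3a://s3bucket/2018/01/01/*")` would return `None` if nothing usable is found under that prefix. Note also that, as far as I know, `mergeSchema` is documented as a Parquet option; the JSON reader already infers one schema across all the records it reads, so the option is probably not needed here.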
