Adding JSON data to a multiline string in Scala for processing with Spark

Question

I am trying to use some parameters that are stored as a single multiline JSON object in a JSON file on S3. However, because I ran into several issues reading and parsing JSON in Spark (honestly, it's a pain...), I tried using Jackson to convert a hardcoded multiline JSON string to a map, as follows.

Following is my JSON, hardcoded as a multiline string:

    val jsonString =
    """
        {
          myJSON
        }
    """

I used Jackson to decode it:

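    import com.fasterxml.jackson.databind.ObjectMapper
    import com.fasterxml.jackson.module.scala.DefaultScalaModule

    // register the Scala module so Jackson can build Scala collections
    val mapper = new ObjectMapper
    mapper.registerModule(DefaultScalaModule)
    mapper.readValue(jsonString, classOf[Map[String, String]])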

Now I can use a map very easily. The whole code base already works with maps, so this method seems preferable to me.

So I wanted to know whether there is a way to create a multiline string from a JSON file in Spark/Scala. I will be fetching my JSON file from S3.

Answer 1

If you are not bound to Jackson, you can do it more easily and faster with jsoniter_scala. Add the dependencies to your build script, then import and use them like this:

// import required packages
import java.io._
import com.github.plokhotnyuk.jsoniter_scala.macros._
import com.github.plokhotnyuk.jsoniter_scala.core._

// create JSON codec for your map
val codec = JsonCodecMaker.make[Map[String, String]](CodecMakerConfig())

// then read JSON file using it
val map = {
  val in: InputStream = // <- here can be any input stream implementation, no buffering required 
    new FileInputStream("/tmp/input.json")
  try JsonReader.read(codec, in)
  finally in.close()
}
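
If you would rather keep the Jackson code from the question unchanged, one option is to load the file content into a String first. Below is a minimal sketch using only the JDK, assuming the file is reachable as a local path and is UTF-8 encoded; `JsonFileReader` and `readFileAsString` are hypothetical names used for illustration, not part of any library:

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

// Hypothetical helper (not from the original post).
object JsonFileReader {
  // Reads the whole file into one String, keeping line breaks as they are,
  // so the result looks the same as a hardcoded multiline string.
  // Assumes the file is UTF-8 encoded and small enough to fit in memory.
  def readFileAsString(path: String): String =
    new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8)
}
```

The returned string can then be passed to `mapper.readValue` in place of the hardcoded `jsonString`. For a file that stays on S3, Spark's `SparkContext.wholeTextFiles` returns each file's full content as a single string and may be a better fit, provided the S3 connector is configured for your cluster.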
