How to read a list of strings from a JSON file using Scala

val df = spark.read.option("multiline", "true").json("/FileStore/tables/config-5.json")
    
df.show()

Output:

+--------------+-------------------+
|      List-col|            Matrics|
+--------------+-------------------+
|[number, word]|ApproxCountDistinct|
|[number, word]|       Completeness|
+--------------+-------------------+

Code:

for (row <- df.rdd.collect) {   
    var List_col =(row(0))
    var Matricsdynamic = row(1)
    List_col.foreach(c =>print(c) )

    //MatricsCal.ApproxCountDistinct_func(listofStr)
}     

List-col is supposed to be a list of strings, but I am getting WrappedArray(number, word)WrappedArray(number, word). I need a List[String].

I assume you need to get the second element from List-col. If so, you can get it like this:

import scala.collection.mutable
import spark.implicits._
val df = Seq(
  (List("24", "text1"), "metric1"),
  (List("12", "text2"), "metric2"),
  (List("53", "text2"), "metric3"),
  (List("13", "text3"), "metric4"),
  (List("64", "text4"), "metric5")
).toDF("List-col", "Matrics")
val result: Array[String] = df.map{
  row =>
    row.get(0) match {
      case t:mutable.WrappedArray[AnyRef] => t.last.toString
    }
}.collect()
println(result.mkString("Array(", ", ", ")")) // Array(text1, text2, text2, text3, text4)
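
As a variation on the snippet above, Row.getSeq reads the array column as a Seq without pattern matching on the concrete runtime class. This is only a minimal sketch: the object name LastElementSketch, the helper lastOfEach, and the local SparkSession settings are assumptions made for illustration and are not part of the original answer.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object LastElementSketch {
  // Hypothetical helper: returns the last element of the array stored in
  // column 0 of every row. Note that .last throws on an empty array.
  def lastOfEach(df: DataFrame): Array[String] = {
    import df.sparkSession.implicits._ // provides the Encoder[String] needed by map
    df.map(row => row.getSeq[String](0).last).collect()
  }

  def main(args: Array[String]): Unit = {
    // Assumed local session, for demonstration only
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("last-element-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(
      (List("24", "text1"), "metric1"),
      (List("12", "text2"), "metric2")
    ).toDF("List-col", "Matrics")

    println(lastOfEach(df).mkString("Array(", ", ", ")"))
    spark.stop()
  }
}
```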

You should be able to convert it easily to a List of String using the toList method of WrappedArray.
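
The toList conversion can be tried without Spark, since wrapping a plain Array in a Seq produces the same kind of wrapper on Scala 2.12. The sketch below is illustrative only: ToListSketch and asStringList are hypothetical names, and the parameter is typed as scala.collection.Seq so the helper does not depend on the exact wrapper class.

```scala
object ToListSketch {
  // Hypothetical helper: accepts any Seq of strings (including the
  // WrappedArray that Spark returns on Scala 2.12) and copies it into a List.
  def asStringList(values: scala.collection.Seq[String]): List[String] =
    values.toList

  def main(args: Array[String]): Unit = {
    // The Array is implicitly wrapped in a Seq here
    val wrapped: scala.collection.Seq[String] = Array("number", "word")
    println(asStringList(wrapped)) // List(number, word)
  }
}
```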

Assuming your JSON file looks something like:

{"List-col": [9, "word1"], "Matrics": "ApproxCountDistinct"}
{"List-col": [10, "word2"], "Matrics": "Completeness"}

You can get back an array of records, each record being a List[String].

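import org.apache.spark.sql._
import org.apache.spark.sql.functions.col
import scala.collection.mutable.WrappedArray

// Collect the List-col column to the driver and convert each WrappedArray to a List[String]
val lists: Array[List[String]] = df.select(col("List-col")).collect.map(
  (r: Row) => r.getAs[WrappedArray[String]](0).toList)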
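
If each list has to stay paired with its metric name, as the loop in the question suggests, both columns can be collected together. The following is a sketch under stated assumptions: CollectListsSketch and listsWithMetric are hypothetical names, the sample rows are built in memory from the values shown in the question's output rather than read from the original file, and Row.getSeq is used in place of the WrappedArray cast.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object CollectListsSketch {
  // Hypothetical helper: one (columns, metric) pair per row, collected to the driver
  def listsWithMetric(df: DataFrame): Array[(List[String], String)] =
    df.select("List-col", "Matrics")
      .collect()
      .map(r => (r.getSeq[String](0).toList, r.getString(1)))

  def main(args: Array[String]): Unit = {
    // Assumed local session, for demonstration only
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("collect-lists-sketch")
      .getOrCreate()
    import spark.implicits._

    // In-memory stand-in for the JSON file, one JSON object per element
    val json = Seq(
      """{"List-col": ["number", "word"], "Matrics": "ApproxCountDistinct"}""",
      """{"List-col": ["number", "word"], "Matrics": "Completeness"}"""
    ).toDS()
    val df = spark.read.json(json)

    listsWithMetric(df).foreach { case (cols, metric) =>
      println(s"$metric -> $cols")
    }
    spark.stop()
  }
}
```

Each pair can then be passed to whichever metric function matches the metric name.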
