
How to split the values in a map in Scala?

I have a map whose values come from several different columns in the database. The column values are joined with underscores. For example,

val newMap = Map("A" -> "23_null_12_09asfA")

Here, 23 comes from column A, null from column B, and so on. Now, consider a map that has 20 values. How can I split each of these values into an array, or split them and store the parts?

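// In spark-shell, sc and the toDF implicits are already in scope;
// in a standalone application, add: import spark.implicits._
val baseRDD = sc.parallelize(List(("john", "1_abc_2"), ("jack", "3_xyz_4")))
// Split each value on "_"; every row becomes an Array[String]
val sRDD = baseRDD.map(x => x._2.split("_"))
val resultDF = sRDD.toDF
resultDF.show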

|[1, abc, 2]|
|[3, xyz, 4]|
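
If Spark is not needed, the same split can be done on an ordinary Scala Map. The following is only a minimal sketch: the object name SplitMapValues and the helper splitValues are hypothetical, and the limit of -1 is an assumption that trailing empty fields should be kept (a plain split("_") drops them).

```scala
object SplitMapValues {
  // Split every value on "_" and keep its key.
  // The -1 limit keeps trailing empty fields, e.g. "1_2_" gives three fields.
  def splitValues(m: Map[String, String]): Map[String, Array[String]] =
    m.map { case (key, value) => key -> value.split("_", -1) }

  def main(args: Array[String]): Unit = {
    val newMap = Map("A" -> "23_null_12_09asfA", "B" -> "24_null_13_09asfB")
    // Print each key together with the fields extracted from its value
    splitValues(newMap).foreach { case (key, fields) =>
      println(s"$key -> ${fields.mkString("[", ", ", "]")}")
    }
  }
}
```

Each array can then be indexed by position, so fields(0) would hold the part that came from column A.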

As I understand your question and explanation, the following could be a solution:

import scala.collection.immutable.HashMap

val newMap: HashMap[String, String] = HashMap(("A", "23_null_12_09asfA"),
  ("B", "24_null_13_09asfB"),
  ("C", "25_null_14_09asfC"),
  ("D", "25_null_14_09asfC"),
  ("E", "25_null_14_09asfC"),
  ("F", "25_null_14_09asfC"),
  ("G", "25_null_14_09asfC"))

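import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Four distinct column names, one per underscore-separated field
val schema = StructType(Array(StructField("col1", StringType, true),
  StructField("col2", StringType, true),
  StructField("col3", StringType, true),
  StructField("col4", StringType, true)))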

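import org.apache.spark.sql.Row

// One Row per map value; toSeq makes the Array-to-Seq conversion explicit
val rdd = sparkContext.parallelize(newMap.map(entry => Row.fromSeq(entry._2.split("_").toSeq)).toSeq)

sqlContext.createDataFrame(rdd, schema).show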
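
Note that after the split, "null" is just the four-character string, not a missing value. If those fields should become real nulls, something along these lines might work. It is a sketch only: NullTokens and toOptions are hypothetical names, and it assumes the literal text "null" always marks a missing column value.

```scala
object NullTokens {
  // Split on "_" and turn the literal token "null" into None.
  // Assumption: "null" never occurs as a legitimate column value.
  def toOptions(value: String): Vector[Option[String]] =
    value.split("_", -1).toVector.map {
      case "null" => None
      case field  => Some(field)
    }
}
```

Inside the Spark code, Row.fromSeq(NullTokens.toOptions(v).map(_.orNull)) would then build a row whose missing fields are actual nulls, which the nullable StringType columns accept.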

I hope it is helpful

