How to convert array elements to Map[String,String] in Spark 2.2 with Java
I am using Spark 2.2 and Java 1.8.
Sample XML format:
<?xml version="1.0" encoding="UTF-8" ?>
<!-- Generated by Oracle DVM Editor version 1.0 at [9/6/11 5:14 PM].
-->
<dvm name="CIHSubscriptionTypeMapping" xmlns="http://xmlns.oracle.com/dvm">
<description>
</description>
<columns>
<column name="SSPMW"/>
<column name="CIH"/>
</columns>
<rows>
<row>
<cell>ute.recordClass</cell>
<cell>sku_type</cell>
</row>
<row>
<cell>ute.name.en</cell>
<cell>name_en</cell>
</row>
</rows>
</dvm>
Reading the XML file in Spark with the Java API:
Dataset<Row> xmlDF = spark.read()
    .format("com.databricks.spark.xml")
    .option("rowTag", "row")
    .load("sample.xml");
xmlDF.printSchema();
root
|-- cell: array (nullable = true)
| |-- element: string (containsNull = true)
xmlDF.show(false);
cell
================
[ute.recordClass, sku_type]
[ute.name.en, name_en]
I want to convert the "cell" column above into a lookup Map<String, String>, which I will later broadcast.
Example: (ute.sku.price, list_price)......
Can someone help with this? Thanks.
You can use the built-in map function as follows:
import org.apache.spark.sql.*;
xmlDF.select(functions.map(functions.col("cell").getItem(0), functions.col("cell").getItem(1)).as("cell")).show(false);
which should give you
+-----------------------------+
|cell |
+-----------------------------+
|[ute.recordClass -> sku_type]|
|[ute.name.en -> name_en] |
+-----------------------------+
root
|-- cell: map (nullable = false)
| |-- key: string
| |-- value: string (valueContainsNull = true)
Update
You commented that you want the result as a plain Java map. For that, you can use the following approach:
import java.util.HashMap;
import java.util.List;
import java.util.Map;
// Collect the key/value pairs to the driver, then fill a plain Java map.
List<Row> rows = xmlDF.select(functions.col("cell").getItem(0).as("key"), functions.col("cell").getItem(1).as("value")).collectAsList();
Map<String, String> hashMap = new HashMap<>();
for (Row row : rows) {
    hashMap.put(row.getString(0), row.getString(1));
}
hashMap is a plain Java map.
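The loop above assumes every row has exactly two non-null cells and that keys are unique. If the real DVM files might contain short rows or repeated keys, a more defensive version could look like the sketch below. It is plain Java with no Spark dependency, so the cell arrays are passed in as lists (with Spark, something like row.getList(0) on the "cell" column could supply them). The names CellLookup and toLookup are hypothetical, and keeping the first value for a duplicate key is an assumed policy, not something the question specifies.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CellLookup {
    // Builds a key -> value lookup from rows of cells; each row is expected to be [key, value].
    static Map<String, String> toLookup(List<List<String>> rows) {
        Map<String, String> lookup = new LinkedHashMap<>();
        for (List<String> cells : rows) {
            // Skip rows that cannot form a pair: null row, fewer than two cells, or null key.
            if (cells == null || cells.size() < 2 || cells.get(0) == null) {
                continue;
            }
            // Assumed policy: the first value seen for a key wins; later duplicates are ignored.
            lookup.putIfAbsent(cells.get(0), cells.get(1));
        }
        return lookup;
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
            Arrays.asList("ute.recordClass", "sku_type"),
            Arrays.asList("ute.name.en", "name_en"),
            Arrays.asList("orphan.key"),               // malformed: only one cell
            Arrays.asList("ute.name.en", "ignored"));  // duplicate key
        // prints {ute.recordClass=sku_type, ute.name.en=name_en}
        System.out.println(toLookup(rows));
    }
}
```

A LinkedHashMap is used here only so the entries keep the order of the XML rows; a HashMap works equally well if order does not matter.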
I hope the answer is helpful.