
How to convert json array&lt;String&gt; to csv in spark sql

I have tried this query to get the required experience from LinkedIn data.

Click here to get my data

Dataset<Row> filteredData = spark
        .sql("select full_name, experience from "
                + "(select *, explode(experience['title']) exp from tempTable) a "
                + "where lower(exp) like '%developer%'");

But I got this error:

Click here for the error I got

Finally I tried this, but I got more rows with the same name.

Dataset<Row> filteredData = spark
        .sql("select full_name, explode(experience) from "
                + "(select *, explode(experience['title']) exp from tempTable) a "
                + "where lower(exp) like '%developer%'");

Please give me a hint on how to convert an array of strings into a comma-separated string in the same column.

You can apply a UDF to make a comma-separated string.

Create a UDF like this:

import scala.collection.mutable.WrappedArray

def mkString(value: WrappedArray[String]): String = value.mkString(",")

Register the UDF in the SparkSQL context:

sqlContext.udf.register("mkstring", mkString _)

Apply it in a SparkSQL query:

sqlContext.sql("select mkstring(columnName) from tableName")

It will return a comma-separated string of the array values.
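
Putting the steps together, here is a minimal end-to-end sketch in Scala. It assumes Spark 2.x and a temp view named tempTable with full_name and experience columns whose titles are reachable as experience['title'], as in the question; the object name, the appName/master settings, and the titles alias are illustrative and not part of the original post.

import scala.collection.mutable.WrappedArray
import org.apache.spark.sql.SparkSession

object MkStringExample {
  def main(args: Array[String]): Unit = {
    // Assumes tempTable has already been registered as a temp view elsewhere
    val spark = SparkSession.builder().appName("mkstring-example").master("local[*]").getOrCreate()

    // Join an array<string> into one comma-separated string
    spark.udf.register("mkstring", (titles: WrappedArray[String]) => titles.mkString(","))

    // One row per person, with the titles collapsed into a single comma-separated
    // column instead of being exploded into extra rows
    val filteredData = spark.sql(
      "select full_name, mkstring(experience['title']) as titles from tempTable " +
        "where lower(mkstring(experience['title'])) like '%developer%'")

    filteredData.show(false)
  }
}

In Spark 2.x, registering through spark.udf is equivalent to the sqlContext.udf.register call above, since the SparkSession and its SQLContext share the same UDF registry.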
