
Spark: rename DataFrame schema columns from dot to underscore

I have a DataFrame whose column names contain dots. Example from df.printSchema:

user.id_number
user.name.last
user.phone.mobile

and so on. I want to rename the columns in the schema by replacing each dot with an underscore:

user_id_number
user_name_last
user_phone_mobile

Note: the input data for this DataFrame is in JSON format (non-relational data, e.g. from a NoSQL store).

Use .map or .withColumnRenamed to replace . with _

1. Using toDF with .map:

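import spark.implicits._ // needed for Seq(...).toDF outside spark-shell; assumes a SparkSession named spark

val df = Seq(("1", "2", "3")).toDF("user.id_number", "user.name.last", "user.phone.mobile")

// replace treats "." as a literal character
df.toDF(df.columns.map(x => x.replace(".", "_")): _*).show()

// replaceAll takes a regex, so the dot must be escaped
df.toDF(df.columns.map(x => x.replaceAll("\\.", "_")): _*).show()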
//+--------------+--------------+-----------------+
//|user_id_number|user_name_last|user_phone_mobile|
//+--------------+--------------+-----------------+
//|             1|             2|                3|
//+--------------+--------------+-----------------+
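
The difference between the two calls above is easy to miss, so here is a small plain-Scala sketch of it (no Spark needed; the value names are illustrative only). It shows why the dot has to be escaped for replaceAll but not for replace:

```scala
val name = "user.name.last"

// replace works on literal text, so "." means an actual dot.
val literal = name.replace(".", "_")          // "user_name_last"

// replaceAll interprets its first argument as a regular expression.
// An unescaped "." matches any character, so every character is replaced.
val unescaped = name.replaceAll(".", "_")     // only underscores, same length as name

// Escaping the dot restores the literal meaning.
val escaped = name.replaceAll("\\.", "_")     // "user_name_last"
```

For a plain dot-to-underscore rename, replace is the simpler choice; replaceAll is only needed when the pattern really is a regex.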

2. Using selectExpr:

// Backticks make Spark read the dotted name as one column, not as a struct field path
val expr = df.columns.map(x => s"`$x` AS `${x.replace(".", "_")}`")

df.selectExpr(expr: _*).show()
//+--------------+--------------+-----------------+
//|user_id_number|user_name_last|user_phone_mobile|
//+--------------+--------------+-----------------+
//|             1|             2|                3|
//+--------------+--------------+-----------------+
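
The backticks in the selectExpr variant matter because Spark parses an unquoted dot as struct field access. A minimal sketch of that behaviour, assuming a throwaway local SparkSession (the app name and the value names unquoted and quoted are illustrative, not from the original answer):

```scala
import scala.util.Try
import org.apache.spark.sql.SparkSession

// Assumption: a local session for the demo; in spark-shell the existing session is reused.
val spark = SparkSession.builder().master("local[*]").appName("dot-columns").getOrCreate()
import spark.implicits._

val df = Seq(("1", "2")).toDF("user.id_number", "user.name.last")

// Unquoted: Spark looks for a column `user` with a field `id_number`,
// which does not exist here, so the select fails during analysis.
val unquoted = Try(df.select("user.id_number"))

// Quoted: the whole string is taken as a single column name.
val quoted = df.select("`user.id_number`")
```

This is one reason to rename such columns early: after the rename, no quoting is needed in later expressions.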

3. Using .withColumnRenamed:

df.columns.foldLeft(df) { (tmpDf, name) => tmpDf.withColumnRenamed(name, name.replace(".", "_")) }.show()
//+--------------+--------------+-----------------+
//|user_id_number|user_name_last|user_phone_mobile|
//+--------------+--------------+-----------------+
//|             1|             2|                3|
//+--------------+--------------+-----------------+
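
Since the question mentions JSON input, the dots may come from nested struct fields (user -> name -> last) rather than literal dots in top-level column names. In that case the three approaches above do not apply directly, because df.columns only lists the top-level columns. The following is a rough sketch for that situation, not part of the original answer; flatPaths and flattenStructs are hypothetical helper names. It only descends into structs, so arrays and maps are left as they are.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.StructType

// Hypothetical helper: for every leaf field, return (backticked path, underscore alias).
def flatPaths(schema: StructType, prefix: Seq[String] = Nil): Seq[(String, String)] =
  schema.fields.toSeq.flatMap { field =>
    val path = prefix :+ field.name
    field.dataType match {
      // Recurse into nested structs, carrying the path so far.
      case nested: StructType => flatPaths(nested, path)
      case _ =>
        // Quote each segment so a literal dot inside a name is not read as nesting.
        val quoted = path.map(p => s"`$p`").mkString(".")
        val alias = path.mkString("_").replace(".", "_")
        Seq((quoted, alias))
    }
  }

// Hypothetical helper: select every leaf field under its flattened name.
def flattenStructs(df: DataFrame): DataFrame =
  df.select(flatPaths(df.schema).map { case (path, alias) => col(path).alias(alias) }: _*)
```

Calling flattenStructs on a DataFrame read from JSON would then give top-level columns such as user_id_number and user_name_last, provided the flattened names do not collide with each other.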

