How to map Spark DataFrame row values to columns?
I am trying to map the values in the rows of one DataFrame to columns in another DataFrame.
I have the following DataFrame; the values in "id" are known to be unique:
sqlContext.createDataFrame(Seq(("a", 1),("b",2))).toDF("id","number")
And:
sqlContext.createDataFrame(Seq(("jane",10),("John",12))).toDF("mcid", "age")
I want to produce a DataFrame with the following schema:
| mcid | age | a | b |
I have no idea what you are trying to do, but assuming you have this:
val df1 = sqlContext.createDataFrame(Seq(("a", 1),("b",2))).toDF("id","number")
val df2 = sqlContext.createDataFrame(Seq(("jane",10),("John",12))).toDF("mcid", "age")
This will get you a DataFrame with the schema you are looking for:
df2.join(df1).groupBy($"mcid", $"age").pivot("id").sum("number")
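For reference, here is a self-contained sketch of the same idea that can be pasted into spark-shell or run as a script. It rests on assumptions that are not in the original posts: Spark 2.1 or later, a local SparkSession in place of sqlContext, and an explicit crossJoin in place of the condition-less join (some Spark 2.x releases reject a join without a condition unless cross joins are enabled). The names spark, crossed and result are only illustrative.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: Spark 2.1+ running locally. In spark-shell a session named
// `spark` already exists and getOrCreate() simply returns it.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("pivot-sketch")
  .getOrCreate()
import spark.implicits._

val df1 = Seq(("a", 1), ("b", 2)).toDF("id", "number")
val df2 = Seq(("jane", 10), ("John", 12)).toDF("mcid", "age")

// Explicit cross join: every (mcid, age) row is paired with every (id, number) row.
val crossed = df2.crossJoin(df1)

// Group by the columns to keep, then turn each distinct "id" into a column.
// Listing the pivot values up front fixes the column order and saves Spark
// a pass over the data to discover them.
val result = crossed
  .groupBy($"mcid", $"age")
  .pivot("id", Seq("a", "b"))
  .sum("number")

result.printSchema()
result.show()
```

Because "id" is unique, each (mcid, age, id) group holds exactly one row, so sum simply passes the single "number" through; the result has the columns mcid, age, a and b, with a = 1 and b = 2 on every row. Note that the cross join multiplies the row counts of the two DataFrames, so this is only practical while df1 stays small.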