
How to map spark DataFrame row values to columns?

I am trying to map values in rows to columns in another dataframe.

I have the following DataFrame, where the values in "id" are known to be unique:

sqlContext.createDataFrame(Seq(("a", 1),("b",2))).toDF("id","number")

And:

sqlContext.createDataFrame(Seq(("jane",10),("John",12))).toDF("mcid", "age")

And I wish to produce a DataFrame with the schema:

| mcid | age | a | b |

I am not entirely sure what you are trying to do, but assuming you have this:

val df1 = sqlContext.createDataFrame(Seq(("a", 1),("b",2))).toDF("id","number")
val df2 = sqlContext.createDataFrame(Seq(("jane",10),("John",12))).toDF("mcid", "age")

This will get you a DataFrame with the schema you are looking for:

// join with no condition = cross join; pivot turns each distinct "id" into a column
df2.join(df1).groupBy($"mcid", $"age").pivot("id").sum("number")
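To see why this produces the desired columns, the same join-then-pivot shape can be sketched with plain Scala collections (this is illustrative only, not the Spark API; the data mirrors the example DataFrames above):

```scala
// The two "tables" from the question: (id, number) and (mcid, age).
val df1 = Seq(("a", 1), ("b", 2))
val df2 = Seq(("jane", 10), ("John", 12))

// Cross join: pair every (mcid, age) row with every (id, number) row.
val joined = for ((mcid, age) <- df2; (id, number) <- df1)
  yield (mcid, age, id, number)

// Pivot: one output column per distinct id, holding the summed
// number for that id within each (mcid, age) group.
val ids = df1.map(_._1).distinct.sorted
val pivoted = joined
  .groupBy { case (mcid, age, _, _) => (mcid, age) }
  .toSeq
  .map { case ((mcid, age), rows) =>
    val sums = ids.map(id => rows.collect { case (_, _, `id`, n) => n }.sum)
    (mcid, age, sums)
  }
```

Here `pivoted` holds one row per (mcid, age) pair with the per-id sums in the order `a`, `b`, matching the `| mcid | age | a | b |` schema. In Spark, `pivot("id")` discovers the distinct id values and creates those columns for you.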

