简体   繁体   中英

How to add some values in a dataframe in Scala Spark?

Here is the dataframe I have for now, suppose there are totally 4 days{1,2,3,4}:

+-------------+----------+------+
| key         | Time     | Value|
+-------------+----------+------+
|  1          |        1 | 1    | 
|  1          |        2 | 2    |
|  1          |        4 | 3    |
|  2          |        2 | 4    |
|  2          |        3 | 5    | 
+-------------+----------+------+

And what I want is

+-------------+----------+------+
| key         | Time     | Value|  
+-------------+----------+------+
|  1          |        1 | 1    | 
|  1          |        2 | 2    | 
|  1          |        3 | null | 
|  1          |        4 | 3    | 
|  2          |        1 | null |
|  2          |        2 | 4    |
|  2          |        3 | 5    | 
|  2          |        4 | null | 
+-------------+----------+------+

If there is some ways that can help me get this?

Say df1 is our main table:

+---+----+-----+
|key|Time|Value|
+---+----+-----+
|1  |1   |1    |
|1  |2   |2    |
|1  |4   |3    |
|2  |2   |4    |
|2  |3   |5    |
+---+----+-----+

We can use the following transformations:

val data = df1
  // we first group by and aggregate the values to a sequence between 1 and 4 (your number)
  .groupBy("key")
  .agg(sequence(lit(1), lit(4)).as("Time"))
  // we explode the sequence, thus creating all 'Time' per 'key'
  .withColumn("Time", explode(col("Time")))
  // finally, we join with our main table on 'key' and 'Time'
  .join(df1, Seq("key", "Time"), "left")

To get this output:

+---+----+-----+
|key|Time|Value|
+---+----+-----+
|1  |1   |1    |
|1  |2   |2    |
|1  |3   |null |
|1  |4   |3    |
|2  |1   |null |
|2  |2   |4    |
|2  |3   |5    |
|2  |4   |null |
+---+----+-----+

Which should be what you are looking for, good luck!

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM