I have one column and I need to find out date difference in days between each row, partitioned by Id.. This have to be done using Spark SQL. I have written below code but somehow the answer is coming wrong. Kindly let me know where am I going wrong.
WindowSpec window = Window.partitionBy("id").orderBy("date_time");
Dataset<Row> resultSet = testData.withColumn("day_diff", functions.datediff(col("date_time"), functions.lag(col("date_time"), 1).over(window)));
You should probably do it one by one.
testData
.withColumn("prev_date", functions.lag(col("date_time"),1).over(window))
.withColumn("day_diff", functions.datediff(col("date_time")), col("prev_date"))
.drop(col("prev_date"))
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.