Date Difference within Same Column in Apache Spark
I have one column and I need to find the date difference in days between consecutive rows, partitioned by id. This has to be done using Spark SQL. I have written the code below, but the result is wrong. Please let me know where I am going wrong.
WindowSpec window = Window.partitionBy("id").orderBy("date_time");
Dataset<Row> resultSet = testData.withColumn("day_diff", functions.datediff(col("date_time"), functions.lag(col("date_time"), 1).over(window)));
You should probably do it one step at a time: compute the previous row's date first, then take the difference.
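Dataset<Row> resultSet = testData
    // Previous row's date_time within the same id, ordered by date_time
    .withColumn("prev_date", functions.lag(col("date_time"), 1).over(window))
    // datediff(end, start): days from prev_date to date_time; null on the first row of each id
    .withColumn("day_diff", functions.datediff(col("date_time"), col("prev_date")))
    // The helper column is no longer needed
    .drop(col("prev_date"));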
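If the numbers still look wrong, it can help to check them against a small reference computed outside Spark. The following is only a plain-Java sketch of what I understand the lag + datediff combination to produce, not Spark code: rows are grouped by id, sorted by date_time, and each row gets the number of calendar days since the previous row, with null for the first row of each id. It assumes that datediff compares calendar dates and ignores the time of day, so two timestamps two hours apart can still give 1 if they straddle midnight. The class DayDiffSketch and the Event type are hypothetical names made up for this illustration.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DayDiffSketch {

    /** Hypothetical row type: one (id, date_time) pair. */
    public static final class Event {
        final String id;
        final LocalDateTime dateTime;

        public Event(String id, LocalDateTime dateTime) {
            this.id = id;
            this.dateTime = dateTime;
        }
    }

    /**
     * For each id, returns the day differences between consecutive rows
     * in date_time order. The first row of each id has no previous row,
     * so its entry is null (as lag would return null there).
     */
    public static Map<String, List<Long>> dayDiffs(List<Event> events) {
        // Group rows by id: the equivalent of partitionBy("id").
        Map<String, List<LocalDateTime>> byId = new LinkedHashMap<>();
        for (Event e : events) {
            byId.computeIfAbsent(e.id, k -> new ArrayList<>()).add(e.dateTime);
        }

        Map<String, List<Long>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<LocalDateTime>> entry : byId.entrySet()) {
            List<LocalDateTime> times = entry.getValue();
            // Sort within the group: the equivalent of orderBy("date_time").
            times.sort((a, b) -> a.compareTo(b));

            List<Long> diffs = new ArrayList<>();
            LocalDate prev = null;
            for (LocalDateTime t : times) {
                // Assumption: only the calendar date matters, time of day is dropped.
                LocalDate current = t.toLocalDate();
                if (prev == null) {
                    diffs.add(null);
                } else {
                    diffs.add(ChronoUnit.DAYS.between(prev, current));
                }
                prev = current;
            }
            result.put(entry.getKey(), diffs);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample rows, deliberately out of order.
        List<Event> rows = Arrays.asList(
            new Event("A", LocalDateTime.parse("2020-01-05T12:00:00")),
            new Event("A", LocalDateTime.parse("2020-01-01T23:00:00")),
            new Event("B", LocalDateTime.parse("2020-03-01T00:00:00")),
            new Event("A", LocalDateTime.parse("2020-01-02T01:00:00")));
        System.out.println(dayDiffs(rows));
    }
}
```

Comparing a handful of ids against this kind of reference should show whether the surprise comes from the window ordering, from time-of-day being ignored, or from date_time not being parsed as a date at all (for example a string column in a non-standard format).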