
Spark DataFrames: CASE statement while using Window PARTITION function syntax

I need to check a condition: if ReasonCode is "YES", ProcessDate should be used as one of the PARTITION columns; otherwise it should not.

The equivalent SQL query is:

SELECT PNum, SUM(SIAmt) OVER (PARTITION BY PNum,
                                           ReasonCode , 
                                           CASE WHEN ReasonCode = 'YES' THEN ProcessDate ELSE NULL END 
                              ORDER BY ProcessDate RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) SumAmt 
from TABLE1

So far I have tried the query below, but I have not been able to incorporate the condition "CASE WHEN ReasonCode = 'YES' THEN ProcessDate ELSE NULL END" using Spark DataFrames:

val df = inputDF.select("PNum")
.withColumn("SumAmt", sum("SIAmt").over(Window.partitionBy("PNum","ReasonCode").orderBy("ProcessDate")))

Input data:

---------------------------------------
Pnum    ReasonCode  ProcessDate SIAmt
---------------------------------------
1       No          1/01/2016   200
1       No          2/01/2016   300
1       Yes         3/01/2016   -200
1       Yes         4/01/2016   200
---------------------------------------

Expected output:

---------------------------------------------
Pnum    ReasonCode  ProcessDate SIAmt  SumAmt
---------------------------------------------
1       No          1/01/2016   200     200 
1       No          2/01/2016   300     500
1       Yes         3/01/2016   -200    -200
1       Yes         4/01/2016   200      200
---------------------------------------------

Any suggestions or help on doing this with Spark DataFrames rather than a spark-sql query?

You can write an exact copy of the SQL in DataFrame API form:

import org.apache.spark.sql.functions._
import org.apache.spark.sql.expressions._
// ProcessDate joins the partition key only for rows where ReasonCode is "Yes"
val w = Window
  .partitionBy(col("PNum"), col("ReasonCode"),
    when(col("ReasonCode") === "Yes", col("ProcessDate")).otherwise(null))
  .orderBy("ProcessDate")
val df = inputDF.withColumn("SumAmt", sum("SIAmt").over(w))
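
If keeping the CASE expression in its SQL form is preferable, `expr` from `org.apache.spark.sql.functions` can parse it into a column that is then passed to `partitionBy`. The following is a minimal sketch, assuming the column names from the question; the object and method names (`CaseExprPartition`, `withSumAmt`) are hypothetical and only there for illustration.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, expr, sum}

object CaseExprPartition {
  // Hypothetical helper: adds SumAmt, using a SQL CASE expression as a partition key.
  def withSumAmt(df: DataFrame): DataFrame = {
    val w = Window
      .partitionBy(
        col("PNum"),
        col("ReasonCode"),
        // Evaluates to ProcessDate for "Yes" rows and to NULL for all other rows
        expr("CASE WHEN ReasonCode = 'Yes' THEN ProcessDate ELSE NULL END"))
      .orderBy("ProcessDate")
    df.withColumn("SumAmt", sum("SIAmt").over(w))
  }
}
```

Note that the string comparison is case-sensitive, so the literal has to match the stored values (`Yes` in the sample data, not `YES`).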

You can also add .rowsBetween(Long.MinValue, 0) to the window specification. This should give you:

+----+----------+-----------+-----+------+
|Pnum|ReasonCode|ProcessDate|SIAmt|SumAmt|
+----+----------+-----------+-----+------+
|   1|       Yes|  4/01/2016|  200|   200|
|   1|        No|  1/01/2016|  200|   200|
|   1|        No|  2/01/2016|  300|   500|
|   1|       Yes|  3/01/2016| -200|  -200|
+----+----------+-----------+-----+------+
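
For completeness, here is a self-contained sketch that rebuilds the sample data from the question and applies the conditional partition with an explicit frame. It assumes a local Spark session and Spark 2.1 or later, where `Window.unboundedPreceding` and `Window.currentRow` are available as named equivalents of `Long.MinValue` and `0`. The object and method names (`ConditionalPartitionSketch`, `withRunningSum`) are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, sum, when}

object ConditionalPartitionSketch {
  // Hypothetical helper: running sum of SIAmt per conditional partition.
  def withRunningSum(df: DataFrame): DataFrame = {
    val w = Window
      .partitionBy(
        col("PNum"),
        col("ReasonCode"),
        // when(...) without otherwise(...) yields NULL for non-matching rows
        when(col("ReasonCode") === "Yes", col("ProcessDate")))
      // ProcessDate is a string here, so the ordering is lexicographic;
      // that is sufficient for the sample data but not for dates in general
      .orderBy("ProcessDate")
      .rowsBetween(Window.unboundedPreceding, Window.currentRow)
    df.withColumn("SumAmt", sum("SIAmt").over(w))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: local run for illustration only
      .appName("conditional-partition-sketch")
      .getOrCreate()
    import spark.implicits._

    // Sample rows copied from the question
    val inputDF = Seq(
      (1, "No", "1/01/2016", 200),
      (1, "No", "2/01/2016", 300),
      (1, "Yes", "3/01/2016", -200),
      (1, "Yes", "4/01/2016", 200)
    ).toDF("PNum", "ReasonCode", "ProcessDate", "SIAmt")

    withRunningSum(inputDF).orderBy("ProcessDate").show()
    spark.stop()
  }
}
```

On the sample data this should reproduce the SumAmt column shown above. When a window has an `orderBy` but no explicit frame, Spark defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which matches the original SQL; `rowsBetween` behaves differently only when several rows in a partition share the same ProcessDate.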

