How to fix the error mismatched input 'partition' for window functions in Spark SQL?
Migrating window functions from SQL to Spark Scala
Here is a SQL expression that I'm trying to migrate to Spark Scala.
SELECT
    a.senderId,
    b.company_id,
    ROW_NUMBER() OVER(PARTITION BY a.senderId ORDER BY b.chron_rank) AS rnk
FROM df1 a
JOIN df2 b
  ON a.senderId = b.member_id
WHERE a.datepartition BETWEEN concat(b.start_date, '-00') AND concat(b.end_date, '-00')
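For intuition before migrating: `ROW_NUMBER() OVER(PARTITION BY a.senderId ORDER BY b.chron_rank)` groups rows by the partition key, sorts each group by the ordering key, and numbers the rows from 1 within each group. A minimal, Spark-free Scala sketch of that semantics (the `Rec` case class and sample data are hypothetical, for illustration only):

```scala
// Hypothetical record type standing in for the joined df1/df2 row.
case class Rec(senderId: String, companyId: Int, chronRank: Int)

val rows = Seq(
  Rec("s1", 10, 2),
  Rec("s1", 11, 1),
  Rec("s2", 12, 1)
)

// ROW_NUMBER() OVER(PARTITION BY senderId ORDER BY chronRank):
// group by the partition key, sort within each group, number from 1.
val ranked: Seq[(Rec, Int)] =
  rows.groupBy(_.senderId).toSeq.flatMap { case (_, group) =>
    group.sortBy(_.chronRank).zipWithIndex.map { case (r, i) => (r, i + 1) }
  }
```

In Spark this same computation is expressed either as a SQL window expression or via `Window.partitionBy(...).orderBy(...)` with `row_number()`.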
I'm a bit lost with window functions; I started with something like this,
val temp = df2.join(df1, $"dimPosition.member_id" === $"df1.senderId")
.select($"df1.senderId", $"df2.company_id")
.......
Try this -
df2.as("b")
.join(df1.as("a"), $"a.senderId" === $"b.member_id" && $"a.datepartition".between(
concat($"b.start_date",lit("-00")), concat($"b.end_date", lit("-00")))
)
.selectExpr("a.senderId",
"b.company_id",
"ROW_NUMBER() OVER(PARTITION BY a.senderId ORDER BY b.chron_rank) AS rnk")
Try this.. though you may run into an issue with the where clause..
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{row_number, concat, lit}

val temp = df2.as("b").join(df1.as("a"), $"b.member_id" === $"a.senderId")
  .select($"a.senderId", $"b.company_id", $"b.chron_rank",
          $"a.datepartition", $"b.start_date", $"b.end_date")
  .withColumn("rnk", row_number().over(
    Window.partitionBy("senderId").orderBy("chron_rank")))
  .where($"datepartition".between(concat($"start_date", lit("-00")),
                                  concat($"end_date", lit("-00"))))
  .select("senderId", "company_id", "rnk")