Need help on sub-query of Spark-SQL Databricks

I have the SQL below, which returns the dataset shown further down. But I want to display only one record with Open status: the one that has the MIN date.

SELECT distinct o.svc_ord_nbr AS SVC_ORD_NBR,
  o.svc_ord_stat_nm AS SVC_ORD_STAT_NM,
  min(t.start_date_est) AS STRT_DT_EST, t.status_text
FROM A o inner join B t on t.ticket=o.notif_nbr
  and o.svc_ord_nbr in ('021519_574819','110714_246149')
Group by o.svc_ord_nbr, o.svc_ord_stat_nm, t.status_text

The result dataset looks like this: (image: enter image description here)

I want only the first row, the one with the MIN of STRT_DT_EST. Thanks in advance.

Have you tried window functions for this use case? For example:

spark.sql(
  """
  |SELECT a.*,
  |ROW_NUMBER() OVER(PARTITION BY dept ORDER BY salary) as rn,
  |RANK() OVER(PARTITION BY dept ORDER BY salary) as rank,
  |DENSE_RANK() OVER(PARTITION BY dept ORDER BY salary) as dense_rank,
  |PERCENT_RANK() OVER(PARTITION BY dept ORDER BY salary) as percent_rank,
  |NTILE(3) OVER(PARTITION BY dept ORDER BY salary) as ntile
  |FROM employee a
  |""".stripMargin).show(false)
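
Applied to the query in the question, this could look roughly like the sketch below. It is not tested against the real tables, and it makes some assumptions: that "first row" means one row per svc_ord_nbr (drop the PARTITION BY if you want a single row overall), and that rows with a NULL start_date_est should not win, which is why NULLS LAST is added. The names `ranked` and `rn` are just illustrative aliases. The result image is not available, so no filter on the Open status is included; add one to the inner WHERE clause on whichever column holds it.

```sql
-- Sketch: keep only the earliest-dated row per service order.
SELECT SVC_ORD_NBR, SVC_ORD_STAT_NM, STRT_DT_EST, status_text
FROM (
  SELECT o.svc_ord_nbr     AS SVC_ORD_NBR,
         o.svc_ord_stat_nm AS SVC_ORD_STAT_NM,
         t.start_date_est  AS STRT_DT_EST,
         t.status_text,
         -- rn = 1 marks the row with the smallest start_date_est in each order
         -- (assumption: one row per svc_ord_nbr is what is wanted)
         ROW_NUMBER() OVER (
           PARTITION BY o.svc_ord_nbr
           ORDER BY t.start_date_est ASC NULLS LAST
         ) AS rn
  FROM A o
  INNER JOIN B t ON t.ticket = o.notif_nbr
  WHERE o.svc_ord_nbr IN ('021519_574819', '110714_246149')
) ranked
WHERE rn = 1
```

ROW_NUMBER returns exactly one row per partition even when two rows share the same minimum date; if you would rather keep all tied rows, RANK in the same position should do that instead.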
