SQL-query error in PySpark while using temp-table
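I have a SQL query that I need to run from PySpark (Databricks). Because the query is complex, PySpark cannot read it. Could someone review my query and help me rewrite it as a single SELECT statement that does not use a WITH clause?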
Stage 1:
promotions="""
(WITH VCTE_Promotions as (SELECT v.Shortname, v.Employee_ID_ALT, v.Job_Level,
v.Management_Level, CAST(sysdatetime() AS date) AS PIT_Date, v.Employee_Status_Alt as Employee_Status,
v.Work_Location_Region, v.Work_Location_Country_Desc, v.HML,
[DM_GlobalStaff].[dbo].[V_Worker_PIT].Is_Manager
FROM [DM_GlobalStaff].[dbo].[V_Worker_CUR] as v
LEFT OUTER JOIN
[DM_GlobalStaff].[dbo].[V_Worker_PIT] ON v.Management_Level = [DM_GlobalStaff].[dbo].[V_Worker_PIT].Management_Level),
VCTE_Promotion_v2_Eval as (
SELECT Employee_ID_ALT,
( SELECT max([pit_date]) AS prior_data
FROM [DM_GlobalStaff].[dbo].[V_Worker_PIT] AS t
WHERE (employee_id_alt = a.Employee_ID_ALT) AND (PIT_Date < a.PIT_Date) AND (Is_Manager <> a.Is_Manager) OR
(employee_id_alt = a.Employee_ID_ALT) AND (PIT_Date < a.PIT_Date) AND (Job_Level <> a.Job_Level)) AS prev_job_change_date, Is_Manager
FROM VCTE_Promotions AS a)
SELECT VCTE_Promotion_v2_Eval.Employee_ID_ALT, COALESCE (v_cur.Employee_Status_ALT, N'') AS Curr_Emp_Status,
COALESCE (v_cur.Employee_Type, N'') AS Curr_Employee_Type, v_cur.Hire_Date_Alt AS Curr_Hire_Date,
v_cur.Termination_Date_ALT AS Curr_Termination_Date, COALESCE (v_cur.Termination_Action_ALT, N'')
AS Curr_Termination_Action, cast (v_cur.Job_Level as int) AS Curr_Job_Level,
COALESCE (v_cur.Management_Level, N'') AS Curr_Management_Level,
COALESCE (VCTE_Promotion_v2_Eval.Is_Manager, N'') AS Curr_Ismanager,
CASE WHEN v_m.Job_Level < v_cur.Job_Level OR
(VCTE_Promotion_v2_Eval.Is_Manager = 1 AND v_m.Is_Manager = 0 AND v_m.Job_Level <= v_cur.Job_Level)
THEN 'Promotion' WHEN v_m.Job_Level <> v_cur.Job_Level OR
VCTE_Promotion_v2_Eval.Is_Manager <> v_m.Is_Manager THEN 'Other' ELSE '' END AS Promotion, v_cur.Tenure,
v_cur.Review_Rating_Current
FROM VCTE_Promotion_v2_Eval INNER JOIN
[DM_GlobalStaff].[dbo].[V_Worker_CUR] as v_cur ON VCTE_Promotion_v2_Eval.Employee_ID_ALT = v_cur.Employee_ID_ALT LEFT OUTER JOIN
[DM_GlobalStaff].[dbo].[V_Worker_PIT] as v_m ON VCTE_Promotion_v2_Eval.prev_job_change_date = v_m.PIT_Date AND
VCTE_Promotion_v2_Eval.Employee_ID_ALT = v_m.employee_id_alt
) as pr """
Stage 2:
promotions = spark.read.jdbc(url=jdbcUrl, table=promotions, properties=connectionProperties)
Stage 3:
promotions.count()
promotions.show()
The Stage 2 read fails with the following error:
com.microsoft.sqlserver.jdbc.SQLServerException: Incorrect syntax near the keyword 'WITH'.
---------------------------------------------------------------------------
Py4JJavaError Traceback (most recent call last)
<command-2532359884208251> in <module>()
----> 1 promotions = spark.read.jdbc(url=jdbcUrl, table=promotions, properties=connectionProperties)
/databricks/spark/python/pyspark/sql/readwriter.py in jdbc(self, url, table, column, lowerBound, upperBound, numPartitions, predicates, properties)
533 jpredicates = utils.toJArray(gateway, gateway.jvm.java.lang.String, predicates)
534 return self._df(self._jreader.jdbc(url, table, jpredicates, jprop))
--> 535 return self._df(self._jreader.jdbc(url, table, jprop))
536
537
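For what it's worth, the error looks consistent with how Spark's JDBC reader treats the `table` argument: as far as I know it is spliced into a statement of the form `SELECT * FROM <table> WHERE 1=0` to resolve the schema, so the string has to be valid as a derived table, and SQL Server does not accept a WITH clause in that position. A minimal sketch in plain Python (no Spark needed); `schema_probe` and the two toy queries are made up for illustration and only mimic the shape of that statement:

```python
def schema_probe(table: str) -> str:
    """Mimic the shape of the schema-resolution statement built around `table`.

    Hypothetical helper for illustration only; it is not part of PySpark.
    """
    return f"SELECT * FROM {table} WHERE 1=0"


# A CTE wrapped in parentheses: WITH ends up inside a derived table,
# which SQL Server rejects with "Incorrect syntax near the keyword 'WITH'".
cte_table = "(WITH c AS (SELECT 1 AS x) SELECT x FROM c) AS pr"

# The same logic with the CTE inlined as a nested derived table.
flat_table = "(SELECT x FROM (SELECT 1 AS x) AS c) AS pr"

print(schema_probe(cte_table))
print(schema_probe(flat_table))
```

The second statement is ordinary nested-SELECT syntax, which is why inlining the CTEs should get past this particular error.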
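There is nothing wrong with the query itself; it runs fine at my SQL prompt. But as soon as I use the same query in PySpark (Databricks), I get a syntax error. Help with the PySpark syntax would also be welcome.
Your prompt assistance would be much appreciated.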
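I have no way to test this, but please try it and compare the results to check that everything matches.
Also, because there is no simple join and correlated subqueries are not efficient, I used CROSS APPLY instead of the correlated subquery; the CROSS APPLY should do the job.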
(
SELECT
VCTE_Promotion_v2_Eval.Employee_ID_ALT
,COALESCE(v_cur.Employee_Type, N'') AS Curr_Employee_Type
,v_cur.Review_Rating_Current
FROM (
SELECT
Employee_ID_ALT,
pr.prior_data AS prev_job_change_date,
IsManager
From
( SELECT
v.Shortname
,v.Employee_ID_ALT
,v.Job_Level
,v.Management_Level
,CAST(SYSDATETIME() AS DATE) AS PIT_Date
,v.Employee_Status_Alt AS Employee_Status
,v.Work_Location_Region
,v.Work_Location_Country_Desc
,v.HML
,dbo.T_Mngmt_Level_IsManager_Mapping.IsManager
FROM Worker_CUR AS v
LEFT OUTER JOIN dbo.T_Mngmt_Level_IsManager_Mapping
ON v.Management_Level = dbo.T_Mngmt_Level_IsManager_Mapping.Management_Level
) AS a
Cross APPLY (
SELECT
MAX(PIT_Date) AS prior_data
FROM dbo.V_Worker_PIT_with_IsManager AS t
WHERE (employee_id_alt = a.Employee_ID_ALT)
AND (PIT_Date < a.PIT_Date)
AND (IsManager <> a.IsManager)
OR (employee_id_alt = a.Employee_ID_ALT)
AND (PIT_Date < a.PIT_Date)
AND (Job_Level <> a.Job_Level)
)
AS pr
) as VCTE_Promotion_v2_Eval
INNER JOIN [DM_GlobalStaff].[dbo].[V_Worker_CUR] AS v_cur
ON VCTE_Promotion_v2_Eval.Employee_ID_ALT = v_cur.Employee_ID_ALT
LEFT OUTER JOIN dbo.V_Worker_PIT_with_IsManager AS v_m
ON VCTE_Promotion_v2_Eval.prev_job_change_date = v_m.PIT_Date
AND VCTE_Promotion_v2_Eval.Employee_ID_ALT = v_m.employee_id_alt ) as promotions
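To check the general idea without access to the real database, here is a small sketch using Python's built-in `sqlite3`. It only shows that a CTE and the same query inlined as a derived table return identical rows. The tables `worker_cur` and `worker_pit` and their contents are invented toy data, not the real `V_Worker_CUR` / `V_Worker_PIT` views, and SQLite has no CROSS APPLY, so the sketch keeps the correlated subquery form:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE worker_cur (employee_id INTEGER, job_level INTEGER);
CREATE TABLE worker_pit (employee_id INTEGER, pit_date TEXT, job_level INTEGER);
INSERT INTO worker_cur VALUES (1, 3), (2, 2);
INSERT INTO worker_pit VALUES
    (1, '2019-01-01', 2), (1, '2019-06-01', 2), (1, '2019-09-01', 3),
    (2, '2019-01-01', 2);
""")

# Version 1: the evaluation step written as a CTE.
with_cte = """
WITH ev AS (
    SELECT a.employee_id,
           (SELECT MAX(t.pit_date)
              FROM worker_pit AS t
             WHERE t.employee_id = a.employee_id
               AND t.job_level <> a.job_level) AS prev_job_change_date
      FROM worker_cur AS a
)
SELECT ev.employee_id, ev.prev_job_change_date
  FROM ev
 ORDER BY ev.employee_id
"""

# Version 2: the same step inlined as a derived table, no WITH keyword.
inlined = """
SELECT ev.employee_id, ev.prev_job_change_date
  FROM (
        SELECT a.employee_id,
               (SELECT MAX(t.pit_date)
                  FROM worker_pit AS t
                 WHERE t.employee_id = a.employee_id
                   AND t.job_level <> a.job_level) AS prev_job_change_date
          FROM worker_cur AS a
       ) AS ev
 ORDER BY ev.employee_id
"""

rows_cte = conn.execute(with_cte).fetchall()
rows_inlined = conn.execute(inlined).fetchall()
print(rows_cte == rows_inlined)
```

Once flattened like this, the parenthesised query with its alias is the kind of string that can be passed as `table=` to `spark.read.jdbc(url=jdbcUrl, table=..., properties=connectionProperties)`, as in Stage 2 of the question.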