Hi all, I have a SQL query, shown below, that contains comments and escape characters.
Comments begin with -- and the escape character \n marks the line breaks.
A comment can start with -- and also end with --, but the closing -- is not mandatory.
--Data Source for Tableau DQ Dashboard (Failure Summary) for IM \nWith DEP_DETAIL as\n(\n\nselect distinct D.SYS_NAME as SOR, A.YEAR_MONTH,A.SRC_INSTR_ID,C.data_element ,A.DATE_OF_BIRTH,A.STATUS_DESC,A.OD_REGE_OPT_CD,A.OD_REGE_OPT_CD,C.RULE_DESCRIPTION,A.PRINCIPAL_ENDING_BAL, SERVICING_COST_CENTER,A.PROD_LEVEL1_CD_DESC as level1_cd_desc\n,A.SRC_INSTR_OPEN_DT, EXTRACT(YEAR FROM SRC_INSTR_OPEN_DT) as YEAR_INSTR_OPEN, D.SYS_NAME, E.DQ_CATG_NAME, A.LOB, B.PASS_IND\nfrom UMA_RBG_DQ.F_DQ_IM_ACCT A\nInner Join UMA_RBG_DQ.F_KDE_RESULT_DETAIL B\nON A.SRC_INSTR_ID = B.SRC_INSTR_ID\nAND A.YEAR_MONTH = B.YEAR_MONTH\n\nINNER JOIN UMA_RBG_DQ.D_RULE C\nON B.RULE_ID = C.RULE_ID\nAND C.ACTIVE_FLAG = 'Y'\n\nInner Join UMA_RBG_DQ.D_SYSTEM D\nON C.SYSTEM_ID = D.SYSTEM_ID\nAND D.ACTIVE_FLAG = 'Y'\n\nInner Join D_DQ_CATG E\nON C.DQ_CATG_ID = E.DQ_CATG_ID\nWHERE A.YEAR_MONTH >= to_char(add_months(trunc(sysdate,'MM'),-12),'YYYYMM') -- 'Failure Trend' Dashboard shows failed and passed records for past 1 year, hence -12 from sysdate\n--AND C.RULE_ID <> 'RBG_DQ_1014'\n)\n\nSELECT\n year_month, Data_Element, rule_description as rule, DQ_CATG_NAME as dq_dimension, level1_cd_desc,\n SUM(CASE WHEN PASS_IND = 'N' THEN 1 END) AS failed_cnt,\n SUM(CASE WHEN PASS_IND = 'Y' THEN 1 END) AS passed_cnt,\n SUM(CASE WHEN PASS_IND = 'N' THEN 1 END) * 1.0 / COUNT(*) AS failed_perc,\n SUM(CASE WHEN PASS_IND = 'Y' THEN 1 END ) * 1.0 / COUNT(*) AS passed_perc\nFROM\n DEP_DETAIL\nGROUP BY year_month, Data_Element, rule_description, DQ_CATG_NAME, level1_cd_desc
I want the end result to contain only the SQL query, without the comments and escape characters, as shown below.
With DEP_DETAIL as ( select distinct D.SYS_NAME as SOR, A.YEAR_MONTH,A.SRC_INSTR_ID,C.data_element ,A.DATE_OF_BIRTH,A.STATUS_DESC,A.OD_REGE_OPT_CD,A.OD_REGE_OPT_CD,C.RULE_DESCRIPTION,A.PRINCIPAL_ENDING_BAL, SERVICING_COST_CENTER,A.PROD_LEVEL1_CD_DESC as level1_cd_desc ,A.SRC_INSTR_OPEN_DT, EXTRACT(YEAR FROM SRC_INSTR_OPEN_DT) as YEAR_INSTR_OPEN, D.SYS_NAME, E.DQ_CATG_NAME, A.LOB, B.PASS_IND from UMA_RBG_DQ.F_DQ_IM_ACCT A Inner Join UMA_RBG_DQ.F_KDE_RESULT_DETAIL B ON A.SRC_INSTR_ID = B.SRC_INSTR_ID AND A.YEAR_MONTH = B.YEAR_MONTH INNER JOIN UMA_RBG_DQ.D_RULE C ON B.RULE_ID = C.RULE_ID AND C.ACTIVE_FLAG = 'Y' Inner Join UMA_RBG_DQ.D_SYSTEM D ON C.SYSTEM_ID = D.SYSTEM_ID AND D.ACTIVE_FLAG = 'Y' Inner Join D_DQ_CATG E ON C.DQ_CATG_ID = E.DQ_CATG_ID WHERE A.YEAR_MONTH >= to_char(add_months(trunc(sysdate,'MM'),-12),'YYYYMM') AND C.RULE_ID <> 'RBG_DQ_1014' ) SELECT year_month, Data_Element, rule_description as rule, DQ_CATG_NAME as dq_dimension, level1_cd_desc, SUM(CASE WHEN PASS_IND = 'N' THEN 1 END) AS failed_cnt, SUM(CASE WHEN PASS_IND = 'Y' THEN 1 END) AS passed_cnt, SUM(CASE WHEN PASS_IND = 'N' THEN 1 END) * 1.0 / COUNT(*) AS failed_perc, SUM(CASE WHEN PASS_IND = 'Y' THEN 1 END ) * 1.0 / COUNT(*) AS passed_perc FROM DEP_DETAIL GROUP BY year_month, Data_Element, rule_description, DQ_CATG_NAME, level1_cd_desc
How can I parse this SQL query to get the required result?
If I understood correctly, you want to split the query into lines and, on each line, drop the text that belongs to a -- comment:
' '.join(part for line in query.split('\n') for part in line.split('--')[::2])
UPDATE:
Fixed to keep only the odd-numbered pieces (1st, 3rd, ...) after splitting each line on --,
since a comment can be closed by a second -- with more SQL following it on the same line.
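For readers who want the one-liner spelled out, here is a minimal sketch of the same idea as a function. The name `strip_sql_comments` and the `sample` string are made up for illustration, and the handling of a literal backslash + n sequence is an assumption about how the query text might be stored. It also collapses the leftover whitespace, which the one-liner does not do.

```python
import re

def strip_sql_comments(query: str) -> str:
    """Hypothetical helper: drop `--` comments and line breaks from a SQL string."""
    # Assumption: the text may hold the two characters backslash + n
    # instead of real newlines, so normalise those to real line breaks first.
    query = query.replace("\\n", "\n")
    kept = []
    for line in query.split("\n"):
        # Splitting on "--" alternates code / comment / code / ...,
        # so the even-indexed pieces (0, 2, 4, ...) are the SQL to keep.
        kept.extend(line.split("--")[::2])
    # Collapse the whitespace runs left behind by removed comments and blank lines.
    return re.sub(r"\s+", " ", " ".join(kept)).strip()

# Made-up sample covering a leading comment, a closed inline comment,
# and a fully commented-out line.
sample = "--header comment \nWith T as\n(\nselect a -- note -- , b\nfrom x\n--AND c = 1\n)"
print(strip_sql_comments(sample))
```

Two caveats: a -- inside a quoted SQL string literal would be treated as a comment marker, and a fully commented-out line such as `--AND C.RULE_ID <> 'RBG_DQ_1014'` is dropped entirely, whereas the expected output in the question keeps that condition.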
x = x.replace('\n', ' ').replace('--', '')  # strips only the line breaks and the -- markers; the comment text itself stays