
Oracle - DB Appears to Be Breaking Up JDBC Batch Insert

We're running into a strange problem with one of our ETL applications. Essentially, the process opens a cursor to extract data from one DB, performs some transformations, and then inserts into another DB using batch inserts.

Our commit interval is set to 1000 rows for all tables in the ETL. So after reading each chunk of 1k rows and performing the transformations, we execute a single batch insert against the target DB (using Java, Spring Batch, OJDBC7 v12.1.0.2).
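
For reference, the write path boils down to roughly the following. This is only a minimal sketch, not our actual Spring Batch writer: table and column names are placeholders, and the plain setObject() binds are an assumption about how each parameter gets set. The point is the mechanics: one PreparedStatement per chunk, addBatch() per row, then a single executeBatch() and commit per 1000 rows.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Minimal sketch of the write side: one PreparedStatement, addBatch() per row,
// a single executeBatch() + commit per 1000-row chunk (auto-commit assumed off).
// TARGET_TABLE and the three columns are placeholders, not the real schema.
public class ChunkWriter {

    private static final String INSERT_SQL =
            "Insert into TARGET_TABLE (COL_A, COL_B, COL_C) values (?, ?, ?)";

    public void writeChunk(Connection conn, List<Object[]> rows) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            for (Object[] row : rows) {      // up to 1000 rows per chunk
                ps.setObject(1, row[0]);     // how each parameter is actually bound
                ps.setObject(2, row[1]);     // turns out to matter -- see below
                ps.setObject(3, row[2]);
                ps.addBatch();               // queue the row client-side
            }
            ps.executeBatch();               // intended: one round trip per chunk
            conn.commit();                   // commit interval = chunk size
        }
    }
}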

However, some tables are painfully slow. We first verified that the FKs were disabled (they are). Then we checked to make sure triggers were disabled (they are). We added logging to capture the number of rows in each batch insert (it's 1000, except for the final insert of each thread).

Finally, querying v$sql, it appears that for some tables we see close to 1000 rows/execution, which is what we'd expect. For the painful tables, however, it often hovers around one. We'd expect most tables to sit a bit above 900, since a thread's final commit may not have a full 1k rows, but the extremely low rows-per-execution on some tables is a real head-scratcher.

Some wide tables (100+ columns) are problematic, but others are fine. Some heavily partitioned tables (100+ partitions) are slow, but others are fine. So I'm stumped. Has anyone seen this before? I'm out of ideas!

Thanks!

Here is what we're seeing in v$sql (table names obfuscated):

SELECT *
FROM
  (SELECT REGEXP_SUBSTR (sql_text, 'Insert into [^\(]*') sql_text,
    sql_id,
    TRUNC(
    CASE
      WHEN SUM (executions) > 0
      THEN SUM (rows_processed) / SUM (executions)
    END,2) rows_per_execution
  FROM v$sql
  WHERE parsing_schema_name = 'PFT010_RAPP_PRO'
  AND sql_text LIKE 'Insert into%'
  GROUP BY sql_text,
    sql_id
  )
ORDER BY rows_per_execution ASC;

SQL_STATEMENT                                   SQL_ID         ROWS_PER_EXECUTION
---------------------------------------------------------------------------------
Insert into C__PFT010.S_T___V_R_L_A_            agwu1dd1wr2ux     1.04
Insert into C__PFT010.S_T___G_L_A___T_          7ymw7jtdd9g53     1.25
Insert into C__PFT010.S_T___F_L_A_              7cynt9fmtpz83     1.44
Insert into C__PFT010.S_T___Q_L_A___A_          27v3fuj028cy6     1.57
Insert into C__PFT010.S_T___E_R_P_Y_A_P_S_A_    2t544j11a286z     1.80
Insert into C__PFT010.S_T___I_S_R_              anu8aac070sut     1.84
Insert into C__PFT010.S_T___R_C_R___T_T_        0ydz33s6guvcn     2.05
Insert into C__PFT010.S_T_R___D_R_P_Y_A_P_      7y76r10dmzqvh     2.14
Insert into C__PFT010.S_T___S_L_A___Y_T_S_S_    d7136fg9w033w     2.25
Insert into C__PFT010.S_T___R_C_R___T_T_        2pswt3cmp48s4     2.31
Insert into C__PFT010.S_T___F_R_P_Y_A_P_S_P_    170c7v23yyrms     2.46
Insert into C__PFT010.C_M_N_C___R_S_            fw3wbt4p08kx4     2.66
Insert into C__PFT010.T_A_H_N_T___E_A_Y_        dk5rwm58qqy8b     2.68
Insert into C__PFT010.O_G_L_A___N_O_            gtd4azc32gku4     3.05
Insert into C__PFT010.N_L_S_D___I_B_S_G_        a1a01vthwf2yk     3.15
Insert into C__PFT010.S_T___Q_L_A___A_          7ac6dqwb1jfyh     3.56
Insert into C__PFT010.S_T___J_P_M___A_A_        8n5z68bgkuan1     3.88
Insert into C__PFT010.S_T_R___F_R_P_Y_A_P_S_P   1r62s9qgjucy8     4.25
Insert into C__PFT010.L_A___W_E_S_I_            19rxcmgvct74c     4.28
Insert into C__PFT010.C___U___T_D_T_P_          fdzfdbpdzd18c     4.40
Insert into C__PFT010.S_T_R___U_T_A_S_E_        gs6z5szk9x1n2     4.61
Insert into C__PFT010.S_T_R___H_S_B_I_Y_L_S_    0zsz69pa3ahga     6.58
Insert into C__PFT010.C___F___U_R_P_T_          13xgutdszxab1     8.00
Insert into C__PFT010.S_T_R___J_P_M___A_A_      355gqx1sspdr0    20.19
Insert into C__PFT010.C___D___O___V_            4dmu2bqrra0fg    22.40
Insert into C__PFT010.S_T_R___Q_L_A___A_        dsx0nsrxkz5cf    36.14
Insert into C__PFT010.S_T___V_R_L_A___E_R_      2urs0mbjn3nm2   126.96
Insert into C__PFT010.S_S_C_S___E_A_L_S_G_      awq4fzkk3rsww   179.48
Insert into C__PFT010.S_S_D_S___C_I_I_Y_S_G_    7hpw0kv2z5nsh   417.87
Insert into C__PFT010.S_T_R___D_P_S___M_I_      cjgdmgfznapdk   502.36
Insert into C__PFT010.C___F___E_                6hv4smzmm4hx8   531.00
Insert into C__PFT010.N_L_S_E_R___R_            61zu9j25kgn2u   533.50
Insert into C__PFT010.S_T___B_P_S___A_T_R_      31xpaj7afk054   714.94
Insert into C__PFT010.S_T_R___C_L_A___O_G_V_    dx4mna12hdh9c   749.66
Insert into C__PFT010.S_T___C_P_S___D_R_S_      b7z4y1mruk714   784.56
Insert into C__PFT010.S_T___S_L_A___Y_T_S_S_    29qbqkzhmt83h   792.63
Insert into C__PFT010.A_H_C_R_T_                c6kmyt3a410ch   801.67
Insert into C__PFT010.S_T___X_P_S___H_N_        g6cbtus4bccm8   826.19
Insert into C__PFT010.S_T___K_R_B_T___T_T_      0xps4ddmw322h   873.36
Insert into C__PFT010.C___O___C_L___M_          fz91ju8jw22yc   928.90
Insert into C__PFT010.S_T___H_L_A___T_T_        44rh8722j51fm   982.16
Insert into C__PFT010.C___C_L_S_C_R_T_          4vpnstj8qxy80   991.75
Insert into C__PFT010.S_T___P_L_A___E_U_D_      fgunfbpddf2af   994.50
Insert into C__PFT010.S_T___A_S___I___O_S_      0d0x5ymp2y248   996.09
Insert into C__PFT010.S_T___K_R_B_T___T_T_      61rmgzvqrbudh   999.25
Insert into C__PFT010.S_T___D_P_S___M_I_        bu3hc03yugc8h   999.88
Insert into C__PFT010.L___R_E___E_L_R_C_P_2_00  bvrxzq2v3npc6   999.91
Insert into C__PFT010.N_L_S_G_A_T_S_N___R_C     7sj2ydm7m2z6u   999.96
Insert into C__PFT010.S_T___V_R_L_A___E_E_E     8n6nbsjfpvu70   999.98
Insert into C__PFT010.S_T___L_I_T_B_N_F_T       5b89j9um2jkuu   999.98
Insert into C__PFT010.S_T___D_P_S___M_I_        906jnw4jarsxk   999.98
Insert into C__PFT010.S_T___T_E_R_M_T           9a8vnhnbp5jpn  1000.00

Update: the data is a bit stale at this point (all the fast threads have finished), but here are a few entries with SQL ID, execution count, and rows/execution. All of these tables have (or will have) tens of millions of rows.

SQL ID          Executions  Rows/Execution
agwu1dd1wr2ux   118043      1.04
anu8aac070sut   194768      1.84
dr8qxkcx1xybj   11635084    1.85
a37vqfjqcyd3j   4939754     2.36
8n5z68bgkuan1   2642091     3.95
4sps6y4bkkr6p   268739      13.77
5tdhpn96vpz6d   240227      166.85

Additional SQL trace data...:

Here is an insert for a table that works just fine:

PARSING IN CURSOR #139935830739792 len=315 dep=0 uid=845 oct=2 lid=845 tim=2116001679604 hv=581690290 ad='c168de130' sqlid='906jnw4jarsxk'
Insert into ___PFT010.S_T__A__P_S__E_A_L
 (A_A_D_ID, CREATE_DTM, DOC_TXN_ID, EFF_DT, EFF_END_DT, EFF_START_DT, EXTRACT_DT, G_O__A_D__F_G, MAINT_DTM, MAINT_USERID, P_R_O__E_A_L, P_R_O__R_L_, R_C_RD_T_P, S__ID, S_R_I_E__ID )
 values (:1 , :2 , :3 , :4 , :5 , :6 , :7 , :8 , :9 , :10 , :11 , :12 , :13 , :14 , :15  )
END OF STMT
PARSE #139935830739792:c=0,e=25,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=0,tim=2116001679603
WAIT #139935830739792: nam='SQL*Net more data from client' ela= 72 driver id=675562835 #bytes=3 p3=0 obj#=-1 tim=2116001679871
WAIT #139935830739792: nam='db file sequential read' ela= 551 file#=99 block#=78343664 blocks=1 obj#=1255124 tim=2116001680643
* * * * * * * * * * * * * * * * * * 
* * * a bunch more of these
* * * * * * * * * * * * * * * * * * 
WAIT #139935830739792: nam='db file sequential read' ela= 750 file#=99 block#=66416561 blocks=1 obj#=1255124 tim=2116001788121
WAIT #139935830739792: nam='db file sequential read' ela= 176 file#=99 block#=45513746 blocks=1 obj#=1255124 tim=2116001787117
WAIT #139935830739792: nam='db file sequential read' ela= 750 file#=99 block#=66416561 blocks=1 obj#=1255124 tim=2116001788121
* * * * * * * * * * * * * * * * * * 
* * * r=1000, indicating 1000 rows were written 
* * * * * * * * * * * * * * * * * * 
EXEC #139935830739792:c=57991,e=109295,p=131,cr=69,cu=3313,mis=0,r=1000,dep=0,og=1,plh=0,tim=2116001788944
STAT #139935830739792 id=1 cnt=0 pid=0 pos=1 obj=0 op='LOAD TABLE CONVENTIONAL  SAT1_AD_PRSN_EMAIL (cr=69 pr=131 pw=0 time=109260 us)'
XCTEND rlbk=0, rd_only=0, tim=2116001789025
CLOSE #139935830739792:c=0,e=12,dep=0,type=1,tim=2116016169474

Here's the nasty one. This time, it only got 1 row in the execution:

PARSING IN CURSOR #139935830737584 len=520 dep=0 uid=845 oct=2 lid=845 tim=2116016176184 hv=1904916192 ad='97e96dc98' sqlid='355gqx1sspdr0'
Insert into ___PFT010.S_TE_R_BJ_P_M__D_T_
 (A_A_D_ID, CREATE_DTM, DOC_TXN_ID, EFF_END_DT, EFF_START_DT, ERR_CD, ERR_FIELD, EXTRACT_DT, MAINT_USERID, P_M__A_R_I_T_A_T, P_M__A_T, P_M__C_P_I_T_A_T, P_M__C_T_H_P_AMT, P_M__E_F_DT, P_M__N_G_A_R__A_T, P_M__N_N_C_P_I_T_A_T, P_M__O_T_F_E_A_T, P_M__P_I_B_L_A_T, P_M__T_P, R_C_RD_T_P, S__ID, S_R_I_E__ID, T_A_S_I__D_, Z_R__P_M__I_D )
 values (:1 , :2 , :3 , :4 , :5 , :6 , :7 , :8 , :9 , :10 , :11 , :12 , :13 , :14 , :15 , :16 , :17 , :18 , :19 , :20 , :21 , :22 , :23 , :24  )
END OF STMT
PARSE #139935830737584:c=0,e=62,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=0,tim=2116016176183
PARSE #139935830738688:c=0,e=14,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=140787661,tim=2116016176703
EXEC #139935830738688:c=0,e=49,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=140787661,tim=2116016176780
FETCH #139935830738688:c=0,e=38,p=0,cr=3,cu=0,mis=0,r=0,dep=1,og=4,plh=140787661,tim=2116016176837
CLOSE #139935830738688:c=0,e=4,dep=1,type=3,tim=2116016176862
* * * * * * * * * * * * * * * * * * 
* * * r=1, indicating only 1 row affected by execution
* * * * * * * * * * * * * * * * * *     
EXEC #139935830737584:c=999,e=1065,p=0,cr=4,cu=5,mis=1,r=1,dep=0,og=1,plh=0,tim=2116016177301
STAT #139935830737584 id=1 cnt=0 pid=0 pos=1 obj=0 op='LOAD TABLE CONVENTIONAL  SATERR_BJ_PYMT_DATA (cr=1 pr=0 pw=0 time=50 us)'
XCTEND rlbk=0, rd_only=0, tim=2116016177362
WAIT #139935830737584: nam='log file sync' ela= 396 buffer#=92400 sync scn=2454467328 p3=0 obj#=-1 tim=2116016177846
WAIT #139935830737584: nam='SQL*Net message to client' ela= 0 driver id=675562835 #bytes=1 p3=0 obj#=-1 tim=2116016177877
WAIT #139935830737584: nam='SQL*Net message from client' ela= 1045 driver id=675562835 #bytes=1 p3=0 obj#=-1 tim=2116016178938
CLOSE #139935830737584:c=0,e=4,dep=0,type=0,tim=2116016178981

Here's the same table, with 34 rows instead of 1. The fact that it's inconsistent is what's really bugging me:

PARSING IN CURSOR #139935830737584 len=520 dep=0 uid=845 oct=2 lid=845 tim=2116016169849 hv=1904916192 ad='97e96dc98' sqlid='355gqx1sspdr0'
Insert into ___PFT010.S_TE_R_BJ_P_M__D_T_
 (A_A_D_ID, CREATE_DTM, DOC_TXN_ID, EFF_END_DT, EFF_START_DT, ERR_CD, ERR_FIELD, EXTRACT_DT, MAINT_USERID, P_M__A_R_I_T_A_T, P_M__A_T, P_M__C_P_I_T_A_T, P_M__C_T_H_P_AMT, P_M__E_F_DT, P_M__N_G_A_R__A_T, P_M__N_N_C_P_I_T_A_T, P_M__O_T_F_E_A_T, P_M__P_I_B_L_A_T, P_M__T_P, R_C_RD_T_P, S__ID, S_R_I_E__ID, T_A_S_I__D_, Z_R__P_M__I_D )
 values (:1 , :2 , :3 , :4 , :5 , :6 , :7 , :8 , :9 , :10 , :11 , :12 , :13 , :14 , :15 , :16 , :17 , :18 , :19 , :20 , :21 , :22 , :23 , :24  )
END OF STMT
PARSE #139935830737584:c=0,e=326,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=0,tim=2116016169848
PARSE #139935830738688:c=0,e=19,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=140787661,tim=2116016170242
EXEC #139935830738688:c=0,e=59,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=140787661,tim=2116016170329
FETCH #139935830738688:c=0,e=44,p=0,cr=3,cu=0,mis=0,r=0,dep=1,og=4,plh=140787661,tim=2116016170393
CLOSE #139935830738688:c=0,e=3,dep=1,type=3,tim=2116016170421
* * * * * * * * * * * * * * * * * * 
* * * r=34, indicating only 34 rows affected by the execution. WHAT IS HAPPENING?!?!
* * * * * * * * * * * * * * * * * *     
EXEC #139935830737584:c=5000,e=4592,p=0,cr=11,cu=48,mis=1,r=34,dep=0,og=1,plh=0,tim=2116016174513
STAT #139935830737584 id=1 cnt=0 pid=0 pos=1 obj=0 op='LOAD TABLE CONVENTIONAL  SATERR_BJ_PYMT_DATA (cr=8 pr=0 pw=0 time=3648 us)'
XCTEND rlbk=0, rd_only=0, tim=2116016174622
WAIT #139935830737584: nam='log file sync' ela= 684 buffer#=92313 sync scn=2454467326 p3=0 obj#=-1 tim=2116016175452
WAIT #139935830737584: nam='SQL*Net message to client' ela= 1 driver id=675562835 #bytes=1 p3=0 obj#=-1 tim=2116016175551
WAIT #139935830737584: nam='SQL*Net message from client' ela= 481 driver id=675562835 #bytes=1 p3=0 obj#=-1 tim=2116016176058
CLOSE #139935830737584:c=0,e=6,dep=0,type=0,tim=2116016176107

OK, this one is interesting, and unfortunately this answer only solves 99% of the problem...

First, by looking at the bind variables we determined that the types of the parameters we were binding kept flipping back and forth, and each time that happened the driver executed the previous statement and parsed a new one (even though we only issued a single executeBatch() against the PreparedStatement). So we ended up seeing this in the trace logs:

Row #  Bind :1         Bind :2         Bind :3         Bind :4         Bind :5         
-----  --------------  --------------  --------------  --------------  --------------  
--parse--
    1  VARCHAR2(128)   NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
    2  VARCHAR2(128)   NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
    3  VARCHAR2(128)   NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
    4  VARCHAR2(128)   NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
--execute & parse--
    5  VARCHAR2(128)   VARCHAR2(32)    TIMESTAMP       CLOB            VARCHAR2(2000)
--execute & parse--
    6  VARCHAR2(128)   NUMBER          TIMESTAMP       VARCHAR2(32)    VARCHAR2(2000)
--execute & parse--
    7  VARCHAR2(128)   NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
    8  VARCHAR2(128)   NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
    9  VARCHAR2(128)   NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
   10  VARCHAR2(128)   NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
--execute & parse--
   11  VARCHAR2(2000)  NUMBER          VARCHAR2(32)    CLOB            VARCHAR2(2000)
--execute & parse--
   12  VARCHAR2(2000)  NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
   13  VARCHAR2(2000)  NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
--execute--

With some more digging, we determined that JDBC cannot automatically determine the data type of a null object the way it can for a non-null value. When a column is consistent (always null or always populated) this isn't a problem, but when the data varies it's brutal.
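
To make that concrete, here is roughly what the "before" binding looks like when the Java value alone drives the bind type (a sketch, not the actual ETL code; java.sql.PreparedStatement / SQLException as in the earlier sketch). Non-null values let the driver pick a type from the object's class, but a null gives it nothing to inspect, so the type chosen for that position can differ from the previous row's, and per the trace above each such change splits the batch with an execute-and-reparse.

// "Before" binding sketch: the bind type is inferred from the value itself.
// For a null there is nothing to infer from, so the type the driver settles on
// can disagree with the type used for the same position one row earlier.
private void bindRowNaively(PreparedStatement ps, Object[] row) throws SQLException {
    for (int i = 0; i < row.length; i++) {
        ps.setObject(i + 1, row[i]);   // null values carry no type information
    }
    ps.addBatch();
}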

Since we're loading from files, we didn't have source data types, but fortunately we WERE able to get the target data types (which should match), so we were able to specify the type as we set each parameter on the PreparedStatement.
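
In code, that change amounts to carrying the target column's java.sql.Types code alongside each value and binding with it explicitly, roughly like this (a sketch under the assumption that targetSqlTypes[] was looked up from the target table's metadata; the names are placeholders):

// "After" binding sketch: every parameter is bound under the target column's
// SQL type, so nulls no longer force the driver to guess and consecutive rows
// keep the same bind types. targetSqlTypes[i] holds a java.sql.Types constant,
// e.g. Types.NUMERIC, Types.TIMESTAMP, Types.CLOB.
private void bindRowWithTargetTypes(PreparedStatement ps, Object[] row, int[] targetSqlTypes)
        throws SQLException {
    for (int i = 0; i < row.length; i++) {
        if (row[i] == null) {
            ps.setNull(i + 1, targetSqlTypes[i]);            // explicit type for nulls
        } else {
            ps.setObject(i + 1, row[i], targetSqlTypes[i]);  // same type for non-nulls
        }
    }
    ps.addBatch();
}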

That change brought some major improvements, but we were still seeing the following:

Row #  Bind :1         Bind :2         Bind :3         Bind :4         Bind :5         
-----  --------------  --------------  --------------  --------------  --------------  
--parse--
    1  VARCHAR2(128)   NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
    2  VARCHAR2(128)   NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
    3  VARCHAR2(128)   NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
    4  VARCHAR2(128)   NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
    5  VARCHAR2(128)   NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
--execute & parse--
    6  VARCHAR2(128)   NUMBER          TIMESTAMP       VARCHAR2(32)    VARCHAR2(2000)
--execute & parse--
    7  VARCHAR2(128)   NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
    8  VARCHAR2(128)   NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
    9  VARCHAR2(128)   NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
   10  VARCHAR2(128)   NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
--execute & parse--
   11  VARCHAR2(2000)  NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
   12  VARCHAR2(2000)  NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
   13  VARCHAR2(2000)  NUMBER          TIMESTAMP       CLOB            VARCHAR2(2000)
--execute--

Definitely an improvement, but we hadn't fixed the CLOBs, and we saw that the VARCHAR2 sizes were sometimes being expanded. After some research, we stumbled onto this post about high version counts due to bind_mismatch, which sounded promising. Tables with nice, consistent data had no problems, but variable-length fields such as email addresses were wreaking havoc on performance. So we ran the following command to force VARCHAR2 binds to a size of 4000:

ALTER SYSTEM SET EVENTS '10503 trace name context forever, level 2001'; 

After that, we tried again and got the following:

Row #  Bind :1         Bind :2         Bind :3         Bind :4         Bind :5         
-----  --------------  --------------  --------------  --------------  --------------  
--parse--
    1  VARCHAR2(40000) NUMBER          TIMESTAMP       CLOB            VARCHAR2(4000)
    2  VARCHAR2(40000) NUMBER          TIMESTAMP       CLOB            VARCHAR2(4000)
    3  VARCHAR2(40000) NUMBER          TIMESTAMP       CLOB            VARCHAR2(4000)
    4  VARCHAR2(40000) NUMBER          TIMESTAMP       CLOB            VARCHAR2(4000)
    5  VARCHAR2(40000) NUMBER          TIMESTAMP       CLOB            VARCHAR2(4000)
--execute & parse--                                                             
    6  VARCHAR2(40000) NUMBER          TIMESTAMP       VARCHAR2(32)    VARCHAR2(4000)
--execute & parse--                                                             
    7  VARCHAR2(40000) NUMBER          TIMESTAMP       CLOB            VARCHAR2(4000)
    8  VARCHAR2(40000) NUMBER          TIMESTAMP       CLOB            VARCHAR2(4000)
    9  VARCHAR2(40000) NUMBER          TIMESTAMP       CLOB            VARCHAR2(4000)
   10  VARCHAR2(40000) NUMBER          TIMESTAMP       CLOB            VARCHAR2(4000)
   11  VARCHAR2(40000) NUMBER          TIMESTAMP       CLOB            VARCHAR2(4000)
   12  VARCHAR2(40000) NUMBER          TIMESTAMP       CLOB            VARCHAR2(4000)
   13  VARCHAR2(40000) NUMBER          TIMESTAMP       CLOB            VARCHAR2(4000)
--execute--

We're nearly perfect now, but when we get a null CLOB we can't figure out how to stop JDBC from binding it as a VARCHAR2. Fortunately, we only have a few tables with nullable CLOB columns, so we've dramatically improved performance and reduced the impact of the changing binds. But part of me definitely wants that last 1%... any suggestions?
