
Hive ACID update and delete error

I am testing with Hive 1.2.1 and Tez 0.7, but updates and deletes against an ACID table fail. Here is the table structure:

CREATE EXTERNAL TABLE IF NOT EXISTS working.dw_items_w
(
-- column definitions omitted
)
CLUSTERED BY (id) into 5000 buckets
STORED AS ORC
LOCATION '/sys/edw/working/dw_items_w2'
TBLPROPERTIES ("transactional"="true");

The update query looks like this:

update working.dw_items_w
set
  PROCESS_FLAG = (case
    when (TGT_LSTG_STATUS_ID = 1 and (to_date(SALE_END) - to_date(TGT_AUCT_END_DT)) <> 0)
      or (TGT_LSTG_STATUS_ID in (1,2) and NEW_LSTG_STATUS_ID in (0,4))
    then 'D'
    when (TGT_LSTG_STATUS_ID = 1 and NEW_LSTG_STATUS_ID = 1
          and datediff(to_date(SALE_END), to_date(TGT_AUCT_END_DT)) = 0)
      or (TGT_LSTG_STATUS_ID = 2 and NEW_LSTG_STATUS_ID = 1)
    then 'X'
    else PROCESS_FLAG
  end),
  NEW_LSTG_STATUS_ID = (case
    when TGT_LSTG_STATUS_ID = 0
     and NEW_LSTG_STATUS_ID = 0
     and to_date(SALE_END) < date_sub(to_date(from_unixtime(unix_timestamp(), 'yyyy-MM-dd')), 92)
     and to_date(SALE_END) <> to_date('1969-12-31')
    then 1
    else NEW_LSTG_STATUS_ID
  end)
where PROCESS_FLAG = 'U';

It fails with the following error:

    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1650)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":{"transactionid":19,"bucketid":471,"rowid":0}},"value":ignored}}
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:302)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:249)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
    ... 14 more

Add the following to hive-site.xml:

<property>
    <name>hive.enforce.bucketing</name>
    <value>true</value>
</property>
<property>
    <name>hive.compactor.initiator.on</name>
    <value>true</value>
</property>
<property>
    <name>hive.support.concurrency</name>
    <value>true</value>
</property>
<property>
    <name>hive.compactor.worker.threads</name>
    <value>1</value>
</property>
<property>
    <name>hive.txn.manager</name>
    <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
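
For a quick test before editing hive-site.xml, the client-side properties above can probably also be set for a single session. The following is a minimal sketch, assuming a Hive 1.x CLI or Beeline session. The two compactor properties (hive.compactor.initiator.on and hive.compactor.worker.threads) are read by the metastore service, so this sketch leaves them in hive-site.xml.

```sql
-- Session-level equivalents of the client-side properties listed above.
-- Assumption: run in the same session that later issues UPDATE or DELETE.
SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
```

Session settings are lost when the session ends, so hive-site.xml remains the place for a permanent configuration.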

Then make sure the table you create is a bucketed ORC table:

create table if not exists foo.tableinfo (
  schema_name varchar(32),
  table_name varchar(64),
  department varchar(64),
  country varchar(64),
  state varchar(64),
  city varchar(64),
  granularity int,
  comments varchar(256)
)
clustered by (table_name) into 4 buckets
stored as ORC
TBLPROPERTIES ("orc.compress"="ZLIB", 'transactional'='true');

After that, the following will work:

delete from foo.tableinfo where table_name = 'foo';
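
Since the question is also about updates, here is a hedged sketch of an update against the same foo.tableinfo table, followed by statements that show whether the transaction manager is active. The literal values 'Shanghai' and 'bar' are hypothetical. The update sets city rather than table_name because Hive does not allow updating a bucketing column.

```sql
-- Update a non-bucketing column; table_name is the bucketing column and cannot be updated.
-- The values 'Shanghai' and 'bar' are placeholders for illustration only.
update foo.tableinfo set city = 'Shanghai' where table_name = 'bar';

-- List open and aborted transactions known to the metastore.
show transactions;

-- List compactions that are queued, running, or recently finished.
show compactions;
```

If show transactions reports an error instead of a list, the session is most likely not using DbTxnManager.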
