Hive External Table - Drop Table / Partition and Delete Data

When a Hive external table or partition is dropped, only the metadata is removed from the Hive metastore. The underlying data in HDFS or the Azure storage account is not deleted. What are the options for deleting the data when the table/partition is dropped?

I have been doing some research, and these are my findings:

Option 1: Drop the table/partition and remove the corresponding files in HDFS (or Azure Blob storage, if using HDInsight).
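Option 1 can be sketched as a small Python helper that only plans the work: it builds the DROP PARTITION statement and the matching `hdfs dfs -rm -r` commands without running anything. The table location `/data/poc_drop_partition`, the function name `plan_partition_cleanup`, and the default `partition_date=<value>` directory layout are assumptions for illustration; the real partition locations should be checked (for example with DESCRIBE FORMATTED) before deleting anything, and on HDInsight the location would be a storage URI rather than a plain HDFS path.

```python
from datetime import date


def plan_partition_cleanup(table, table_location, partition_dates, cutoff):
    """Return (hiveql, hdfs_commands) for dropping partitions up to `cutoff`.

    Nothing is executed here; the caller decides how to run the results.
    Assumes partitions live under <table_location>/partition_date=<YYYY-MM-DD>.
    """
    hiveql = (
        f"ALTER TABLE {table} DROP IF EXISTS "
        f"PARTITION (partition_date <= '{cutoff.isoformat()}');"
    )
    base = table_location.rstrip("/")
    hdfs_commands = [
        ["hdfs", "dfs", "-rm", "-r", f"{base}/partition_date={d.isoformat()}"]
        for d in sorted(partition_dates)
        if d <= cutoff
    ]
    return hiveql, hdfs_commands


hiveql, commands = plan_partition_cleanup(
    table="poc_drop_partition",
    table_location="/data/poc_drop_partition",  # hypothetical location
    partition_dates=[date(2017, 10, 10), date(2017, 10, 11), date(2017, 10, 12)],
    cutoff=date(2017, 10, 11),
)
print(hiveql)
for cmd in commands:
    print(" ".join(cmd))
```

Keeping the planning separate from execution makes it possible to review the list of paths before any data is removed.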

Option 2: Temporarily mark the table as managed in the Hive metastore, drop the partition, and then switch the table back to external, as below.

ALTER TABLE poc_drop_partition SET TBLPROPERTIES('EXTERNAL'='FALSE') ;
ALTER TABLE poc_drop_partition DROP IF EXISTS PARTITION(partition_date <= '2017-10-11');
ALTER TABLE poc_drop_partition SET TBLPROPERTIES('EXTERNAL'='TRUE') ;

Similarly, while the table is marked as managed, a DROP TABLE statement will drop the table and the underlying data files.
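One risk with Option 2 is that a failure between the first and last statement leaves the table marked as managed. A minimal sketch of guarding against that, assuming the statements are sent through some `execute(sql)` callable (hypothetical here; it could be a database cursor's execute method), is to restore the EXTERNAL property in a `finally` block:

```python
from contextlib import contextmanager


@contextmanager
def temporarily_managed(execute, table):
    """Mark `table` as managed for the duration of the block, then restore it."""
    execute(f"ALTER TABLE {table} SET TBLPROPERTIES('EXTERNAL'='FALSE')")
    try:
        yield
    finally:
        # Runs even if the statement inside the block raises.
        execute(f"ALTER TABLE {table} SET TBLPROPERTIES('EXTERNAL'='TRUE')")


def drop_partitions_with_data(execute, table, partition_filter):
    """Drop matching partitions while the table is temporarily managed."""
    with temporarily_managed(execute, table):
        execute(f"ALTER TABLE {table} DROP IF EXISTS PARTITION({partition_filter})")


# Demo: collect the statements in a list instead of sending them to Hive.
statements = []
drop_partitions_with_data(
    statements.append, "poc_drop_partition", "partition_date <= '2017-10-11'"
)
for statement in statements:
    print(statement)
```

This only protects against errors raised by the DROP statement in the same session; it does not make the three statements atomic, so other sessions can still see the table as managed in between.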

Is there a better way of doing this? I am aware that there is a JIRA for TRUNCATE functionality that has yet to be worked on.
