简体   繁体   中英

How to rename partition value in Hive?

I have a hive table 'videotracking_playevent' which uses the following partition format (all strings): source/createyear/createmonth/createday. Example: source=home/createyear=2016/createmonth=9/createday=1

I'm trying to update the partition values of createmonth and createday to consistently use double digits instead. Example: source=home/createyear=2016/createmonth=09/createday=01

I've tried to the following query:

ALTER TABLE videotracking_playevent PARTITION (
source='home', 
createyear='2015', 
createmonth='11', 
createday='1'
) RENAME TO PARTITION (
source='home', 
createyear='2015', 
createmonth='11', 
createday='01'
);

However that returns the following, non-descriptive error from hive: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. null Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. null

I've confirmed that this partition exists, and I think I'm using the correct syntax. My hive version is Hive 1.1.0 Any ideas what I might be doing wrong?

There was an issue with old version of Hive with renaming partition. This might be an issue for your case too. Please see this link for detail.

You need to set below two property before executing the rename partition command if you are using Older version of Hive.

set fs.hdfs.impl.disable.cache=false; 
set fs.file.impl.disable.cache=false; 

Now run the query by setting this property.

hive> set fs.hdfs.impl.disable.cache=false;
hive> set  fs.file.impl.disable.cache=false;
hive> ALTER TABLE partition_test PARTITION (year='2016',day='1') RENAME TO PARTITION (year='2016',day='01');
OK
Time taken: 0.28 seconds
hive> show partitions partition_test;
OK
year=2016/day=01
Time taken: 0.091 seconds, Fetched: 1 row(s)
hive>

This issue is fixed in Hive latest version. In my case Hive version is 1.2.1 and it works, without setting that property. Please see the example below.

Create a partitioned table.

hive> create table partition_test(
    > name string,
    > age int)
    > partitioned by (year string, day string);
OK
Time taken: 5.35 seconds
hive> 

Now add the partition and check the newly added partition.

hive> alter table partition_test ADD PARTITION (year='2016', day='1');
OK
Time taken: 0.137 seconds
hive>


hive> show partitions partition_test;
OK
year=2016/day=1
Time taken: 0.169 seconds, Fetched: 1 row(s)
hive>

Rename the partition using RENAME TO PARTITION command and check it.

hive> ALTER TABLE partition_test PARTITION (year='2016',day='1') RENAME TO PARTITION (year='2016',day='01');
OK
Time taken: 0.28 seconds
hive> show partitions partition_test;
OK
year=2016/day=01
Time taken: 0.091 seconds, Fetched: 1 row(s)
hive>

Hope it helps you.

Rename lets you change the value of a partition column. One of use cases is that you can use this statement to normalize your legacy partition column value to conform to its type. In this case, the type conversion and normalization are not enabled for the column values in old partition_spec even with property hive.typecheck.on.insert set to true (default) which allows you to specify any legacy data in form of string in the old partition_spec"

Bug open https://issues.apache.org/jira/browse/HIVE-10362

You can create a copy of the table without partition, then update the column of the table, and then recreate the first one with partition

create table table_name partitioned by (table_column) as 
select
   * 
from
   source_table

That worked for me.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM