I have the following table:
CREATE EXTERNAL TABLE aggregate_status(
m_point VARCHAR(50),
territory VARCHAR(50),
reading_meter VARCHAR(50),
meter_type VARCHAR(500)
)
PARTITIONED BY(
insert_date VARCHAR(10))
STORED AS PARQUET
LOCATION '<the s3 route>/aggregate_status'
TBLPROPERTIES(
'parquet.compression'='SNAPPY'
);
I wish to rename the reading_meter column to reading_mode without losing data. ALTER TABLE works, but the renamed field now shows null for every row.
I'm not the owner of the Hadoop environment I'm working on, so changing properties such as set parquet.column.index.access = true is not an option.
Any help would be appreciated. Thanks.
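I managed to find a solution, at least for small amounts of data: copy the data into a backup table under the new column name, rename the column, then rewrite the original table from the backup.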
CREATE TABLE aggregate_status_bkp AS
SELECT
m_point,
territory,
reading_meter AS reading_mode,
meter_type,
insert_date
FROM aggregate_status;
ALTER TABLE aggregate_status CHANGE COLUMN reading_meter reading_mode VARCHAR(50);
--You might need to temporarily disable strict partition mode, depending on your configuration.
--This is safe here: strict mode is only a safeguard against accidentally overwriting all partitions.
--set hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE aggregate_status PARTITION(insert_date)
SELECT
m_point,
territory,
reading_mode,
meter_type,
insert_date
FROM aggregate_status_bkp;
--set hive.exec.dynamic.partition.mode=strict;
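Before dropping the backup, it may be worth checking that the rewritten table matches it. The query below is only a sketch: it compares, per partition, the total row count and the number of non-null reading_mode values, and returns only partitions where the two tables disagree. The aliases (t, b, total_rows, non_null_modes) are made up for illustration.

```sql
-- An empty result suggests the rewrite preserved every partition.
SELECT
  COALESCE(t.insert_date, b.insert_date) AS insert_date,
  t.total_rows     AS table_rows,
  b.total_rows     AS backup_rows,
  t.non_null_modes AS table_non_null,
  b.non_null_modes AS backup_non_null
FROM (
  SELECT insert_date,
         COUNT(*)            AS total_rows,
         COUNT(reading_mode) AS non_null_modes
  FROM aggregate_status
  GROUP BY insert_date
) t
FULL OUTER JOIN (
  SELECT insert_date,
         COUNT(*)            AS total_rows,
         COUNT(reading_mode) AS non_null_modes
  FROM aggregate_status_bkp
  GROUP BY insert_date
) b
  ON t.insert_date = b.insert_date
WHERE t.total_rows IS NULL
   OR b.total_rows IS NULL
   OR t.total_rows <> b.total_rows
   OR t.non_null_modes <> b.non_null_modes;
```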
The Hive tutorial explains why strict mode exists:
"Another situation we want to protect against dynamic partition insert is that the user may accidentally specify all partitions to be dynamic partitions without specifying one static partition, while the original intention is to just overwrite the sub-partitions of one root partition. We define another parameter hive.exec.dynamic.partition.mode=strict to prevent the all-dynamic partition case."
See https://cwiki.apache.org/confluence/display/Hive/Tutorial#Tutorial-QueryingandInsertingData
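If the session does not allow switching to nonstrict mode, a fully static partition insert should still be accepted, since strict mode only rejects the all-dynamic case. A minimal sketch, assuming a partition value of '2020-01-01' (hypothetical; substitute a real insert_date and repeat once per partition):

```sql
-- Static partition insert: the partition value is fixed in the PARTITION clause,
-- so insert_date is not included in the SELECT list.
INSERT OVERWRITE TABLE aggregate_status PARTITION (insert_date = '2020-01-01')
SELECT
  m_point,
  territory,
  reading_mode,
  meter_type
FROM aggregate_status_bkp
WHERE insert_date = '2020-01-01';
```

This is slower to write out for many partitions, but it avoids touching hive.exec.dynamic.partition.mode at all.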
DROP TABLE aggregate_status_bkp;