
Error while exchanging partition in Hive tables

I am trying to merge the incremental data with an existing hive table.

For testing I created a dummy table from the base table as below:

create table base.dummy like base.fact_table

The table base.fact_table is partitioned on dbsource (string). When I checked the dummy table's DDL, I could see that the partition column is correctly defined.

PARTITIONED BY (
  `dbsource` string)

Then I tried to exchange one of the partitions into the dummy table, dropping it from the dummy table first.

spark.sql("alter table base.dummy drop partition(dbsource='NEO4J')")

The partition NEO4J was dropped successfully, and I ran the exchange statement as below:

spark.sql("ALTER TABLE base.dummy EXCHANGE PARTITION (dbsource = 'NEO4J') WITH TABLE stg.inc_labels_neo4jdata")

The exchange statement is giving an error:

Error: Error while compiling statement: FAILED: ValidationFailureSemanticException table is not partitioned but partition spec exists: {dbsource=NEO4J}

The table I am trying to push the incremental data into is partitioned by dbsource, and I dropped the partition successfully. I am running this from Spark code, and the config is given below:

  val conf = new SparkConf().setAppName("MERGER").set("spark.executor.heartbeatInterval", "120s")
      .set("spark.network.timeout", "12000s")
      .set("spark.sql.inMemoryColumnarStorage.compressed", "true")
      .set("spark.shuffle.compress", "true")
      .set("spark.shuffle.spill.compress", "true")
      .set("spark.sql.orc.filterPushdown", "true")
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.kryoserializer.buffer.max", "512m")
      .set("spark.serializer", classOf[org.apache.spark.serializer.KryoSerializer].getName)
      .set("spark.streaming.stopGracefullyOnShutdown", "true")
      .set("spark.dynamicAllocation.enabled", "false")
      .set("spark.shuffle.service.enabled", "true")
      .set("spark.executor.instances", "4")
      .set("spark.executor.memory", "4g")
      .set("spark.executor.cores", "5")
      .set("hive.merge.sparkfiles","true")
      .set("hive.merge.mapfiles","true")
      .set("hive.merge.mapredfiles","true")

show create table base.dummy:

CREATE TABLE `base`.`dummy`(
  `dff_id` bigint,
  `dff_context_id` bigint,
  `descriptive_flexfield_name` string,
  `model_table_name` string)
PARTITIONED BY (`dbsource` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  '/apps/hive/warehouse/base.db/dummy'
TBLPROPERTIES (
  'orc.compress'='ZLIB')

show create table stg.inc_labels_neo4jdata:

CREATE TABLE `stg`.`inc_labels_neo4jdata`(
  `dff_id` bigint,
  `dff_context_id` bigint,
  `descriptive_flexfield_name` string,
  `model_table_name` string,
  `dbsource` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  '/apps/hive/warehouse/stg.db/inc_labels_neo4jdata'
TBLPROPERTIES (
  'orc.compress'='ZLIB')

Could anyone let me know what mistake I am making here, and what I should change in order to exchange the partition successfully?

My take on this error is that the table stg.inc_labels_neo4jdata is not partitioned the way base.dummy is, and therefore there is no partition to move.

From the Hive documentation:

This statement lets you move the data in a partition from a table to another table that has the same schema and does not already have that partition.

You can check the Hive DDL Manual for EXCHANGE PARTITION.

And the JIRA where this feature was added to Hive. You can read:

This only works if <source table> and <destination table> have the same field schemas and the same partition by parameters. If they do not, the command will throw an exception.
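
To make the requirement concrete, here is a minimal sketch of a valid exchange (t1 and t2 are hypothetical table names; both sides share the field schema and the PARTITIONED BY clause):

CREATE TABLE t1 (a int) PARTITIONED BY (d string);
CREATE TABLE t2 (a int) PARTITIONED BY (d string);

-- Moves the data of partition d='1' from t2 into t1;
-- t1 must not already contain that partition.
ALTER TABLE t1 EXCHANGE PARTITION (d = '1') WITH TABLE t2;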

You basically need to have exactly the same schema on both source_table and destination_table.

Per your last edit, this is not the case.
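
One way to fix it (a sketch, untested against your cluster; the table name stg.inc_labels_neo4jdata_part is hypothetical) is to rebuild the staging table with dbsource as a partition column, reload it with dynamic partitioning, and then exchange from the rebuilt table:

-- Recreate the staging table with dbsource as a partition column
CREATE TABLE stg.inc_labels_neo4jdata_part (
  `dff_id` bigint,
  `dff_context_id` bigint,
  `descriptive_flexfield_name` string,
  `model_table_name` string)
PARTITIONED BY (`dbsource` string)
STORED AS ORC
TBLPROPERTIES ('orc.compress'='ZLIB');

-- Reload it from the old, unpartitioned table using dynamic partitioning
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE stg.inc_labels_neo4jdata_part PARTITION (dbsource)
SELECT dff_id, dff_context_id, descriptive_flexfield_name,
       model_table_name, dbsource
FROM stg.inc_labels_neo4jdata;

-- Both sides now share the schema and the partitioning, so the exchange can run:
ALTER TABLE base.dummy EXCHANGE PARTITION (dbsource = 'NEO4J')
  WITH TABLE stg.inc_labels_neo4jdata_part;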
