
I need to make a backup of a partitioned table (Hive)

I need to back up data from a partitioned table that has over 500 partitions. The table is partitioned by date_part, for example "date_part = 20221101", "date_part = 20221102", and so on. I need to take the 30 partitions from 20221101 to 20221130 and copy them into a new backup table.

If I do something like this:

create table <backup_table> as
select * from <data_table> where date_part between 20221101 and 20221130

the result is a non-partitioned <backup_table>. I don't know whether that is a good approach, but I guess a partitioned <backup_table> would be better.

If I try this:

create table <backup_table> like <data_table>;
insert overwrite table <backup_table> partition (`date_part`)
select * from <data_table> where date_part between 20221101 and 20221130;

I get an error saying that I need to specify partition columns...

If I go another way:

create table <backup_table> like <data_table>;
insert overwrite table <backup_table> partition (`date_part`)
select field1, field2...,
date_part
from <data_table> where date_part between 20221101 and 20221130;

I get other errors such as "error running query" or "...nonstrict mode..." or something else. I've tried a lot of Hive settings, but it still doesn't work :(

That's why I need your help to do this correctly.

Enable dynamic partitioning and copy the data.

SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
SET hive.mapred.mode = nonstrict;
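
Putting these settings together with the statements from the question, the whole copy might look like the sketch below. It is only an illustration under some assumptions: data_table and backup_table are placeholder names standing in for <data_table> and <backup_table>, the backup table does not exist yet, and the BETWEEN predicate is taken from the question as-is (if date_part is a string column, quoting the bounds may be needed).

```sql
-- Placeholder names: data_table / backup_table stand in for
-- <data_table> / <backup_table> from the question.

-- Allow fully dynamic partition inserts (no static partition value required).
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
SET hive.mapred.mode = nonstrict;

-- Create an empty table with the same columns and the same partitioning.
CREATE TABLE IF NOT EXISTS backup_table LIKE data_table;

-- SELECT * on a partitioned table returns the partition column last,
-- which is the position a dynamic partition insert expects it in.
INSERT OVERWRITE TABLE backup_table PARTITION (date_part)
SELECT *
FROM data_table
WHERE date_part BETWEEN 20221101 AND 20221130;

-- Check that the 30 expected partitions were created.
SHOW PARTITIONS backup_table;
```

If the columns are listed explicitly instead of using SELECT *, date_part has to stay the last column in the select list, as in the third attempt in the question. The SET statements apply to the current session only, so they need to run in the same session as the INSERT.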

