
How to partition a non-partitioned table on Hive?

Given a table with 360 days of data, we want to partition it by date to improve performance. Do we need to run the following INSERT ... SELECT command once for each date? Is there a more efficient way to do this?

INSERT INTO TABLE <new_table> Partition (dt='2015-07-01')
SELECT * from <table> WHERE dt='2015-07-01'

If your new table is partitioned by dt (date), you should use dynamic partitioning. You don't need to specify the specific partition (in this case, the date). Hive detects all the distinct dates in the data and creates the partitions automatically.

Remember to set these flags:

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
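
With about 360 daily partitions, the dynamic-partition limits may also need attention. As far as I know, Hive's defaults allow 1000 dynamic partitions per statement but only 100 per mapper or reducer, so a single insert that creates 360 partitions could fail on the per-node limit. A minimal sketch that raises both; the values are illustrative assumptions, not recommendations from the original answer:

```sql
-- Illustrative values only; size them to the number of distinct dt values.
-- Upper bound on dynamic partitions created by one statement
set hive.exec.max.dynamic.partitions=1000;
-- Upper bound on dynamic partitions created by a single mapper/reducer
set hive.exec.max.dynamic.partitions.pernode=400;
```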

First create your table:

create table db.my_table(column1 int, column2 string,
                         -- ...
)
comment 'I like partitioned tables'
partitioned by(dt string)
location '/path/to/file';

Now you can load the data into dt partitions:

insert overwrite table db.my_table partition (dt) select * from other_table;
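
One thing to watch: Hive matches dynamic partition columns by position, not by name, so dt has to be the last column the SELECT returns. If dt is not the last column of other_table, listing the columns explicitly is safer than select *. A sketch, assuming other_table has exactly the columns column1, column2 and dt (the real column list is elided in the table definition above, so these names are placeholders):

```sql
-- Assumption: other_table(column1, column2, dt); substitute the real columns.
insert overwrite table db.my_table partition (dt)
select column1,
       column2,
       dt            -- partition column must come last
from other_table;

-- Check which partitions were created
show partitions db.my_table;
```

This replaces the 360 per-date statements with a single pass over the source table.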
