
Hive - partition by year

I am partitioning a table by year in Hive. I have created a script:

DROP TABLE movies_byYear;

CREATE TABLE movies_byYear (title STRING, full_name STRING, ep_name STRING, type STRING, ep_num STRING, suspended BOOLEAN) PARTITIONED BY (year INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE;

INSERT OVERWRITE TABLE movies_byYear PARTITION (year='2013') SELECT title, full_name, ep_name, type, ep_num, suspended FROM movies WHERE year='2013';

However, when using: SELECT COUNT(*) FROM movies WHERE year='2013';

I do not get only the movies from 2013 back; instead I get all movies back.

Is it also possible to let Hive decide where to partition?

I really appreciate your answer!

UPDATE

When adding year to the SELECT list I get:

INSERT OVERWRITE TABLE movies_byYear PARTITION (year=2013) SELECT title, full_name, ep_name, type, ep_num, suspended, year FROM movies WHERE year=2013;

FAILED: SemanticException [Error 10044]: Line 1:23 Cannot insert into target table because column number/types are different '2013': Table insclause-0 has 6 columns, but query has 7 columns.

When inserting, you insert:

SELECT title, full_name, ep_name, type, ep_num, suspended

Add year to that... Currently your year field in movies_byYear is null...

When you specify partition by year in your CREATE TABLE statement in Hive, year will be a column in the table!
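To check that year really is a column of movies_byYear and to see which partitions exist, something like the following can be run after the insert. This is only a sketch using the table name from the question; note that the count is taken from the partitioned table, not from the source table movies.

```sql
-- The partition column is listed after the regular columns.
DESCRIBE movies_byYear;

-- Lists the existing partitions, one per line, in the form year=<value>.
SHOW PARTITIONS movies_byYear;

-- Filtering on the partition column of the partitioned table
-- lets Hive read only the matching partition.
SELECT COUNT(*) FROM movies_byYear WHERE year = 2013;
```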

UPDATE

Replace this

INSERT OVERWRITE TABLE movies_byYear PARTITION (year='2013') SELECT title, full_name, ep_name, type, ep_num, suspended FROM movies WHERE year='2013';

with this:

INSERT OVERWRITE TABLE movies_byYear PARTITION (year=2013) SELECT title, full_name, ep_name, type, ep_num, suspended FROM movies WHERE year='2013';

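Namely, remove the single quotes around the year value in the PARTITION clause. With a static partition like this, year must not appear in the SELECT list, which is why the 7-column query above fails.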
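As for letting Hive decide where to partition: Hive supports dynamic partition inserts, where the partition value is taken from the query result instead of being fixed in the PARTITION clause. A minimal sketch, assuming the source table movies has a year column whose values convert to INT (the WHERE clause in the question suggests it does):

```sql
-- Allow dynamic partitioning; nonstrict mode lets every partition column be dynamic.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- No value is given for year in the PARTITION clause, so Hive creates
-- one partition per distinct year it finds in the query result.
-- The dynamic partition column must come last in the SELECT list.
INSERT OVERWRITE TABLE movies_byYear PARTITION (year)
SELECT title, full_name, ep_name, type, ep_num, suspended, year
FROM movies;
```

Here the SELECT has 7 columns on purpose: the last one supplies the partition value, unlike the static PARTITION (year=2013) form, where the SELECT must list only the 6 regular columns.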
