
Partitions by Range with two columns in Kudu table

I'm trying to create a Kudu table partitioned by hash and by range on two columns (year, month). My problem is that I want to define half-yearly (six-month) range partitions without adding more columns to the table.

In the link:

https://kudu.apache.org/docs/kudu_impala_integration.html

In section Specifying Tablet Partitioning

They propose range partitioning on a single column, but further down we can read:

If you have multiple primary key columns, you can specify partition bounds using tuple syntax: ('va',1), ('ab',2). The expression must be valid JSON.

So I tried the following query:

CREATE TABLE pruebas.partwithrang ( 
   year int COMMENT 'año', 
   month int COMMENT 'mes', 
   day int COMMENT 'dia', 
   id string COMMENT 'id',   
   name string COMMENT 'nombre',       
   PRIMARY KEY (year, month, day, id))       
   PARTITION BY HASH (id) PARTITIONS 3,                              
   RANGE(year, month) ( 
   PARTITION (2020, 1) <= VALUES <= (2020, 6), 
   PARTITION (2020, 7) <= VALUES <= (2020, 12), 
   PARTITION (2021, 1) <= VALUES <= (2021, 6), 
   PARTITION (2021, 7) <= VALUES <= (2021, 12) 
) 
COMMENT "Probando particion por rango" 
STORED AS KUDU tblproperties ('kudu.master_addresses'='localhost:7051', 'kudu.num_tablet_replicas'='1')

But I received the following error:

ERROR: AnalysisException: Syntax error in line 10:
PARTITION (2020, 1) <= VALUES <= (2020, 7),
' ^
Encountered: COMMA
Expected: AND, BETWEEN, DIV, ILIKE, IN, IREGEXP, IS, LIKE, NOT, OR, REGEXP, RLIKE CAUSED BY: Exception: Syntax error

I couldn't find any information about this kind of range partition. Could you help me, please?

I also reviewed this link: https://docs.cloudera.com/documentation/enterprise/5-12-x/topics/impala_create_table.html

In section kudu_partition_clause

I'm not sure, but as I understand it, only these two ways of defining a range partition are possible:

PARTITION constant_expression range_comparison_operator VALUES range_comparison_operator constant_expression | PARTITION VALUE = constant_expression_or_tuple

So, is the partitioning scheme I'm proposing possible?

Thank you!

The problem is a bug in Impala, https://issues.apache.org/jira/browse/IMPALA-6929, which means you can't use '<=' or '<' with tuple bounds in a range partition.

A workaround is to use '=' instead of '<=' or '<', for example: PARTITION VALUE = (2020, 12),

The problem is fixed in CDH 6.x (Apache Impala 3.0.0).
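
For illustration, here is a rough sketch of how the table from the question might look with the '=' workaround applied. It reuses the table name, columns, and properties from the question and has not been run against a cluster. Note that each PARTITION VALUE clause covers exactly one (year, month) pair, so this yields one range partition per month rather than the six-month partitions originally asked for. Only 2020 is listed here; 2021 would follow the same pattern.

```sql
-- Sketch of the '=' workaround, assuming the same table definition as in the question.
-- Each clause matches a single (year, month) tuple, so partitions are monthly, not half-yearly.
CREATE TABLE pruebas.partwithrang (
   year int COMMENT 'año',
   month int COMMENT 'mes',
   day int COMMENT 'dia',
   id string COMMENT 'id',
   name string COMMENT 'nombre',
   PRIMARY KEY (year, month, day, id))
   PARTITION BY HASH (id) PARTITIONS 3,
   RANGE (year, month) (
   PARTITION VALUE = (2020, 1),
   PARTITION VALUE = (2020, 2),
   PARTITION VALUE = (2020, 3),
   PARTITION VALUE = (2020, 4),
   PARTITION VALUE = (2020, 5),
   PARTITION VALUE = (2020, 6),
   PARTITION VALUE = (2020, 7),
   PARTITION VALUE = (2020, 8),
   PARTITION VALUE = (2020, 9),
   PARTITION VALUE = (2020, 10),
   PARTITION VALUE = (2020, 11),
   PARTITION VALUE = (2020, 12)
)
COMMENT "Probando particion por rango"
STORED AS KUDU tblproperties ('kudu.master_addresses'='localhost:7051', 'kudu.num_tablet_replicas'='1')
```

With 3 hash buckets, every range partition listed becomes 3 tablets, so the 12 clauses above would produce 36 tablets. That is worth keeping in mind before adding a clause for every month of several years.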
