ORC file format with Impala

Question

Can the ORC file format be used in Impala? Also, how can I access an ORC table stored in the Hive metastore from Impala? I found the documentation link below, but it doesn't contain a list of restricted file formats or any mention that ORC is not supported in Impala: http://www.cloudera.com/documentation/enterprise/latest/topics/impala_file_formats.html

Answer 1

ORC is not supported in Impala. Rather, Apache Parquet is the recommended format for best performance.

Answer 2

Impala cannot read the ORC file format. If possible, I would suggest migrating your ORC files to Parquet with Hive. The advantage is that you pay the cost of setting up the MapReduce tasks just once.

If your ORC table is nameoforctable, then a very basic query looks like:

CREATE TABLE nameoforctable_parquet
LIKE nameoforctable
STORED AS PARQUET
LOCATION '/your/hdfs/location';

INSERT INTO nameoforctable_parquet
SELECT * FROM nameoforctable;
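
Because the new table is created through Hive, Impala will not see it until its metadata is reloaded. A minimal sketch of the follow-up step in impala-shell, assuming the table name from the example above and that the table lives in the current database:

```sql
-- Run in impala-shell after the Hive INSERT has finished.
-- Make Impala pick up the table that was created outside Impala (through Hive).
INVALIDATE METADATA nameoforctable_parquet;

-- Sanity check: the count should match the source ORC table as counted in Hive.
SELECT COUNT(*) FROM nameoforctable_parquet;
```

If more rows are later added to the same table through Hive, REFRESH nameoforctable_parquet; should be enough, since only the data files change and not the table definition.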

Answer 3

Even though ORC is the only format that supports the ACID feature in Hive and has demonstrated better query performance and compression ratios in some benchmarking studies, Impala doesn't support the ORC file format because it was created by Hortonworks, one of Cloudera's major competitors. Conversely, the Hive version on the Hortonworks Data Platform (HDP) does not support Parquet for the same reason.

Answer 4

Use the following command to create an ORC-format table in Impala:

create table orc_table_name_1 (x INT, y STRING) STORED AS orc;

4 answers

solution1
3 ACCEPTED 2016-05-11 17:44:55

solution2
0 2016-06-09 22:37:44

solution3
0 2017-04-08 04:02:43

solution4
0 2019-01-10 09:59:37
