How to query data from multiple Hive tables having a similar naming pattern?

This is my maiden voyage into Hive. I have multiple Hive tables that are snapshots, with names like the following:

revenue_20110131
revenue_20110228
revenue_20110331

purchases_qrt1
purchases_qrt2
purchases_qrt3
purchases_qrt4

I have a lot of such snapshot tables. Now I need to build a script that takes part of a table name as a parameter, reads the records from all similarly named tables, and exports the combined data from those tables into a single ORC file.

How can I do this in Hive? I have no idea where to start, as I've never worked with Hive before. Can someone please help me? Thanks in advance, guys.

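If the table locations share a common parent directory, you can create a new table whose location is that parent directory and read all of them in a single SELECT. This assumes the tables have the same columns and the same file format, because the new table applies one schema to every file it finds.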

-- external, so that dropping new_tbl leaves the underlying files in place
create external table new_tbl
...
location 'upper common directory path here';

Then apply these settings before running the SELECT, so that Hive reads the subdirectories recursively:

set hive.mapred.supports.subdirectories=TRUE;
set mapred.input.dir.recursive=TRUE;
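
Since the question asks for a script that takes part of a table name as a parameter, here is a minimal Python sketch that only generates the HiveQL for the approach above; it does not run Hive itself. Several things here are assumptions rather than part of the original answer: the function name `build_export_script`, the table name `all_snapshots`, the example paths and columns, the use of the `INPUT__FILE__NAME` virtual column with `instr` to keep only files under directories matching the prefix, and `INSERT OVERWRITE DIRECTORY ... STORED AS ORC` for the export. It also assumes the snapshot tables use Hive's default text format; otherwise the `CREATE` statement would need a matching `ROW FORMAT` / `STORED AS` clause.

```python
import re


def build_export_script(prefix, parent_dir, output_dir, columns_ddl,
                        table="all_snapshots"):
    """Return HiveQL that reads snapshot files under parent_dir whose path
    contains '/<prefix>' and writes the matching rows out as ORC.

    All names and paths are supplied by the caller; nothing here is taken
    from a real cluster.
    """
    # Hive table names consist of letters, digits and underscores; rejecting
    # anything else also keeps quotes out of the generated SQL.
    if not re.fullmatch(r"[A-Za-z0-9_]+", prefix):
        raise ValueError(f"unexpected characters in table prefix: {prefix!r}")

    return "\n".join([
        # External table over the shared parent directory (the answer's idea).
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} ({columns_ddl})",
        f"LOCATION '{parent_dir}';",
        "",
        # Settings from the answer: read subdirectories recursively.
        "SET hive.mapred.supports.subdirectories=true;",
        "SET mapred.input.dir.recursive=true;",
        "",
        # Assumption: filter on the file path so only tables whose directory
        # name starts with the prefix contribute rows.
        f"INSERT OVERWRITE DIRECTORY '{output_dir}'",
        "STORED AS ORC",
        f"SELECT * FROM {table}",
        f"WHERE instr(INPUT__FILE__NAME, '/{prefix}') > 0;",
    ])


if __name__ == "__main__":
    # Hypothetical paths and columns, for illustration only.
    print(build_export_script(
        prefix="revenue_",
        parent_dir="/user/hive/warehouse/mydb.db",
        output_dir="/tmp/revenue_orc",
        columns_ddl="id INT, amount DOUBLE",
    ))
```

The printed text could be saved to a file and run with `hive -f`. Note that Hive writes one output file per task into the target directory, so getting literally a single ORC file may need an extra merge step or a query that forces a single reducer.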
