
Hive External Table - Where is data location meta data stored?

I am using Hive external tables on Amazon EMR. Often these tables are partitioned, with each partition pointing to a different bucket in S3. I am using MySQL for Hive metadata storage.

I want to be able to see the location/bucket on S3 that each partition is pointing to. I have looked into the metadata tables in MySQL. I can see partition information there, but nothing that indicates the actual location of the data.

Is this data available in MySQL, or can it be obtained by Hive commands?

The following Hive command can be used to get the location:

hive> show create table <TableName>;

Look for the LOCATION line in the output of that command.

For an external partitioned table, each partition has a location, rather than the table itself having a location. You need to use something like

show partitions employees

to get the partition list then

describe extended employees partition (year='2016', month='05', day='25')

to see the location of a particular partition.

Other commands, such as show create table employees, may not give useful info about the data location, because they report only the table-level path and not the path of each partition:

LOCATION 'hdfs://nameservice1/user/hive/warehouse/something.db/employees'

describe extended table_name

This prints all details about the table, including (tableName:ca_data, dbName:suman, owner:suman, createTime:1494368591, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:) and many more; the data path is the location field inside the StorageDescriptor.

Another command:

desc formatted table_name;

If you want to see the actual data storage location of a Hive table, there are several ways.

1) hive> show create table <TableName>; prints the table's DDL along with the path where the data is actually stored.

2) describe extended table_name or describe formatted table_name gives you the location, owner, comments, table type and other details.

3) The commands above only help when you want to check the location of a single table; they won't help if you want to check the location of multiple tables in multiple databases.

In that case you can query the Hive metastore and get the locations of multiple tables with a single query.
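A minimal sketch of such a metastore query, run directly against the MySQL database that backs the metastore. It assumes the standard metastore schema (tables DBS, TBLS, SDS and PARTITIONS); default and employees are placeholder names, and table or column names can differ between Hive versions, so treat it as a starting point rather than a definitive query:

```sql
-- Location of every external table in every database.
SELECT d.NAME     AS db_name,
       t.TBL_NAME AS table_name,
       s.LOCATION AS location
FROM TBLS t
JOIN DBS d ON d.DB_ID = t.DB_ID
JOIN SDS s ON s.SD_ID = t.SD_ID
WHERE t.TBL_TYPE = 'EXTERNAL_TABLE'
ORDER BY d.NAME, t.TBL_NAME;

-- Location of each partition of one table
-- ('default' and 'employees' are placeholder names).
SELECT p.PART_NAME AS partition_name,
       s.LOCATION  AS location
FROM PARTITIONS p
JOIN TBLS t ON t.TBL_ID = p.TBL_ID
JOIN DBS d ON d.DB_ID = t.DB_ID
JOIN SDS s ON s.SD_ID = p.SD_ID
WHERE d.NAME = 'default'
  AND t.TBL_NAME = 'employees'
ORDER BY p.PART_NAME;
```

The second query is the one closest to the original question: each partition references its own row in SDS, so the per-partition S3 path should appear in the LOCATION column.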

I saw a very good article about how to get the HDFS paths of all Hive tables from the metastore; please read it: https://askdoubts.com/question/how-to-find-out-list-of-all-hive-external-tables-and-hdfs-paths-from-hive-metastore/#comment-19

Thanks, Mahesh

As h4ck3r mentioned, you could use the "Show create table" command to look for location information.

To see partition-specific information, use SHOW TABLE EXTENDED with a partition specification:

SHOW TABLE EXTENDED will list information for all tables matching the given regular expression. Users cannot use regular expression for table name if a partition specification is present. This command's output includes basic table information and file system information like totalNumberFiles, totalFileSize, maxFileSize, minFileSize, lastAccessTime, and lastUpdateTime. If partition is present, it will output the given partition's file system information instead of table's file system information.
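As a hedged illustration of both forms, reusing the employees table and the partition values from the earlier answer (example names only, not a real schema):

```sql
-- File system information for the whole table.
SHOW TABLE EXTENDED LIKE 'employees';

-- File system information for a single partition; the table name must be
-- given exactly (no wildcard) when a PARTITION clause is present.
SHOW TABLE EXTENDED LIKE 'employees' PARTITION (year='2016', month='05', day='25');
```

The basic table information in the output is also expected to contain a location: line, but it is worth confirming that on your own Hive version.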
