
Hive Table is MANAGED or EXTERNAL - issue post table type conversion

I have a Hive table named ABC in the XYZ database.

When I run describe formatted XYZ.ABC; from Hue, I get the following:

[Image: managed table displayed as external]

That is:

Table Type: MANAGED_TABLE
Table Parameters: EXTERNAL True

So is this actually an external or a managed (internal) Hive table?

This is treated as an EXTERNAL table: dropping the table keeps the underlying HDFS data. The table type is shown as MANAGED_TABLE because the parameter EXTERNAL is set to True instead of TRUE.
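The mismatch described here can be modelled with two string comparisons. This is only a sketch: the class and method names are hypothetical and are not Hive's actual API. It assumes the displayed table type comes from a case-sensitive comparison, while the drop path compares case-insensitively, which would be consistent with the behavior observed above.

```java
// Hypothetical model of the two checks; names are illustrative, not Hive's API.
final class ExternalFlagMismatch {
    // Assumed display check: only the exact string "TRUE" counts as external.
    static boolean shownAsExternal(String externalParam) {
        return "TRUE".equals(externalParam);
    }

    // Assumed drop check: "True", "true" and "TRUE" all count as external.
    static boolean keptOnDrop(String externalParam) {
        return "TRUE".equalsIgnoreCase(externalParam);
    }

    public static void main(String[] args) {
        // Print how each spelling of the property value fares under both checks.
        for (String value : new String[] {"True", "TRUE"}) {
            System.out.println(value
                + " -> shown as external: " + shownAsExternal(value)
                + ", data kept on drop: " + keptOnDrop(value));
        }
    }
}
```

Under these assumptions the value True satisfies only the second check, so the table behaves as external while still being labelled MANAGED_TABLE.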

To fix this metadata, you can run this query:

hive> ALTER TABLE XYZ.ABC SET TBLPROPERTIES('EXTERNAL'='TRUE');

Some details:

The table XYZ.ABC must have been created with a query of this kind:

hive> CREATE TABLE XYZ.ABC
<additional table definition details>
TBLPROPERTIES (
  'EXTERNAL'='True');

Describing this table gives:

hive> desc formatted XYZ.ABC;
:
Location:               hdfs://<location_of_data>
Table Type:             MANAGED_TABLE
:
Table Parameters:
  EXTERNAL              True

Dropping this table keeps the data referenced by Location in the describe output:

 hive> drop table XYZ.ABC;
 # does not drop table data in HDFS

The Table Type still shows as MANAGED_TABLE, which is confusing.

Setting the value of EXTERNAL to TRUE fixes this:

hive> ALTER TABLE XYZ.ABC SET TBLPROPERTIES('EXTERNAL'='TRUE');

Now, describing the table shows the expected type:

hive> desc formatted XYZ.ABC;
:
Location:               hdfs://<location_of_data>
Table Type:             EXTERNAL_TABLE
:
Table Parameters:
    EXTERNAL                TRUE

Example:

Let's create a sample MANAGED table:

CREATE TABLE TEST_TBL(abc int, xyz string);
INSERT INTO TABLE test_tbl values(1, 'abc'),(2, 'xyz');
DESCRIBE FORMATTED test_tbl;

[Image: DESCRIBE FORMATTED output showing MANAGED_TABLE]

Changing the type to EXTERNAL (the wrong way, using True instead of TRUE):

ALTER TABLE test_tbl SET TBLPROPERTIES('EXTERNAL'='True');

This gives:

[Image: external table displayed incorrectly]

Now let's DROP the table:

DROP TABLE test_tbl;

The result:

The table is dropped, but the data on HDFS isn't. This is correct external-table behavior!

If we re-create the table, we can see the data still exists:

CREATE TABLE test_tbl(abc int, xyz string);
SELECT * FROM test_tbl;

Result:

[Image: output of SELECT *]

The describe output wrongly shows it as MANAGED_TABLE along with EXTERNAL True because of:

.equals check in the meta

Hive Issue JIRA: HIVE-20057

Proposed fix: Use case-insensitive equals
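As a rough illustration of the proposed fix, the displayed type could be derived with a case-insensitive comparison. This is a sketch under assumptions, not the actual Hive patch: TableTypeResolver and effectiveTableType are hypothetical names, and the table parameters are modelled as a plain Map.

```java
import java.util.Map;

// Hypothetical helper illustrating a case-insensitive check; not Hive's actual code.
final class TableTypeResolver {
    // Returns the table type to display, given the stored type and table parameters.
    static String effectiveTableType(String storedType, Map<String, String> params) {
        String external = params.get("EXTERNAL"); // null when the property is absent
        // equalsIgnoreCase accepts "True", "true" and "TRUE", and is false for null.
        if ("MANAGED_TABLE".equals(storedType) && "TRUE".equalsIgnoreCase(external)) {
            return "EXTERNAL_TABLE";
        }
        return storedType;
    }

    public static void main(String[] args) {
        // A table whose EXTERNAL property was set with mixed case.
        System.out.println(effectiveTableType("MANAGED_TABLE", Map.of("EXTERNAL", "True")));
        // A table with no EXTERNAL property at all.
        System.out.println(effectiveTableType("MANAGED_TABLE", Map.of()));
    }
}
```

With a comparison like this, the spelling of the property value would no longer affect the reported type, so the describe output and the drop behavior would agree.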
