Hive Execution Error
I am new to Avro and Hive, and while learning them I ran into some confusion about using
tblproperties('avro.schema.url'='somewhereinHDFS/categories.avsc').
If I run a create command like this:
create table categories (id Int , dep_Id Int , name String)
stored as avrofile
tblproperties('avro.schema.url'=
'hdfs://quickstart.cloudera/user/cloudera/data/retail_avro_avsc/categories.avsc')
why do I have to list id Int, dep_Id Int in the command above when the avsc file I am supplying already contains the complete schema? When I leave the columns out, as in the statement below, it fails with the error that follows:
create table categories stored as avrofile
tblproperties('avro/schema.url'=
'hdfs://quickstart.cloudera/user/cloudera/data/retail_avro_avsc/categories.avsc')
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException
Encountered AvroSerdeException determining schema.
Returning signal schema to indicate problem:
Neither avro.schema.literal nor avro.schema.url specified,
can't determine table schema)
Why does Hive need the schema to be specified in the statement even though the avsc file is present and already contains it?
Can you try it this way?
CREATE TABLE categories
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES (
'avro.schema.url'='http://schema.avsc');
More info here: https://cwiki.apache.org/confluence/display/Hive/AvroSerDe
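One detail worth checking in the question's failing statement: the property key is written as 'avro/schema.url' (with a slash), whereas the AvroSerDe looks for 'avro.schema.url' (with a dot). That would be consistent with the message "Neither avro.schema.literal nor avro.schema.url specified". Below is a minimal sketch of the column-less form with the key spelled with a dot. It reuses the path from the question and assumes that the .avsc file exists there, that Hive can read it, and that the Hive version supports STORED AS AVRO (0.14 or later).

```sql
-- Columns are omitted; the AvroSerDe derives them from the schema file.
-- Note the dot (not a slash) in 'avro.schema.url'.
CREATE TABLE categories
STORED AS AVRO
TBLPROPERTIES (
  'avro.schema.url'='hdfs://quickstart.cloudera/user/cloudera/data/retail_avro_avsc/categories.avsc');
```

If this form succeeds, the explicit column list in the first statement is not required; the columns come from the schema file.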
Creating an external Hive table orders_sqoop from a given Avro schema file and Avro data file:
hive> create external table if not exists orders_sqoop
stored as avro
location '/user/hive/warehouse/retail_stage.db/orders'
tblproperties('avro.schema.url'='/user/hive/warehouse/retail_stage.db/orders_schema/orders.avsc');
The above create table command executes successfully and creates the orders_sqoop table.
Validate the table structure below:
hive> show create table orders_sqoop;
OK
CREATE EXTERNAL TABLE `orders_sqoop`(
`order_id` int COMMENT '',
`order_date` bigint COMMENT '',
`order_customer_id` int COMMENT '',
`order_status` string COMMENT '')
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION
'hdfs://quickstart.cloudera:8020/user/hive/warehouse/retail_stage.db/orders'
TBLPROPERTIES (
'COLUMN_STATS_ACCURATE'='false',
'avro.schema.url'='/user/hive/warehouse/retail_stage.db/orders_schema/orders.avsc',
'numFiles'='2',
'numRows'='-1',
'rawDataSize'='-1',
'totalSize'='660906',
'transient_lastDdlTime'='1563093902')
Time taken: 0.125 seconds, Fetched: 21 row(s)
The table above was created as expected.
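As a further sanity check beyond show create table, the derived columns can be listed and a few rows read back. This is only a sketch: it assumes the Avro data files under the table location were written with the same schema as orders.avsc, and it uses the column names shown in the output above.

```sql
-- List the columns Hive derived from orders.avsc.
DESCRIBE orders_sqoop;

-- Read a few rows to confirm the data files deserialize with that schema.
SELECT order_id, order_date, order_customer_id, order_status
FROM orders_sqoop
LIMIT 5;
```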