[英]Hive AvroSerde on cluster exception
我有AVRO文件,我需要將該文件映射到HIVE表。 最好的解決方案是使用AvroSerDe。 所以我在集群上使用了下一個命令:
- CREATE EXTERNAL TABLE my_db.new_table
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES (
'avro.schema.url'='hdfs:///folder/mySchema.avsc');
- LOAD DATA inpath '/folder/myFile.avro' OVERWRITE INTO TABLE my_db.new_table;
所有這些命令均已成功執行,但是當我嘗試使用配置單元查詢語言獲取數據時,在Hadoop map任務上卻有例外:
SELECT
user.name as u_name,
FROM my_db.new_table
LATERAL VIEW explode(users) user_table as user;
例外:
2015-05-27 13:22:24,838 DEBUG [main] org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils: Failed to open file system for uri hdfs:///folder/mySchema.avsc assuming it is not a FileSystem url
java.io.IOException: Incomplete HDFS URI, no host: hdfs:///folder/mySchema.avsc
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:142)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2596)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.getSchemaFromFS(AvroSerdeUtils.java:149)
at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:110)
at org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.getSchema(AvroGenericRecordReader.java:112)
at org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.<init>(AvroGenericRecordReader.java:70)
at org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat.getRecordReader(AvroContainerInputFormat.java:51)
at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:65)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:298)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:259)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:386)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:652)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:169)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
蜂巢版本:0.14
這種例外的原因是什么?
謝謝!
問題出在
TBLPROPERTIES (
'avro.schema.url'='hdfs:///folder/mySchema.avsc');
avro.schema.url需要在URL中包含MASTER_NODE_NAME +端口。 所以正確的版本是:
TBLPROPERTIES (
'avro.schema.url'='hdfs://MASTER_NODE_NAME:port/folder/mySchema.avsc');
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.