
Unable to load data from Apache Hive to ElasticSearch

I am using CDH 5.5 and ElasticSearch 2.4.1. I have created a Hive table and am trying to push its data to ElasticSearch using the query below.

CREATE EXTERNAL TABLE test1_es(
  id string,
  timestamp string, 
  dept string)
ROW FORMAT SERDE 'org.elasticsearch.hadoop.hive.EsSerDe'  
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'  
LOCATION
  'hdfs://quickstart.cloudera:8020/user/cloudera/elasticsearch/test1_es'
TBLPROPERTIES (  'es.nodes'='localhost', 
'es.resource'='sample/test1',
'es.mapping.names' = 'timestamp:@timestamp',
'es.port' = '9200', 
'es.input.json' = 'false', 
'es.write.operation' = 'index', 
'es.index.auto.create' = 'yes'
);
INSERT INTO TABLE default.test1_es select id,timestamp,dept from test1_hive;

I'm getting the following error at the Job Tracker URL:

Failed while trying to construct the redirect url to the log server. Log Server url may not be configured.
java.lang.Exception: Unknown container. Container either has not started or has already completed or doesn't belong to this node at all.

The Hive terminal then reports "FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask".

I tried all the steps mentioned in forums: including /usr/lib/hive/bin/elasticsearch-hadoop-2.0.2.jar in hive-site.xml, adding the ES-Hadoop jar to HIVE_AUX_JARS_PATH, and copying the YARN jar to /usr/lib/hadoop/elasticsearch-yarn-2.1.0.Beta3.jar. Please suggest how to fix the error.


Thanks in advance, Sreenath

I'm dealing with the same problem, and I found that the execution error thrown by Hive is caused by a timestamp field of string type that could not be parsed. I wonder whether string-typed timestamp fields can be mapped properly to ES; if they cannot, this could be the root cause.
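If the string timestamp is indeed the problem, one thing to try is reformatting it on the way in. The following is only a sketch: it assumes that test1_hive stores the timestamp as text like '2016-10-05 12:30:00' (the real format is not shown in the question) and that the target field expects an ISO 8601 style date. Both patterns are illustrative and need adjusting to the actual data.

```sql
-- Sketch only: assumes test1_hive.`timestamp` holds strings such as
-- '2016-10-05 12:30:00'; change the input pattern to match the real data.
INSERT INTO TABLE default.test1_es
SELECT
  id,
  -- Parse the string into epoch seconds, then re-emit it in ISO 8601 style.
  from_unixtime(
    unix_timestamp(`timestamp`, 'yyyy-MM-dd HH:mm:ss'),
    "yyyy-MM-dd'T'HH:mm:ss"
  ) AS `timestamp`,
  dept
FROM test1_hive;
```

Rows whose timestamp does not match the input pattern come out as NULL, so it may be worth running the SELECT on its own first and checking for NULL values.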

BTW, you should check the Hadoop MR log for more details about the error.

CREATE EXTERNAL TABLE test1_es(
  id string,
  timestamp string,
  dept string)
ROW FORMAT SERDE 'org.elasticsearch.hadoop.hive.EsSerDe'
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES ...........

You don't need the LOCATION clause.
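Spelled out with the properties from the question, a definition without LOCATION might look like the sketch below. The property values are copied from the question and are not confirmed to be correct for any particular cluster. Dropping ROW FORMAT SERDE as well is an extra assumption here, based on the ES-Hadoop documentation examples, which declare only STORED BY. The backticks around timestamp are added defensively, because some Hive versions treat it as a reserved word.

```sql
-- Sketch only: same columns and properties as the question, minus
-- LOCATION (and, as an additional assumption, minus ROW FORMAT SERDE).
CREATE EXTERNAL TABLE test1_es (
  id          STRING,
  `timestamp` STRING,
  dept        STRING)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
  'es.nodes'             = 'localhost',
  'es.port'              = '9200',
  'es.resource'          = 'sample/test1',
  'es.mapping.names'     = 'timestamp:@timestamp',
  'es.index.auto.create' = 'yes'
);
```

Because the data lives in ElasticSearch and not in HDFS, the storage handler has no use for an HDFS path, which is presumably why LOCATION can be left out.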
