
Hive query error with JSON

I'm creating a table using the Cloudera Twitter example. I've successfully created the table and loaded the data, but I'm running into a problem.

I can run select * from tweets; and it returns the data, but when I go further, for example with count(*), I get an error.

Here's the table definition:

ADD JAR /cdh-twitter-example/hive-serdes/target/hive-serdes-1.0-SNAPSHOT.jar;

CREATE EXTERNAL TABLE tweets (
  id BIGINT,
  created_at STRING,
  source STRING,
  favorited BOOLEAN,
  retweet_count INT,
  retweeted_status STRUCT<
    text:STRING,
    user:STRUCT>,
  entities STRUCT<
    urls:ARRAY>,
    user_mentions:ARRAY>,
    hashtags:ARRAY>>,
  text STRING,
  user STRUCT<
    screen_name:STRING,
    name:STRING,
    friends_count:INT,
    followers_count:INT,
    statuses_count:INT,
    verified:BOOLEAN,
    utc_offset:INT,
    time_zone:STRING>,
  in_reply_to_screen_name STRING
)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/user/flume/tweets';

Here's the error and stack trace:

hive> select count(*) from tweets;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=
In order to set a constant number of reducers:
  set mapred.reduce.tasks=
Starting Job = job_1402410026954_0004, Tracking URL = http://bigdatalite.localdomain:8088/proxy/application_1402410026954_0004/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1402410026954_0004
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2014-06-10 13:07:28,078 Stage-1 map = 0%, reduce = 0%
2014-06-10 13:07:39,983 Stage-1 map = 100%, reduce = 0%
2014-06-10 13:07:41,071 Stage-1 map = 0%, reduce = 0%
2014-06-10 13:08:18,527 Stage-1 map = 100%, reduce = 100%
Ended Job = job_1402410026954_0004 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1402410026954_0004_m_000000 (and more) from job job_1402410026954_0004

Task with the most failures(4):
-----
Task ID:
  task_1402410026954_0004_m_000000

tipid=task_1402410026954_0004_m_000000
-----
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
    ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
    ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
    ... 17 more
Caused by: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:134)
    ... 22 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: Class com.cloudera.hive.serde.JSONSerDe not found
    at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:314)
    at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:333)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:103)
    ... 22 more
Caused by: java.lang.ClassNotFoundException: Class com.cloudera.hive.serde.JSONSerDe not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1801)
    at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:284)
    ... 24 more

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 1 Reduce: 1 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

Any thoughts?

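Copying the required libraries into the Hadoop lib folder as well solved the problem. The ClassNotFoundException for com.cloudera.hive.serde.JSONSerDe is thrown inside the map task (the trace runs through YarnChild), which suggests the SerDe JAR was visible to the Hive client but not on the classpath of the MapReduce tasks.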
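The fix in the answer above can be sketched as a small shell function. This is only an illustration: the function name install_serde_jar is made up for this example, and both the JAR path and the lib directory are passed in by the caller rather than assumed, because the answer does not say which directory was used.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hive or Hadoop): copy a SerDe JAR into a
# Hadoop lib directory so that MapReduce task JVMs can load the class.
# Both paths are arguments; no particular installation layout is assumed.
install_serde_jar() {
    local jar="$1" lib_dir="$2"

    if [ ! -f "$jar" ]; then
        echo "JAR not found: $jar" >&2
        return 1
    fi
    if [ ! -d "$lib_dir" ]; then
        echo "lib directory not found: $lib_dir" >&2
        return 1
    fi

    cp "$jar" "$lib_dir/" || return 1

    # Succeed only if the copy is actually present in the target directory.
    [ -f "$lib_dir/$(basename "$jar")" ]
}
```

On the machine in the question, a call might look like install_serde_jar /cdh-twitter-example/hive-serdes/target/hive-serdes-1.0-SNAPSHOT.jar /usr/lib/hadoop/lib, where /usr/lib/hadoop/lib is a guess based on the /usr/lib/hadoop/bin/hadoop path in the log. On a multi-node cluster the JAR would presumably need to be in place on every node that runs tasks.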
