Hive query error with Json

I'm creating a table by following the Cloudera Twitter example. I've successfully created the table and can get the data, but I've run into a problem.

I can run select * from tweets; and the data is returned, but when I go further, for example with count(*), I get an error.

Here's the table created:

ADD JAR /cdh-twitter-example/hive-serdes/target/hive-serdes-1.0-SNAPSHOT.jar;
CREATE EXTERNAL TABLE tweets (
  id BIGINT,
  created_at STRING,
  source STRING,
  favorited BOOLEAN,
  retweet_count INT,
  retweeted_status STRUCT< text:STRING, user:STRUCT>,
  entities STRUCT< urls:ARRAY>, user_mentions:ARRAY>, hashtags:ARRAY>>,
  text STRING,
  user STRUCT<
    screen_name:STRING,
    name:STRING,
    friends_count:INT,
    followers_count:INT,
    statuses_count:INT,
    verified:BOOLEAN,
    utc_offset:INT,
    time_zone:STRING>,
  in_reply_to_screen_name STRING
)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/user/flume/tweets';

Here's the error and stack trace:

hive> select count(*) from tweets;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=
In order to set a constant number of reducers:
  set mapred.reduce.tasks=
Starting Job = job_1402410026954_0004, Tracking URL = http://bigdatalite.localdomain:8088/proxy/application_1402410026954_0004/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1402410026954_0004
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2014-06-10 13:07:28,078 Stage-1 map = 0%, reduce = 0%
2014-06-10 13:07:39,983 Stage-1 map = 100%, reduce = 0%
2014-06-10 13:07:41,071 Stage-1 map = 0%, reduce = 0%
2014-06-10 13:08:18,527 Stage-1 map = 100%, reduce = 100%
Ended Job = job_1402410026954_0004 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1402410026954_0004_m_000000 (and more) from job job_1402410026954_0004

Task with the most failures(4):
-----
Task ID: task_1402410026954_0004_m_000000

tipid=task_1402410026954_0004_m_000000
-----
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
    ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
    ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
    ... 17 more
Caused by: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:134)
    ... 22 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: Class com.cloudera.hive.serde.JSONSerDe not found
    at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:314)
    at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:333)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:103)
    ... 22 more
Caused by: java.lang.ClassNotFoundException: Class com.cloudera.hive.serde.JSONSerDe not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1801)
    at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:284)
    ... 24 more

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 1 Reduce: 1 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

Any thoughts?

Copying the required libraries into the Hadoop lib folder as well solved the problem.
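As a rough sketch of that fix: the stack trace shows the map task failing with ClassNotFoundException for com.cloudera.hive.serde.JSONSerDe, so the idea is to put the SerDe jar where the Hadoop task JVMs can load it. The helper name install_serde_jar is made up for illustration, and the default target directory /usr/lib/hadoop/lib is only an assumption based on the /usr/lib/hadoop prefix seen in the job output; the actual lib folder depends on the distribution. The jar path is the one from the ADD JAR statement in the question.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive or Hadoop): copy the SerDe jar
# into a Hadoop lib directory so that map tasks can find the class.
#   $1  jar to copy    (default: path from the ADD JAR statement above)
#   $2  lib directory  (default: ASSUMED location, adjust for your install)
install_serde_jar() {
    jar="${1:-/cdh-twitter-example/hive-serdes/target/hive-serdes-1.0-SNAPSHOT.jar}"
    lib_dir="${2:-/usr/lib/hadoop/lib}"

    # Fail early with a clear message instead of a confusing cp error.
    if [ ! -f "$jar" ]; then
        echo "jar not found: $jar" >&2
        return 1
    fi
    if [ ! -d "$lib_dir" ]; then
        echo "lib directory not found: $lib_dir" >&2
        return 1
    fi

    cp "$jar" "$lib_dir/" && echo "copied $(basename "$jar") to $lib_dir"
}

# Example call (may need root, depending on directory permissions):
#   install_serde_jar
```

On a multi-node cluster the jar would presumably have to be present on every node that runs map tasks, and any other libraries the SerDe depends on would need the same treatment.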
