简体   繁体   中英

PIG Script Error: java.lang.NoSuchMethodError: org.apache.thrift.protocol.TProtocol.getScheme

I am running a PIG script in mapreduce mode. The script reads RCFile (containing Thrift serialized data stored in GZIP compressed format), deserializes it using a UDF, extracts certain fields from the Thrift struct, and stores them.

Some of the mappers fail with following error:

2015-12-23 03:07:45,638 FATAL [Thread-5] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.NoSuchMethodError: org.apache.thrift.protocol.TProtocol.getScheme()Ljava/lang/Class;
at com.xxx.yyy.thrift.dto.LatLong.read(LatLong.java:553)
at com.twitter.elephantbird.util.ThriftUtils.readSingleFieldNoTag(ThriftUtils.java:318)
at com.twitter.elephantbird.util.ThriftUtils.readFieldNoTag(ThriftUtils.java:352)
at com.twitter.elephantbird.mapreduce.input.RCFileThriftTupleInputFormat$TupleReader.getCurrentTupleValue(RCFileThriftTupleInputFormat.java:74)
at com.twitter.elephantbird.pig.load.RCFileThriftPigLoader.getNext(RCFileThriftPigLoader.java:46)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:204)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Here's my script:

REGISTER '/user/ameya/libs/geo-analysis-1.0.0-SNAPSHOT.jar';
REGISTER '/user/ameya/libs/libthrift-0.8.0.jar';
REGISTER '/user/ameya/libs/thrift-0.8-types-1.1.29-SNAPSHOT.jar';
REGISTER '/user/ameya/libs/libs/elephant-bird-pig-4.7.jar';
REGISTER '/user/ameya/libs/libs/elephant-bird-rcfile-4.7.jar';
REGISTER '/user/ameya/libs/libs/elephant-bird-core-4.7.jar';
REGISTER '/user/ameya/libs/libs/elephant-bird-hadoop-compat-4.7.jar';
REGISTER '/user/ameya/libs/libs/hive-0.4.1.jar';
REGISTER '/user/ameya/libs/libs/libs/hive-serde-0.13.3.jar';

SET output.compression.enabled true;
SET output.compression.codec org.apache.hadoop.io.compress.GzipCodec;

thrift = LOAD '$input' USING com.twitter.elephantbird.pig.load.RCFileThriftPigLoader('com.xxx.yyy.thrift.dto.LatLong');

final = FOREACH thrift GENERATE (requestLatLong is not null ? requestLatLong.latitude : null) AS req_ll_lat,
            (requestLatLong is not null ? requestLatLong.longitude : null) AS req_ll_lng;

STORE final INTO '$output';

I am using libthrift-0.8.0.jar, where class TProtocol.java has indeed defined getScheme() method (with public access). Interestingly, not all the mappers fail, just a few of them; but that causes my job to fail. Could this be a CLASSPATH issue?

I tried searching for this issue, but could not find relevant answers. Can someone please help me get some leads to fix this?

Found the reason. The class "org.apache.thrift.protocol.TProtocol" was defined in two jars, ie libthrift-0.8.0.jar and hive-0.4.1.jar. The one in hive-0.4.1.jar did not have method getScheme() defined. When it picked up hive-0.4.1.jar in the classpath first, the mappers were not able to find method getScheme().

I am not sure why the behavior was not consistent across all mappers. Any comments to explain that would be helpful.

I replaced hive-0.4.1.jar with hive-exec-0.13.3.jar and the issue got resolved.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM