MapReduce error with Parquet format

Question

I'm trying to run a MapReduce job. My input files are in Parquet format.

I'm getting the following error:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/thrift/TException
at parquet.format.converter.ParquetMetadateConverter.readParquetMetadata(ParquetMetadateConverter.java:268)
at parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:271)
at parquet.hadoop.ParquetFileReader.readSummeryFile(ParquetFileReader.java:200)
at parquet.hadoop.ParquetFileReader.readAllFootersInParallelUsingSummeryFiles(ParquetFileReader.java:99)
at parquet.hadoop.ParquetInputFormat.getFooters(ParquetInputFormat.java:354)
at parquet.hadoop.ParquetInputFormat.getFooters(ParquetInputFormat.java:339)
at parquet.hadoop.ParquetInputFormat.getSplits(ParquetInputFormat.java:246)
...

I tried adding the jar that contains TException with --libjars my_path/libthrift-0.9.0.jar, but I still get the same error.

Answer 1

Try setting the HADOOP_CLASSPATH environment variable to point to a libthrift jar that matches the version you need.

For example:

export HADOOP_CLASSPATH=/var/lib/hdfs/libthrift-0.9.jar
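
A slightly fuller sketch of the same idea, assuming a bash-like shell. The trace shows the error in thread "main" inside getSplits, which most likely means it is raised in the client JVM while the job is being submitted, so the jar has to be visible to the hadoop command itself and not only to the tasks. The job jar name, driver class, and input/output arguments in the commented-out command are placeholders, not taken from the question.

```shell
# Jar path taken from the example above; adjust it to where libthrift lives on your machine.
THRIFT_JAR=/var/lib/hdfs/libthrift-0.9.jar

# Prepend the jar, keeping any entries that are already set.
export HADOOP_CLASSPATH="${THRIFT_JAR}${HADOOP_CLASSPATH:+:${HADOOP_CLASSPATH}}"

# Show what the client JVM will be given.
echo "HADOOP_CLASSPATH=${HADOOP_CLASSPATH}"

# Then submit the job from the same shell. Placeholder names (assumptions):
# my-job.jar, com.example.MyDriver, input_dir, output_dir.
# Note that the generic option is spelled -libjars (single dash), and it is only
# parsed when the driver goes through ToolRunner / GenericOptionsParser.
# hadoop jar my-job.jar com.example.MyDriver -libjars "${THRIFT_JAR}" input_dir output_dir
```

Keeping -libjars alongside HADOOP_CLASSPATH is a deliberate choice here: the environment variable covers the submitting JVM, while -libjars ships the jar to the map and reduce tasks in case they need Thrift as well.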

Hope this helps!

solution1
3 ACCEPTED 2014-04-01 08:53:07
