I'm trying to parse a rather simple JSON file using Pig and Twitter's elephant-bird library, but it has turned into a very painful debugging process. ...
Can anyone help me figure out why I am getting an error while using REGISTER to register the 'elephant bird' jar file to load JSON data? I work in the l ...
Maybe this is an easy question, but I am having a difficult time resolving the issue. At this time, I have a pseudo-distributed HDFS that contains re ...
I have a data pipeline that writes protobufs into HDFS, and now I need a way to query that data. I stumbled upon elephant-bird and Hive and have bee ...
I am not able to deserialize protobuf data that has a repeated string in it using elephant-bird 4.14 with Hive. This seems to be because repeated st ...
My problem is similar to this unanswered question: https://stackoverflow.com/questions/42140344/elephantbird-dependency-jars I have regis ...
Using the Elephant Bird JsonLoader, I'm able to load the data if the record is in this format: but the actual data is in the format below: (enclosed in s ...
I want to parse a Billionaires JSON dataset into Pig. The JSON file can be found here. Here is what each entry has: Attempt 1 I tried loading thi ...
This is a continuation of this. My new Pig script is: This code works in local mode: pig -x local user_posts.pig But it fails in mapreduce mode: ...
I have a Pig script using the elephant-bird JSON loader. The output is right, but the field names are wrong. My output has val_n instead of the fie ...
Thanks in advance. Currently we are trying to create a Hive table using Protocol Buffers byte data. We have followed all possible steps of creating ...
I have been facing this issue for a long time. I tried to solve it, but I couldn't. I need some expert advice to solve this. I am trying to load a s ...
I have a problem with simply loading data to test and analyze. I'm using https://www.reddit.com/r/datasets/comments/3oiv9z/reddit_september_comment ...
I am trying to parse a nested JSON object whose levels can contain bags and/or tuples using Elephant Bird in Pig. Referencing columns at the fourth le ...
I have a JSON file with a varying schema, for which I have written a Pig script that makes use of Twitter's elephant-bird library to load the JSON da ...
The README file for elephant-bird mentions: Thrift 0.5.0, 0.6.0, 0.7.0, greater versions than 0.9 are provided via thrift9 maven profile I am u ...
I am running a Pig script in mapreduce mode. The script reads an RCFile (containing Thrift serialized data stored in GZIP compressed format), deserialize ...
I want to parse a string of complex JSON in Pig. Specifically, I want Pig to understand my JSON array as a bag instead of as a single chararray. I fou ...
Upon request in comment, this is the last dump of http://textuploader.com/ay4a8 Update #2 I checked the online repo mentioned in the error h ...
I wanted to process Twitter JSON objects with Pig using the elephant-bird jars, for which I wrote the Pig script below. REGISTER '/usr/lib/pig/l ...