I am new to Hive and need to parse logs of the following format:
[Time Stamp] {Complex JSON data}
From my searches so far, I see that there are JSON SerDes available.
Can I extend the code of one of those JSON SerDes to suit my needs? If so, which JSON SerDe would be the better choice?
If this approach is not good, are there any other pointers?
Thanks
Instead of using an existing open source SerDe,
I found that writing a SerDe myself was much simpler. Apart from the boilerplate code, I only had to write my business logic in the deserialize method, and it worked like a charm.
This link was very helpful: http://blog.cloudera.com/blog/2012/12/how-to-use-a-serde-in-apache-hive/
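To give an idea of what the business logic in deserialize might look like for lines of the form [Time Stamp] {JSON}, here is a minimal sketch in plain Java. It is not my actual SerDe: the class name LogLineParser, the method parse, and the two-column layout (timestamp, raw JSON string) are assumptions made for illustration. In a real SerDe, deserialize would typically convert the incoming record to a string, apply logic like this, and return the values as a row; the JSON column could then be broken down further, for example with Hive's get_json_object.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: splits "[timestamp] {json}" into two column values.
public class LogLineParser {
    // Group 1: text inside the leading brackets; group 2: the JSON object.
    // DOTALL lets the JSON part span embedded newlines, if any.
    private static final Pattern LINE =
        Pattern.compile("^\\[([^\\]]*)\\]\\s*(\\{.*\\})\\s*$", Pattern.DOTALL);

    // Returns {timestamp, json}, or null when the line does not match,
    // so a caller can decide to emit a NULL row or skip the record.
    public static String[] parse(String line) {
        if (line == null) {
            return null;
        }
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1).trim(), m.group(2) };
    }

    public static void main(String[] args) {
        String[] cols = parse("[2013-01-15 10:22:01] {\"user\": {\"id\": 42}}");
        System.out.println(cols[0]); // the timestamp text
        System.out.println(cols[1]); // the raw JSON string
    }
}
```

Returning null for malformed lines is a design choice of this sketch; throwing an exception from deserialize instead would fail the query on the first bad record.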
I also tried a UDTF, which worked smoothly too, but I found the SerDe to be much faster.
Hope this helps someone.