Logging & Monitoring for Hive Batch Jobs

This is my first question on this forum. I write Hive batch job logs into a Hive log table as soon as each step completes, using INSERT INTO TABLE. Because of this, multiple records are created in Hive for each batch job ID, so I am creating a view that combines the logging data before it is used in a monitoring tool. Can you suggest a better solution?

Notes:

  1. My batch job has multiple steps, and I would like to collect logs from each step.
  2. I don't want to use UPDATE.
  3. I am unable to upload an image. The flow is: Batch Job -> Logs -> Hive -> Monitoring
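
For concreteness, the setup described in the question might look roughly like the HiveQL below. This is only a sketch: the table name batch_job_log_raw, the view name batch_job_summary, the column names, and the status values are all assumptions made for illustration, not details from the question.

```sql
-- Hypothetical append-only log table: one row is inserted per completed step.
CREATE TABLE IF NOT EXISTS batch_job_log_raw (
  job_id    STRING,
  step_name STRING,
  status    STRING,   -- assumed values such as 'SUCCESS' or 'FAILED'
  logged_at STRING    -- 'yyyy-MM-dd HH:mm:ss' text, so MIN/MAX sort chronologically
);

-- Each step appends its own row; nothing is ever updated.
INSERT INTO TABLE batch_job_log_raw
VALUES ('job_001', 'extract', 'SUCCESS', '2024-01-01 10:00:00');

INSERT INTO TABLE batch_job_log_raw
VALUES ('job_001', 'load', 'FAILED', '2024-01-01 10:05:00');

-- View that collapses the per-step rows into one row per batch job ID.
CREATE VIEW IF NOT EXISTS batch_job_summary AS
SELECT job_id,
       MIN(logged_at) AS first_logged_at,
       MAX(logged_at) AS last_logged_at,
       COUNT(*)       AS steps_logged,
       SUM(CASE WHEN status = 'FAILED' THEN 1 ELSE 0 END) AS failed_steps
FROM batch_job_log_raw
GROUP BY job_id;
```

The monitoring tool would then read from batch_job_summary, which is recomputed over all step rows each time it is queried.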

Here is a reference architecture I can suggest. You can still use Hive for logging, but use SERDEPROPERTIES to back the Hive log table with HBase.

Benefits:

  • Data is stored in HBase, which lets you choose a row key (for example, the batch job ID) so that a new write for the same key overrides the existing data
  • HBase maintains versions of the data
  • You can still query the table through Hive the way you normally access Hive tables
  • A real-time dashboard can read the HBase data directly
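
A minimal sketch of such a Hive-on-HBase log table follows. It assumes the Hive HBase integration (the HBaseStorageHandler class) is available on the cluster and that an HBase table named batch_job_log with a column family named log already exists; the table, column family, and column names are hypothetical.

```sql
-- Hive table mapped onto an existing HBase table (names are assumptions).
-- ':key' maps the first Hive column to the HBase row key.
CREATE EXTERNAL TABLE IF NOT EXISTS batch_job_log_hbase (
  job_id       STRING,
  current_step STRING,
  status       STRING,
  updated_at   STRING
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  'hbase.columns.mapping' = ':key,log:current_step,log:status,log:updated_at'
)
TBLPROPERTIES ('hbase.table.name' = 'batch_job_log');

-- Each step writes with the same row key (the batch job ID), so the
-- latest write replaces the visible values without using UPDATE.
INSERT INTO TABLE batch_job_log_hbase
VALUES ('job_001', 'extract', 'SUCCESS', '2024-01-01 10:00:00');

INSERT INTO TABLE batch_job_log_hbase
VALUES ('job_001', 'load', 'SUCCESS', '2024-01-01 10:05:00');

-- Queried like any other Hive table; one row per job_id is returned.
SELECT job_id, current_step, status, updated_at
FROM batch_job_log_hbase
WHERE job_id = 'job_001';
```

With this layout the view from the question is no longer needed for the latest status, since each job ID maps to a single row. How many earlier values HBase retains depends on the VERSIONS setting of the column family.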

High-Level Diagram: (image not available)
