Is it possible to monitor hadoop, hbase and yarn using the monit tool?

I want to monitor some services so that they are restarted whenever they go down, and I found a great tool for this: monit. It works fine for Zookeeper, since I can match on "QuorumPeerMain", as shown below in the monitrc file:

check process Zookeeper matching "QuorumPeerMain"
        start program = "path/to/zkServer.sh start"
        stop program  = "path/to/zkServer.sh stop"

In the same way, I want to monitor these: hadoop, yarn and hbase.

check process Hadoop matching "?"
        start program = "startorstop.sh start"  #equivalent to start-dfs.sh
        stop program  = "startorstop.sh stop"   #equivalent to stop-dfs.sh

What should be written in place of ?

These are my questions:

  • In the hadoop case, any one of NameNode, DataNode or SecondaryNameNode may go down. The Monit documentation says that "The top-most matching parent with highest uptime is selected". For example, if DataNode goes down, monit still sees NameNode and won't try to restart hadoop. Another option is to use a pid file, but I am not able to find hadoop's pid file in /var/run/.
  • I want something like a top-to-bottom approach (not exactly). Only after zookeeper has started do I want to start the remaining services, such as hbase, hadoop and yarn.

Credit to @OneCricketeer for the comment, which helped me find a way to start these processes independently.

I found a way to start NameNode, DataNode and SecondaryNameNode independently using a shell script, i.e. hadoop-daemon.sh. So in my monit conf, NameNode looks like this:

check process NameNode matching "NameNode"
    start program = "startorstop.sh start"  #hadoop-daemon.sh start namenode
    stop program  = "startorstop.sh stop"   #hadoop-daemon.sh stop namenode
    group hadoop
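
One thing to watch with this kind of pattern: monit treats the matching string as a regular expression tested against the process list (name plus arguments), so "NameNode" will most likely also match the SecondaryNameNode process. Below is a minimal sketch of a tighter setup. It assumes the daemons run with their usual main classes (org.apache.hadoop.hdfs.server.namenode.NameNode and org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode); the patterns are my assumption, not something tested on this cluster, and startorstop.sh is the same placeholder wrapper as above.

# Assumed patterns; check what they select with: monit procmatch "namenode.NameNode"
check process NameNode matching "namenode.NameNode"
    start program = "startorstop.sh start"  #hadoop-daemon.sh start namenode
    stop program  = "startorstop.sh stop"   #hadoop-daemon.sh stop namenode
    group hadoop

check process SecondaryNameNode matching "SecondaryNameNode"
    start program = "startorstop.sh start"  #hadoop-daemon.sh start secondarynamenode
    stop program  = "startorstop.sh stop"   #hadoop-daemon.sh stop secondarynamenode
    group hadoop

With one check per daemon, monit can restart the one that went down instead of deciding on a single "hadoop" process.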

As for the other part of my question, I found the depends option. For more detail, see Service Dependencies in the Monit documentation. In my case, I wanted to restart HRegionServer whenever DataNode goes down, so the conf below works:

check process HRegionServer matching "HRegionServer"
    start program = "startorstop.sh start"  #hbase-daemon.sh start regionserver
    stop program =  "startorstop.sh stop"   #hbase-daemon.sh stop regionserver
    depends on DataNode

check process DataNode matching "DataNode"
    start program = "startorstop.sh start"  #hadoop-daemon.sh start datanode
    stop program =  "startorstop.sh stop"   #hadoop-daemon.sh stop datanode
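
The same pattern can probably be extended to the ordering part of the question and to yarn. This is a sketch only: it assumes the Zookeeper check from the question lives in the same monitrc, and that HBase and YARN are started with hbase-daemon.sh and yarn-daemon.sh (as in Hadoop 2.x). The check names and the yarn group are illustrative, and startorstop.sh is the same placeholder wrapper as above.

# Sketch: HMaster is only started once the Zookeeper check is running
check process HMaster matching "HMaster"
    start program = "startorstop.sh start"  #hbase-daemon.sh start master
    stop program  = "startorstop.sh stop"   #hbase-daemon.sh stop master
    depends on Zookeeper

# Sketch: yarn daemons, one check per process
check process ResourceManager matching "ResourceManager"
    start program = "startorstop.sh start"  #yarn-daemon.sh start resourcemanager
    stop program  = "startorstop.sh stop"   #yarn-daemon.sh stop resourcemanager
    group yarn

check process NodeManager matching "NodeManager"
    start program = "startorstop.sh start"  #yarn-daemon.sh start nodemanager
    stop program  = "startorstop.sh stop"   #yarn-daemon.sh stop nodemanager
    group yarn

As I understand the depends option, monit starts the service being depended on first, and stops or restarts the dependent services along with it, which is close to the top-to-bottom behaviour asked about.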
