
Nagios Hadoop HDFS check

  • This is a check for the status of HDFS in a Hadoop cluster.
  • The existing checks did not give me what I needed, so I hacked this little script together. It is not pretty, but it works for me. YMMV.
  • The check uses the Hadoop HDFS admin web page, normally found at http://hdfs-namenode:50070/dfshealth.jsp, and simply parses the output of that page with regexps.
  • New version, which does not require links. Download: check_hadoop0.4
  • Rename the downloaded file to .pl and make it executable.
  • Requires the Nagios Perl modules, utils.pm, and a few standard Perl modules.
  • Tested with Hadoop 0.20.2.
  • Gives performance data for unreplicated blocks, data in HDFS, nodes OK/dead, and the number of files, directories, and blocks in HDFS.
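
To give an idea of the regexp approach described above, here is a minimal sketch in Python. It is not the code of check_hadoop (which is Perl), just an illustration of the technique. The sample markup, the metric names and the patterns are all assumptions; the real dfshealth.jsp output differs between Hadoop versions, so the patterns would have to be adjusted against an actual page.

```python
import re
from urllib.request import urlopen

# Hypothetical, simplified markup resembling a NameNode status page.
# The real dfshealth.jsp output is more verbose and version dependent.
SAMPLE_PAGE = """
<tr><td> DFS Used% </td><td> : </td><td> 12.50 % </td></tr>
<tr><td> Live Nodes </td><td> : </td><td> 8 </td></tr>
<tr><td> Dead Nodes </td><td> : </td><td> 1 </td></tr>
<tr><td> Number of Under-Replicated Blocks </td><td> : </td><td> 42 </td></tr>
"""

# One pattern per metric, applied after the tags have been stripped.
# Names and patterns are illustrative assumptions.
PATTERNS = {
    "dfs_used_pct": (r"DFS Used%\s*:\s*([\d.]+)", float),
    "live_nodes": (r"Live Nodes\s*:\s*(\d+)", int),
    "dead_nodes": (r"Dead Nodes\s*:\s*(\d+)", int),
    "under_replicated": (r"Under-Replicated Blocks\s*:\s*(\d+)", int),
}


def fetch_status_page(host, port=50070, timeout=10):
    """Download the status page from the NameNode web interface."""
    url = f"http://{host}:{port}/dfshealth.jsp"
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def parse_hdfs_status(html):
    """Extract the metrics listed in PATTERNS from the page text."""
    # Replace every tag with a space so only "label : value" text remains.
    text = re.sub(r"<[^>]+>", " ", html)
    stats = {}
    for name, (pattern, convert) in PATTERNS.items():
        match = re.search(pattern, text)
        if match is None:
            raise ValueError(f"could not find {name} in status page")
        stats[name] = convert(match.group(1))
    return stats
```

Calling `parse_hdfs_status(SAMPLE_PAGE)` works without a cluster; against a live NameNode one would pass in the result of `fetch_status_page("hdfs-namenode")` instead. Scraping a status page this way is fragile: a change in the page layout breaks the patterns, which is why a missing match is raised as an error rather than silently ignored.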

check_hadoop_hdfs v. 0.4
Copyright (c) 2011 Jon Ottar Runde, jru@rundeconsult.no
See http://www.rundeconsult.no/?p=38 for updated versions and documentation
Usage: -w <warn> -c <crit> -x <Unreplicated blocks warn> -u <Unreplicated blocks crit> -H <Hostname> -p <Port> [-v version] [-h help]

Checks several Hadoop hdfs-parameters
-H (--Host)
-p (--Port)
-w (--warning)   = warning limit for DFS Usage
-c (--critical)  = critical limit for DFS Usage  (w < c)
-x (--unreplicatedwarn) = Warning limit for Unreplicated blocks
-u (--unreplicatedcritical) = Error limit for Unreplicated blocks
-h (--help)
-v (--version)
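
How the four limits could be turned into a Nagios state is sketched below in Python. This is an assumption about the logic, not the plugin's actual code: it treats a value at or above a limit as a breach, and it also requires x < u, although the usage text only states w < c. The function names and the performance data labels are made up for the example. The exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and the `label=value;warn;crit` performance data layout are the standard Nagios plugin conventions.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def level(value, warn, crit):
    """State for a single metric where higher values are worse."""
    # Assumption: reaching the limit already counts as a breach.
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def evaluate(dfs_used_pct, under_replicated, w, c, x, u):
    """Combine both metrics into one (exit code, status line) pair."""
    # w < c comes from the usage text; x < u is an assumption.
    if not (w < c and x < u):
        return UNKNOWN, "UNKNOWN - warning limits must be lower than critical limits"
    # The worst of the two metric states decides the overall state.
    state = max(level(dfs_used_pct, w, c), level(under_replicated, x, u))
    perfdata = (
        f"dfs_used={dfs_used_pct}%;{w};{c} "
        f"under_replicated={under_replicated};{x};{u}"
    )
    message = (
        f"{STATE_NAMES[state]} - DFS used {dfs_used_pct}%, "
        f"{under_replicated} under-replicated blocks | {perfdata}"
    )
    return state, message
```

A plugin built on this would print the status line and exit with the returned code, for example `state, line = evaluate(7.5, 20, 5, 10, 100, 1000)` followed by `print(line)` and `sys.exit(state)`.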

Example Nagios config

define service{
    use                    generic_service
    service_description    Hadoop_Extended Check
    check_command          check_hadoop_extended
    host_name              namenode.company.com
}

define command{
    command_name    check_hadoop_extended
    command_line    $USER1$/check_hadoop.pl -H namenode.company.com -p 50070 -w 5 -c 10 -x 100 -u 1000
}

 

