- This is a check of HDFS status in a Hadoop cluster.
- Existing checks did not give me what I needed, so I hacked this little script together. It is not pretty, but it works for me. YMMV.
- The test uses the Hadoop HDFS admin web page, normally found at http://hdfs-namenode:50070/dfshealth.jsp. I am just parsing the output of that page with regexps (see the sketch after this list).
- New version no longer requires links. Download here: check_hadoop0.4
- Rename to .pl and make executable
- Requires the Nagios Perl module utils.pm and a few standard Perl modules.
- Tested with Hadoop 0.20.2.
- Gives performance data for unreplicated blocks, data in HDFS, nodes OK/dead, and the number of files/directories/blocks in HDFS.
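To give an idea of how the check works, here is a minimal sketch of the approach, not the actual plugin: fetch dfshealth.jsp, strip the HTML, and pull a few figures out with regexps. The hostname, the patterns, and the perfdata labels (dfs_used, unreplicated) are my own placeholders; the page layout varies between Hadoop releases, so adjust the regexps against your own namenode's output.

#!/usr/bin/perl
# Minimal sketch only -- fetch dfshealth.jsp and parse a few figures with regexps.
# The patterns below are guesses; match them against your own namenode's page.
use strict;
use warnings;
use LWP::Simple qw(get);

my $host = 'hdfs-namenode';   # assumed hostname, replace with your namenode
my $port = 50070;

my $html = get("http://$host:$port/dfshealth.jsp")
    or die "Could not fetch dfshealth.jsp from $host:$port\n";

# Strip HTML tags so the regexps only see plain text.
(my $text = $html) =~ s/<[^>]+>/ /g;

# Hypothetical patterns for the figures the plugin reports on.
my ($dfs_used_pct) = $text =~ /DFS\s*Used%?\s*:?\s*([\d.]+)\s*%/i;
my ($live_nodes)   = $text =~ /Live\s*Nodes\s*:?\s*(\d+)/i;
my ($dead_nodes)   = $text =~ /Dead\s*Nodes\s*:?\s*(\d+)/i;
my ($under_rep)    = $text =~ /Under[\s-]*replicated\s*Blocks\s*:?\s*(\d+)/i;

# Nagios plugin output: human-readable text first, perfdata after the pipe.
# (The real plugin compares the values against -w/-c etc. before deciding
# OK/WARNING/CRITICAL; this sketch just prints what it found.)
printf "HDFS - used %s%%, %s live / %s dead nodes | 'dfs_used'=%s%%;; 'unreplicated'=%s;;\n",
    map { defined $_ ? $_ : 'U' }
        ($dfs_used_pct, $live_nodes, $dead_nodes, $dfs_used_pct, $under_rep);
exit 0;

The plugin's own help output looks like this: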
check_hadoop_hdfs v. 0.4
Copyright (c) 2011 Jon Ottar Runde, jru@rundeconsult.no
See http://www.rundeconsult.no/?p=38 for updated versions and documentation
Usage: -w <warn> -c <crit> -x <Unreplicated blocks warn> -u <Unreplicated blocks crit> -H <Hostname> -p <Port> [-v version] [-h help]
Checks several Hadoop hdfs-parameters
-H (--Host)
-p (--Port)
-w (--warning) = warning limit for DFS usage
-c (--critical) = critical limit for DFS usage (w < c)
-x (--unreplicatedwarn) = warning limit for unreplicated blocks
-u (--unreplicatedcritical) = critical limit for unreplicated blocks
-h (--help)
-v (--version)
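For reference, here is a rough sketch of how options like these are usually wired up with Getopt::Long and the %ERRORS hash from the Nagios utils.pm. The library path, the defaults, and the variable names are illustrative assumptions, not taken from the actual plugin.

# Sketch of option handling with Getopt::Long and the Nagios utils.pm exit codes.
use strict;
use warnings;
use Getopt::Long;
use lib '/usr/lib/nagios/plugins';   # assumed location of utils.pm
use utils qw(%ERRORS);

my ($host, $port)       = (undef, 50070);
my ($warn, $crit)       = (80, 90);      # illustrative DFS usage thresholds (%)
my ($ur_warn, $ur_crit) = (100, 1000);   # illustrative unreplicated-block thresholds

GetOptions(
    'H|Host=s'                 => \$host,
    'p|Port=i'                 => \$port,
    'w|warning=f'              => \$warn,
    'c|critical=f'             => \$crit,
    'x|unreplicatedwarn=i'     => \$ur_warn,
    'u|unreplicatedcritical=i' => \$ur_crit,
) or exit $ERRORS{'UNKNOWN'};

# Enforce the constraint from the usage text: -w must be lower than -c.
if (!defined $host or $warn >= $crit) {
    print "UNKNOWN: -H is required and -w must be lower than -c\n";
    exit $ERRORS{'UNKNOWN'};
}

# After parsing dfshealth.jsp, the thresholds would be applied roughly like:
#   exit $ERRORS{'CRITICAL'} if $dfs_used >= $crit or $unreplicated >= $ur_crit;
#   exit $ERRORS{'WARNING'}  if $dfs_used >= $warn or $unreplicated >= $ur_warn;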
Example Nagios config
define service{
        use                     generic_service
        service_description     Hadoop_Extended Check
        check_command           check_hadoop_extended
        host_name               namenode.company.com
}
define command{
        command_name    check_hadoop_extended
        command_line    $USER1$/check_hadoop.pl -H namenode.company.com -p 50070 -w 5 -c 10 -x 100 -u 1000
}
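If the service is attached to more than one host, the hard-coded hostname in command_line can be replaced with the standard Nagios macro $HOSTADDRESS$, so a single command definition covers every namenode it is applied to.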