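Monitoring ESXi Host Systems and Hardware with Nagios
The Nagios plugins check_esx3.pl and check_esxi_hardware.py can be used to monitor VMware ESX servers. check_esx3.pl mainly monitors system resources such as CPU and memory usage, while check_esxi_hardware.py mainly monitors hardware. This setup can monitor a single ESX(i) server as well as a VirtualCenter/vCenter server cluster. If a virtual data center (vCenter) is already deployed in your organization, monitor vCenter rather than individual ESX/vSphere servers.
Installation:
1. Install the prerequisite libraries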
[root@localhost ~]# yum -y install gcc openssl libssl libssl-dev perl-doc rpm
2. Download the VMware vSphere SDK for Perl:
check_esx3.pl requires the VMware vSphere SDK for Perl.
https://my.vmware.com/group/vmware/details?productId=491&downloadGroup=SDKPERL600
Registration and login are required. Download the 32-bit or 64-bit version that matches your operating system.
[root@localhost src]# tar zxvf VMware-vSphere-SDK-for-Perl-4.0.0-161974.x86_64.tar.gz
[root@localhost src]# cd vmware-vsphere-cli-distrib
[root@localhost vmware-vsphere-cli-distrib]# ./vmware-install.pl
3. Install check_esx3.pl
Download check_esx3.pl from the page below and place it in the libexec directory under the Nagios installation directory:
http://exchange.nagios.org/directory/Plugins/Operating-Systems/*-Virtual-Environments/VMWare/Vmware-ESX-%26-VM-host/details
[root@localhost src]# mv check_esx3-0.5.pl /usr/local/nagios/libexec/
[root@localhost src]# cd /usr/local/nagios/libexec/
[root@localhost libexec]# chmod +x check_esx3-0.5.pl
[root@localhost libexec]# ./check_esx3-0.5.pl -help
Can't locate Nagios/Plugin.pm in @INC .........
[root@localhost libexec]#
Install the Nagios::Plugin module
[root@localhost libexec]# perl -MCPAN -e 'install Nagios::Plugin'
[root@localhost libexec]# ./check_esx3-0.5.pl -help
Can't locate Nagios/Plugin.pm in @INC .........
[root@localhost libexec]#
Install rpmforge-release
http://download.slogra.com/rpmforge-release-0.5.2-2.el5.rf.i386.rpm
[root@localhost libexec]# wget http://download.slogra.com/rpmforge-release-0.5.2-2.el5.rf.i386.rpm
[root@localhost libexec]# rpm -ivh rpmforge-release-0.5.2-2.el5.rf.i386.rpm
Install the Perl modules
[root@localhost libexec]# yum -y install perl-Params-Validate perl-Math-Calc-Units perl-Regexp-Common perl-Class-Accessor perl-Config-Tiny perl-Nagios-Plugin.noarch
[root@localhost libexec]# ./check_esx3-0.5.pl -help
Can't locate LWP/UserAgent.pm in @INC
[root@localhost libexec]#
Install the Bundle::LWP module
[root@localhost libexec]# perl -MCPAN -eshell
cpan> install Bundle::LWP
Do you want to modify/update your configuration (y|n) ? [no] no
Shall I follow them and prepend them to the queue of modules we are processing right now? [yes] yes
cpan> exit
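The first prompt asks whether to modify or update the existing CPAN configuration; answer no. The second asks whether the prerequisite modules should be followed and prepended to the queue of modules being processed; answer yes.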
[root@localhost libexec]# ./check_esx3-0.5.pl -help
Can't locate Compress/Zlib.pm in @INC
[root@localhost libexec]# perl -MCPAN -e 'install Compress::Zlib'
The install script uses cpan to install Perl modules, and some of them will fail to install. Install those manually with cpan; if that still fails, install them with yum. UUID is one example:
error: installed manually for use by vSphere CLI:
UUID 0.03 or newer
Fix:
[root@localhost libexec]# yum install perl-SOAP-Lite perl-Data-Dump perl-Class-MethodMaker perl-Crypt-SSLeay perl-libxml-perl perl-XML-LibXML-Common libuuid-devel uuid-perl -y
[root@localhost libexec]# perl -MCPAN -e'install UUID'
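The steps above surface the missing Perl modules one error at a time. A minimal sketch that checks them in a single pass is shown here; the module list is only what the errors in this walkthrough named and may not be complete for your plugin version, and `missing_perl_modules` is a hypothetical helper name.

```shell
#!/bin/sh
# Print every module from the argument list that this Perl cannot load.
# `perl -MName -e1` exits non-zero when the module is not found in @INC.
missing_perl_modules() {
    for mod in "$@"; do
        if ! perl -M"$mod" -e1 >/dev/null 2>&1; then
            printf '%s\n' "$mod"
        fi
    done
}

# Modules named in the errors above (assumed list; adjust as needed).
missing_perl_modules Nagios::Plugin LWP::UserAgent Compress::Zlib UUID
```

Any module name it prints still needs to be installed with cpan or yum as described above; no output means all listed modules load.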
[root@localhost libexec]# ./check_esx3.pl -H 10.10.2.233 -u root -p 'justin' -l cpu
CHECK_ESX3.PL CRITICAL - Server version unavailable at 'https://10.10.2.233:443/sdk/vimService.wsdl' at /usr/share/perl5/VMware/VICommon.pm line 545.
[root@localhost libexec]# vim check_esx3.pl
#!/usr/bin/perl -w$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME}=0;
#
# Nagios plugin to monitor vmware esx servers
[root@nagios libexec]# ./check_esx3.pl -H 10.10.2.233 -u root -p 'justin' -l cpu
CHECK_ESX3.PL CRITICAL - Server version unavailable at 'https://10.10.2.233:443/sdk/vimService.wsdl' at /usr/lib/perl5/5.8.8/VMware/VICommon.pm line 545.
The fix is to add a setting to check_esx3.pl that tells LWP to ignore self-signed SSL certificates (which ESX/ESXi servers use by default). Add the line "$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;" to check_esx3.pl:
[root@nagios libexec]# vim check_esx3.pl
$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;
#
# Nagios plugin to monitor vmware esx servers
#
# License: GPL
[root@nagios libexec]# ./check_esx3.pl -H 10.10.2.233 -u root -p 'justin' -l cpu
CHECK_ESX3.PL OK - cpu usage=55.00 MHz (0.12%) | cpu_usagemhz=55.00Mhz;; cpu_usage=0.12%;;
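LWP reads PERL_LWP_SSL_VERIFY_HOSTNAME from the process environment, so the same effect can likely be achieved without editing the plugin by setting the variable for one invocation only. This is a sketch under the assumption that the plugin does not set the variable itself; `run_without_hostname_check` is a hypothetical wrapper name.

```shell
#!/bin/sh
# Run a command with LWP's certificate hostname check switched off
# for that one process only.
run_without_hostname_check() {
    PERL_LWP_SSL_VERIFY_HOSTNAME=0 "$@"
}

# Example (host and credentials are the ones used in this walkthrough):
# run_without_hostname_check ./check_esx3.pl -H 10.10.2.233 -u root -p 'justin' -l cpu
```

Either way, hostname verification is disabled for the connection, so this is best limited to hosts you already trust.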
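Run ./check_esx3.pl --help to see the options check_esx3.pl accepts.
4. Install check_esxi_hardware.py
check_esxi_hardware.py requires Python and the Python extension package pywbem. Ports 443 and 5989 on your ESXi host must be open to the Nagios monitoring server.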
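Before installing anything, it can help to confirm from the Nagios server that ports 443 and 5989 are actually reachable. The sketch below is one way to do that; it assumes bash (for its /dev/tcp pseudo-device) and the coreutils `timeout` command are available, and `port_open` is a hypothetical helper name.

```shell
#!/bin/bash
# Return 0 if a TCP connection to HOST:PORT succeeds within 3 seconds.
# Relies on bash's /dev/tcp pseudo-device and coreutils `timeout`.
port_open() {
    local host=$1 port=$2
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example (10.10.2.233 is the ESXi host used in this walkthrough):
# for port in 443 5989; do
#     port_open 10.10.2.233 "$port" && echo "port $port reachable" \
#                                   || echo "port $port NOT reachable"
# done
```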
[root@nagios libexec]# wget
[root@nagios libexec]# chown nagios.nagios check_esxi_hardware.py
[root@nagios libexec]# chmod 755 check_esxi_hardware.py
[root@nagios libexec]# ./check_esxi_hardware.py
Traceback (most recent call last):
  File "./check_esxi_hardware.py", line 222, in <module>
    import pywbem
ImportError: No module named pywbem
[root@nagios libexec]# ./check_esxi_hardware.py -h
Traceback (most recent call last):
  File "./check_esxi_hardware.py", line 222, in <module>
    import pywbem
ImportError: No module named pywbem
The pywbem module is not installed. Install this third-party Python module:
http://downloads.sourceforge.net/project/pywbem/pywbem/pywbem-0.7/pywbem-0.7.0.tar.gz?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Fpywbem%2Ffiles%2Fpywbem%2F&ts=1299742557&use_mirror=voxel
[root@nagios libexec]# wget
[root@nagios libexec]# tar -zxvf pywbem-0.7.0.tar.gz
[root@nagios libexec]# cd pywbem-0.7.0
[root@nagios libexec]# python setup.py build
[root@nagios libexec]# python setup.py install --record files.txt
[root@nagios libexec]# ./check_esxi_hardware.py -H 10.10.2.233 -U nagios -P nagios -V dell
OK - Server: Dell Inc. PowerEdge R610 s/n: XXXXXX System BIOS: XXXXXXXXXXX
With pywbem 0.8.0 the plugin may stop working. Recording the installed files with python setup.py install --record files.txt makes the module easy to uninstall later: cat files.txt | xargs rm -rf
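The xargs one-liner splits on whitespace and uses rm -rf, which is unforgiving if a path in the list contains a space or names a directory. A slightly more cautious sketch of the same uninstall step follows; `remove_recorded_files` is a hypothetical helper name.

```shell
#!/bin/sh
# Remove every path listed (one per line) in a setup.py --record file.
# Reading line by line keeps paths with spaces intact, and plain `rm -f`
# (no -r) will not delete a directory if one ever ends up in the list.
remove_recorded_files() {
    record=$1
    while IFS= read -r path; do
        if [ -n "$path" ]; then
            rm -f -- "$path"
        fi
    done < "$record"
}

# Example: remove_recorded_files files.txt
```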
Both check_esx3.pl and check_esxi_hardware.py only need a read-only user name and password on the ESXi host. Run ./check_esx3.pl --help and ./check_esxi_hardware.py --help to see each plugin's usage syntax:
[root@localhost libexec]# ./check_esxi_hardware.py -help
Usage: check_esxi_hardware.py https://hostname user password system [verbose]
example: check_esxi_hardware.py https://my-shiny-new-vmware-server root fakepassword dell
or, using new style options:
usage: check_esxi_hardware.py -H hostname -U username -P password [-V system -v -p -I XX]
example: check_esxi_hardware.py -H my-shiny-new-vmware-server -U root -P fakepassword -V auto -I uk
or, verbosely:
usage: check_esxi_hardware.py --host=hostname --user=username --pass=password [--vendor=system --verbose --perfdata --html=XX]
Options:
--version show program's version number and exit
-h, --help show this help message and exit
Mandatory parameters:
-H HOST, --host=HOST
report on HOST
-U USER, --user=USER
user to connect as
-P PASS, --pass=PASS
password, if password matches file:<path>, first line
of given file will be used as password
Optional parameters:
-V VENDOR, --vendor=VENDOR
Vendor code: auto, dell, hp, ibm, intel, or unknown
(default)
-v, --verbose print status messages to stdout (default is to be
quiet)
-p, --perfdata collect performance data for pnp4nagios (default is
not to)
-I XX, --html=XX generate html links for country XX (default is not to)
-t TIMEOUT, --timeout=TIMEOUT
timeout in seconds - no effect on Windows (default =
no timeout)
-i IGNORE, --ignore=IGNORE
comma-separated list of elements to ignore
--no-power don't collect power performance data
--no-volts don't collect voltage performance data
--no-current don't collect current performance data
--no-temp don't collect temperature performance data
--no-fan don't collect fan performance data
[root@localhost libexec]#
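The help text above notes that -P accepts file:<path>, in which case the first line of that file is used as the password. A minimal sketch of preparing such a file so that only its owner can read it is shown here; the file path and the `write_password_file` name are hypothetical.

```shell
#!/bin/sh
# Write a password to a file that only its owner can read.
# Per the plugin's help, the first line of the file is used as the password.
write_password_file() {
    file=$1
    password=$2
    ( umask 077; printf '%s\n' "$password" > "$file" ) && chmod 600 "$file"
}

# Example (hypothetical path; "nagios" is the demo password used in this article):
# write_password_file /usr/local/nagios/etc/esxi.pass nagios
# ./check_esxi_hardware.py -H 10.10.2.233 -U nagios -P file:/usr/local/nagios/etc/esxi.pass -V dell
```

The file has to be owned by whichever account runs the check, otherwise the plugin cannot read it.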
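Set up a read-only user on the ESXi host
1) Log in to the ESXi host. On the "Local Users & Groups" tab, right-click an empty area and choose "Add" to create a user.
On ESXi 6.0 and later, the password complexity policy has to be adjusted first: set the password entry in the security settings to retry=3 min=8,8,8,7,6
1. Log in to the ESXi host and run the following command:
# vi /etc/pam.d/passwd
2. Find the following line in the file: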
password requisite /lib/security/$ISA/pam_passwdqc.so retry=3 min=8,8,8,7,6
password requisite /lib/security/$ISA/pam_passwdqc.so retry=N min=N0,N1,N2,N3,N4
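Notes (with example values for N0 to N4):
retry=3 means the password may be attempted three times.
N0 = 12: a single character class is enough, but the password must be at least 12 characters long.
N1 = 10: the password must use at least two character classes and be at least 10 characters long.
N2 = 8: a passphrase must be at least 8 characters long.
N3 = 8: the password must use three character classes (for example uppercase, lowercase and digits) and be at least 8 characters long.
N4 = 7: the password must use uppercase, lowercase, digits and special characters, and be at least 7 characters long.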
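To make the min=N0,N1,N2,N3,N4 rule concrete, here is a simplified sketch that checks a candidate password against such a setting. It is not pam_passwdqc itself: the real module does not count a leading uppercase letter or a trailing digit as a class, applies N2 to passphrases, and accepts the value "disabled", none of which is modelled here. All function names are hypothetical.

```shell
#!/bin/sh
# Count how many of the four character classes (lowercase, uppercase,
# digit, other) appear in a password. Simplified on purpose, see above.
count_classes() {
    cc_n=0
    case $1 in *[[:lower:]]*) cc_n=$((cc_n + 1)) ;; esac
    case $1 in *[[:upper:]]*) cc_n=$((cc_n + 1)) ;; esac
    case $1 in *[[:digit:]]*) cc_n=$((cc_n + 1)) ;; esac
    case $1 in *[![:alnum:]]*) cc_n=$((cc_n + 1)) ;; esac
    echo "$cc_n"
}

# Minimum length that min=N0,N1,N2,N3,N4 demands for a given class count.
# N2 (passphrases) is skipped: 1 class -> N0, 2 -> N1, 3 -> N3, 4 -> N4.
required_length() {
    IFS=, read -r n0 n1 n2 n3 n4 <<EOF
$2
EOF
    case $1 in
        1) echo "$n0" ;;
        2) echo "$n1" ;;
        3) echo "$n3" ;;
        4) echo "$n4" ;;
        *) return 1 ;;
    esac
}

# Succeed if the password is long enough for the classes it uses.
password_ok() {
    need=$(required_length "$(count_classes "$1")" "$2") || return 1
    [ "${#1}" -ge "$need" ]
}

# Example: password_ok 'aB3!xy' 8,8,8,7,6 && echo accepted
```

With min=8,8,8,7,6, a six-character password that mixes all four classes passes this check, while a seven-character all-lowercase one does not.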
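2) Assign the nagios user the "Read-only" role: on the "Permissions" tab, right-click an empty area, choose "Add Permission", and assign the role to the nagios user.
Test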
[root@nagios libexec]# ./check_esxi_hardware.py -H 10.10.2.233 -U nagios -P nagios -V dell
UNKNOWN: Authentication Error
[root@localhost libexec]# ./check_esxi_hardware.py -H 10.15.98.204 -U nagios -P nagios -V auto -t 90 -i "IPMI SEL"
Traceback (most recent call last):
  File "./check_esxi_hardware.py", line 593, in <module>
    wbemclient = pywbem.WBEMConnection(hosturl, (user,password), no_verification=True)
TypeError: __init__() got an unexpected keyword argument 'no_verification'
[root@localhost libexec]#
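Authentication failed. According to what I found online, this is caused by differences between ESXi versions.
Fix:
SSH into the ESXi host (SSH has to be enabled on the ESXi host first).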
~ # cat /etc/security/access.conf
# This file is autogenerated and must not be edited.
+:dcui:ALL
+:root:ALL
+:vpxuser:ALL
+:vslauser:ALL
-:nagios:ALL
-:ALL:ALL
Remove the "-:nagios:ALL" line and add "+:nagios:sfcb" after the root entry, so that the file reads as follows:
~ # cat /etc/security/access.conf
# This file is autogenerated and must not be edited.
+:dcui:ALL
+:root:ALL
+:nagios:sfcb
+:vpxuser:ALL
+:vslauser:ALL
-:ALL:ALL
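This approach suits hosts where users are rarely added, since a single edit is enough. If users are added frequently, access.conf may be regenerated, so a scheduled task is needed to add "+:nagios:sfcb" back.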
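The scheduled task mentioned above needs a command that is safe to run repeatedly. The sketch below is one possibility, assuming the ESXi shell provides awk; `ensure_sfcb_access` is a hypothetical helper name. It places the entry just ahead of the final "-:ALL:ALL" rule rather than directly after the root entry, on the assumption that the order among the allow lines does not matter.

```shell
#!/bin/sh
# Make sure "+:nagios:sfcb" is present in an access.conf-style file and
# that "-:nagios:ALL" is not. Running it again leaves the file unchanged.
ensure_sfcb_access() {
    conf=$1
    tmp="$conf.tmp.$$"
    awk -v entry='+:nagios:sfcb' '
        # Drop the deny rule and any existing copy of the allow entry.
        $0 == "-:nagios:ALL" || $0 == entry { next }
        # Re-insert the allow entry just before the catch-all deny rule.
        $0 == "-:ALL:ALL" && !inserted { print entry; inserted = 1 }
        { print }
        END { if (!inserted) print entry }
    ' "$conf" > "$tmp" && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example: ensure_sfcb_access /etc/security/access.conf
```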
[root@nagios libexec]# ./check_esxi_hardware.py -H 10.10.2.233 -U nagios -P nagios -V dell
OK - Server: Dell Inc. PowerEdge R610 s/n: XXXXXX System BIOS: XXXXXXXXXXX
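Nagios judges a plugin by its exit code rather than by the text it prints: 0 means OK, 1 WARNING, 2 CRITICAL and 3 UNKNOWN. When testing by hand, a small wrapper like the following sketch makes that code visible; `plugin_state` is a hypothetical helper name.

```shell
#!/bin/sh
# Run a plugin command and print the Nagios state its exit code stands for.
# Nagios plugin convention: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
plugin_state() {
    "$@" >/dev/null 2>&1
    case $? in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Example (same test command as above):
# plugin_state ./check_esxi_hardware.py -H 10.10.2.233 -U nagios -P nagios -V dell
```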