Monitoring Dell servers from Nagios with the check_dell_openmanage plugin

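I have wanted to set this up for a long time, so that I no longer have to walk to the server room every day to check the hardware status LEDs on each server. This post is written "live": I configure and write at the same time, and only the steps I have verified to work are recorded here, for future reference. Enough talk, let's get started.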

1. Download the check_dell_openmanage plugin from:

http://exchange.nagios.org/components/com_mtree/attachment.php?link_id=85&cf_id=24


2. Copy the plugin into the following directory:

/usr/local/nagios/libexec
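
Step 2 could be sketched as a small shell function. This is only an illustration: install_plugin is a hypothetical helper name, the source path is wherever you saved the download, and setting mode 755 is my assumption so that the user running Nagios can execute the script.

```shell
# Hypothetical helper: copy a downloaded plugin into the Nagios plugin
# directory and make it executable.
install_plugin() {
    src=$1
    dest_dir=${2:-/usr/local/nagios/libexec}
    if [ ! -f "$src" ]; then
        echo "plugin file not found: $src" >&2
        return 1
    fi
    cp "$src" "$dest_dir/" && chmod 755 "$dest_dir/$(basename "$src")"
}

# Example (the download location is an assumption):
# install_plugin /tmp/check_dell_openmanage.pl
```

The second argument is optional and defaults to the libexec directory named above.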

3. First, take a look at the plugin's usage:

[root@pcnnagios libexec]# ./check_dell_openmanage.pl -h
SNMP Dell OpenManage Monitor for Nagios version 1.3
 by Jason Ellison - infotek(at)gmail.com
Usage: ./check_dell_openmanage.pl [-v] -H <host> -C <snmp_community> [-2] | (-l login -x passwd)  [-P <port>] -T test|dellom|dellom_storage|blade|global|chassis|custom [-t <timeout>] [-V] [-u <unknown_default>]
-v, --verbose
        print extra debugging information
-h, --help
        print this help message
-H, --hostname=HOST
        name or IP address of host to check
-C, --community=COMMUNITY NAME
        community name for the host's SNMP agent (implies v 1 protocol)
-2, --v2c
        use SNMP v2 (instead of SNMP v1)
-P, --port=PORT
        SNMPd port (Default 161)
-t, --timeout=INTEGER
        timeout for SNMP in seconds (Default: 5)
-V, --version
        prints version number
-u, --unknown_default=INT
        If attribute is not found then report the output as this number (i.e. -u 0)
-T, --type=test|dellom|dellom_storage|blade|global|chassis|custom
        This allows to use pre-defined system type
        Currently support systems types are:
              test (tries all OID's in verbose mode can be used to generate new system type)
              dellom (Dell OpenManage general detailed)
              dellom_storage (Dell OpenManage plus Storage Management detailed)
              blade (some features are on the chassis not the blade)
              global (only check the global health status)
              chassis (only check the system chassis health status)
              custom (intended for customization)
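
Going by the help text above, a command line can be assembled from the documented options alone. The sketch below only prints the command, it does not run it; dell_check_cmd is a hypothetical helper name, and the defaults (community public, type dellom, timeout 5) are my choices for illustration.

```shell
# Print a check_dell_openmanage.pl command line using only options that
# appear in the help text: -H, -C, -2 (SNMP v2c), -T and -t.
dell_check_cmd() {
    host=$1
    community=${2:-public}
    check_type=${3:-dellom}
    snmp_timeout=${4:-5}
    printf './check_dell_openmanage.pl -H %s -C %s -2 -T %s -t %s\n' \
        "$host" "$community" "$check_type" "$snmp_timeout"
}

# Example: only ask for the global health status of one host.
dell_check_cmd 10.40.1.131 public global
```

Drop the -2 from the format string if the target only answers SNMP v1.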

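4. Test the environment. (Going by the help output above, -v is the verbose switch and SNMP v2c is selected with -2, so the 2C in this command is most likely ignored and the query is sent as SNMP v1.)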

./check_dell_openmanage.pl -v 2C -C public -H 192.168.0.1 -T test

It failed with the message: On the nagios server that will be running the plugin you must have the perl "Net::SNMP" module installed. The message also gives the installation method:

perl -MCPAN -e shell

cpan> install Net::SNMP


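This part took some effort: CPAN.pm was not installed on my system by default. After some trial and error I got it installed as well. The steps below are the ones I found posted online by more experienced people, not a transcript from my own environment: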

References:

http://blog.haohtml.com/archives/12708

http://www.cnblogs.com/mopmoq/archive/2009/04/06/1430210.html

[root@GM ~]# wget http://cpan.communilink.net/authors/id/A/AN/ANDK/CPAN-1.9600.tar.gz
[root@GM ~]# tar -zxvf CPAN-1.9600.tar.gz
[root@GM ~]# cd CPAN-1.9600
[root@GM CPAN-1.9600]# perl Makefile.PL
[root@GM CPAN-1.9600]# make
[root@GM CPAN-1.9600]# make install
[root@GM CPAN-1.9600]# perl -MCPAN -e shell
(n lines omitted here; from a quick look, I accepted the defaults and the automatic configuration throughout)
cpan(1)> install Net::SNMP
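
Before going through the CPAN shell it can be worth checking whether Net::SNMP is already there. A minimal sketch, assuming perl is on the PATH; have_perl_module and ensure_net_snmp are hypothetical helper names, and the non-interactive install call is my substitute for typing the command at the cpan> prompt.

```shell
# Return 0 if the given Perl module can be loaded, non-zero otherwise.
have_perl_module() {
    perl "-M$1" -e 1 2>/dev/null
}

# Only invoke CPAN when Net::SNMP is really missing.
ensure_net_snmp() {
    if have_perl_module Net::SNMP; then
        echo "Net::SNMP already installed"
    else
        # Same effect as "install Net::SNMP" inside the CPAN shell.
        perl -MCPAN -e 'install("Net::SNMP")'
    fi
}

# Example: ensure_net_snmp
```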

5. OK, with that installed, test again:

[root@pcnnagios libexec]# ./check_dell_openmanage.pl -v 2C -C public -H 10.40.1.131 -T test
TEST MODE:
Alarm at 5
The Net::SNMP library is available on your server
Trying all preconfigured Dell OID's against target...
StorageManagementGlobalSystemStatus     (.1.3.6.1.4.1.674.10893.1.20.110.13.0)
RESULT: 3(ok)
chassisManufacturerName (1.3.6.1.4.1.674.10892.1.300.10.1.8.1)
RESULT: Dell Inc.
chassisModelName        (1.3.6.1.4.1.674.10892.1.300.10.1.9.1)
RESULT: PowerEdge 2950
chassisServiceTagName   (1.3.6.1.4.1.674.10892.1.300.10.1.11.1)
RESULT: 53DG52X
chassisSystemName       (1.3.6.1.4.1.674.10892.1.300.10.1.15.1)
RESULT: pcnexconn
operatingSystemOperatingSystemName      (1.3.6.1.4.1.674.10892.1.400.10.1.6.1)
RESULT: Microsoft Windows Server 2003, Enterprise Edition
operatingSystemOperatingSystemVersionName       (1.3.6.1.4.1.674.10892.1.400.10.1.7.1)
RESULT: Version 5.2 (Build 3790 : Service Pack 2) (x86)
systemStateACPowerCordStatusCombined    (.1.3.6.1.4.1.674.10892.1.200.10.1.36.1)
RESULT: NO RESPONSE
systemStateACPowerSwitchStatusCombined  (.1.3.6.1.4.1.674.10892.1.200.10.1.46.1)
RESULT: NO RESPONSE
systemStateAmperageStatusCombined       (.1.3.6.1.4.1.674.10892.1.200.10.1.15.1)
RESULT: NO RESPONSE
systemStateBatteryStatusCombined        (.1.3.6.1.4.1.674.10892.1.200.10.1.52.1)
RESULT: 3(ok)
systemStateChassisIntrusionStatusCombined       (.1.3.6.1.4.1.674.10892.1.200.10.1.30.1)
RESULT: 3(ok)
systemStateChassisStatus        (.1.3.6.1.4.1.674.10892.1.200.10.1.4.1)
RESULT: 3(ok)
systemStateCoolingDeviceStatusCombined  (.1.3.6.1.4.1.674.10892.1.200.10.1.21.1)
RESULT: 3(ok)
systemStateCoolingUnitStatusCombined    (.1.3.6.1.4.1.674.10892.1.200.10.1.44.1)
RESULT: 3(ok)
systemStateEventLogStatus       (.1.3.6.1.4.1.674.10892.1.200.10.1.41.1)
RESULT: 3(ok)
systemStateGlobalSystemStatus   (.1.3.6.1.4.1.674.10892.1.200.10.1.2.1)
RESULT: 3(ok)
systemStateMemoryDeviceStatusCombined   (.1.3.6.1.4.1.674.10892.1.200.10.1.27.1)
RESULT: 3(ok)
systemStatePowerSupplyStatusCombined    (.1.3.6.1.4.1.674.10892.1.200.10.1.9.1)
RESULT: 3(ok)
systemStatePowerUnitStatusCombined      (.1.3.6.1.4.1.674.10892.1.200.10.1.42.1)
RESULT: 3(ok)
systemStateProcessorDeviceStatusCombined        (.1.3.6.1.4.1.674.10892.1.200.10.1.50.1)
RESULT: 3(ok)
systemStateTemperatureStatusCombined    (.1.3.6.1.4.1.674.10892.1.200.10.1.24.1)
RESULT: 3(ok)
systemStateVoltageStatusCombined        (.1.3.6.1.4.1.674.10892.1.200.10.1.12.1)
RESULT: 3(ok)
Please email the results to Jason Ellison - infotek(at)gmail.com
To add this system to check_dell_openmanage, use something like the following:
        "pexxxx" => [
                'StorageManagementGlobalSystemStatus',
                'systemStateBatteryStatusCombined'
                'systemStateChassisIntrusionStatusCombined'
                'systemStateChassisStatus'
                'systemStateCoolingDeviceStatusCombined'
                'systemStateCoolingUnitStatusCombined'
                'systemStateEventLogStatus'
                'systemStateGlobalSystemStatus'
                'systemStateMemoryDeviceStatusCombined'
                'systemStatePowerSupplyStatusCombined',
                'systemStatePowerUnitStatusCombined',
                'systemStateProcessorDeviceStatusCombined',
                'systemStateTemperatureStatusCombined',
                'systemStateVoltageStatusCombined'
        ],
[root@pcnnagios libexec]#
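
The values in the output above are Dell status codes. The transcript only confirms that 3 means ok; the rest of the enumeration below is how I understand Dell's MIB, and the translation to Nagios exit codes is my own assumption, not necessarily the logic the plugin uses internally. Both function names are hypothetical.

```shell
# Map a Dell OpenManage status value to a Nagios plugin exit code.
# Dell values (as I understand the MIB):
#   1=other 2=unknown 3=ok 4=nonCritical 5=critical 6=nonRecoverable
# Nagios codes: 0=OK 1=WARNING 2=CRITICAL 3=UNKNOWN
dell_status_to_nagios() {
    case $1 in
        3)   echo 0 ;;
        4)   echo 1 ;;
        5|6) echo 2 ;;
        *)   echo 3 ;;
    esac
}

# Accept the "3(ok)" form printed after RESULT: in test mode as well.
# Anything that is not a known number, e.g. "NO RESPONSE", maps to UNKNOWN.
dell_result_to_nagios() {
    dell_status_to_nagios "${1%%"("*}"
}

# Example: dell_result_to_nagios '3(ok)'
```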


6. Everything looks OK, so let's put this thing to work in Nagios:

#'check_dell_openmanage' command definition
define command{
        command_name    check_dell_openmanage
        command_line    $USER1$/check_dell_openmanage.pl -v 2C -H $HOSTADDRESS$ -C public -T $ARG1$
        }
define service{
        use                     generic-service
        host_name               hostname
        service_description     Check Dell Hardware
        check_command           check_dell_openmanage!dellom_storage
        }
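
With more than one Dell box, the service block above has to be repeated per host. One way to avoid typing it by hand is to generate it; the sketch below prints one definition per host name, using only the directives already shown. dell_services is a hypothetical helper and the host names in the example are placeholders that must match hosts defined elsewhere in your Nagios configuration.

```shell
# Print one service definition per host name given as an argument.
# The directives mirror the service block above.
dell_services() {
    for host in "$@"; do
        cat <<EOF
define service{
        use                     generic-service
        host_name               $host
        service_description     Check Dell Hardware
        check_command           check_dell_openmanage!dellom_storage
        }
EOF
    done
}

# Example (placeholder host names, output file name is an assumption):
# dell_services hostA hostB > dell_services.cfg
```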


7. Check the configuration:

nagioscheck ;service nagios reload
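
The post does not say what nagioscheck is. The sketch below assumes it stands for Nagios's own configuration check (nagios -v followed by the main config file) and that the binary and config live under /usr/local/nagios, matching the libexec path used earlier. check_and_reload and the three environment variables are hypothetical names. The point of the sketch is to reload only when the check passes, which the ; in the one-liner does not guarantee.

```shell
# Verify the Nagios configuration and reload only if verification succeeds.
# NAGIOS_BIN, NAGIOS_CFG and RELOAD_CMD are optional overrides (assumed names).
check_and_reload() {
    nagios_bin=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
    nagios_cfg=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
    if "$nagios_bin" -v "$nagios_cfg" >/dev/null; then
        ${RELOAD_CMD:-service nagios reload}
    else
        echo "configuration check failed; not reloading" >&2
        return 1
    fi
}

# Example: check_and_reload
```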


8. Everything works. Here is the result:


(HARD state)
Current status:   OK
Status information:   Alarm at 5
The Net::SNMP library is available on your server
SNMP responses...
RESULT: systemStateChassisStatus
.1.3.6.1.4.1.674.10892.1.200.10.1.4.1 = 3(ok)
RESULT: systemStatePowerSupplyStatusCombined
.1.3.6.1.4.1.674.10892.1.200.10.1.9.1 = 3(ok)
RESULT: systemStateVoltageStatusCombined
.1.3.6.1.4.1.674.10892.1.200.10.1.12.1 = 3(ok)
RESULT: systemStateCoolingDeviceStatusCombined
.1.3.6.1.4.1.674.10892.1.200.10.1.21.1 = 3(ok)
RESULT: systemStateTemperatureStatusCombined
.1.3.6.1.4.1.674.10892.1.200.10.1.24.1 = 3(ok)
RESULT: systemStateMemoryDeviceStatusCombined
.1.3.6.1.4.1.674.10892.1.200.10.1.27.1 = 3(ok)
RESULT: systemStateChassisIntrusionStatusCombined
.1.3.6.1.4.1.674.10892.1.200.10.1.30.1 = 3(ok)
RESULT: systemStateEventLogStatus
.1.3.6.1.4.1.674.10892.1.200.10.1.41.1 = 3(ok)
RESULT: StorageManagementGlobalSystemStatus
.1.3.6.1.4.1.674.10893.1.20.110.13.0 = 3(ok)
Dell Status to Nagios Status mapping...
systemStateTemperatureStatusCombined: statuscode = OK
StorageManagementGlobalSystemStatus: statuscode = OK
systemStateEventLogStatus: statuscode = OK
systemStateMemoryDeviceStatusCombined: statuscode = OK
systemStatePowerSupplyStatusCombined: statuscode = OK
systemStateVoltageStatusCombined: statuscode = OK
systemStateCoolingDeviceStatusCombined: statuscode = OK
systemStateChassisIntrusionStatusCombined: statuscode = OK
systemStateChassisStatus: statuscode = OK
OK:
EXIT CODE: 0 STATUS CODE: OK
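Performance data:
Current attempt:   1/3
Last check time: 2013-06-22 23:04:23
Check type:   ACTIVE
Check latency / duration:    0.770 / 0.195 seconds
Next scheduled check:     2013-06-22 23:06:23
Last state change:   2013-06-22 22:56:22
Last notification: N/A (notification 0)
Is this service flapping?   NO (0.00% state change)
In scheduled downtime?   NO
Last update:   2013-06-22 23:04:51
Active checks:   ENABLED
Passive checks:   ENABLED
Obsessing:   ENABLED
Notifications:   ENABLED
Event handler:   ENABLED
Flap detection:   ENABLED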
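
The verbose status text is long, so when something does go wrong it helps to pull out just the offending attributes. A small sketch that reads the plugin output on standard input; non_ok_attributes is a hypothetical name, and since the post only shows OK results, the function simply treats any statuscode other than OK as a problem rather than guessing the exact strings.

```shell
# Read plugin output on stdin and print the names of attributes whose
# "statuscode" is anything other than OK. Prints nothing when all are OK.
non_ok_attributes() {
    awk -F': statuscode = ' 'NF == 2 && $2 != "OK" { print $1 }'
}

# Example (host and type are placeholders):
# ./check_dell_openmanage.pl -v -H 10.40.1.131 -C public -T dellom_storage | non_ok_attributes
```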


That's a wrap. Time to wash up and go to bed!!
