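Problem: The Nagios monitoring page showed that information collection from OMSA was failing. After logging in to the affected system, I found the following message in the system log: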
Sep 26 11:10:48 localhost Server Administrator (Shared Library): Data Engine EventID: 0 A semaphore set has to be created but the system limit for the maximum number of semaphore sets has been exceeded
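In short: the system limit on the number of semaphore sets has been reached, so the Data Engine could not create the semaphore set it needs and failed to start.
Fixing this requires changing the kernel settings for semaphore sets, as follows:
1. Check the current semaphore settings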
- $ ipcs -l
- ------ Shared Memory Limits --------
- max number of segments = 4096
- max seg size (kbytes) = 67108864
- max total shared memory (kbytes) = 17179869184
- min seg size (bytes) = 1
- ------ Semaphore Limits --------
- max number of arrays = 128
- max semaphores per array = 250
- max semaphores system wide = 32000
- max ops per semop call = 32
- semaphore max value = 32767
- ------ Messages: Limits --------
- max queues system wide = 16
- max size of message (bytes) = 65536
- default max size of queue (bytes) = 65536
- $ sysctl -a | grep shm
- vm.hugetlb_shm_group = 0
- kernel.shmmni = 4096
- kernel.shmall = 4294967296
- kernel.shmmax = 68719476736
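Before changing any limits, it can help to confirm that the semaphore array limit really is exhausted. The following is a minimal sketch, not part of the original procedure: the helper name count_sem_arrays is hypothetical, and it assumes the Linux `ipcs -s` layout in which every data row starts with a hexadecimal key (0x...).

```shell
# Count the semaphore arrays listed in `ipcs -s` output read from stdin.
# Assumption: each data row begins with a hex key such as 0x00000000.
count_sem_arrays() {
    # grep -c prints the count; "|| true" keeps the exit status at 0 when the count is 0.
    grep -c '^0x' || true
}

# Usage on a live system (the fourth field of /proc/sys/kernel/sem is the array limit):
#   used=$(ipcs -s | count_sem_arrays)
#   limit=$(awk '{print $4}' /proc/sys/kernel/sem)
#   echo "semaphore arrays in use: $used of $limit"
```

If the number in use is at or near the limit, raising "max number of arrays" is the relevant fix.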
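Findings:
max queues system wide = 16 is too small, so I am raising the maximum number of message queues (MSGMNI) to 16384. I am also raising the maximum number of semaphore arrays (SEMMNI, currently 128), which is the limit the log message refers to, to 1024.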
- sysctl -w kernel.msgmni=16384
- sysctl -w kernel.sem="250 32000 100 1024"
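The four numbers in kernel.sem are positional: SEMMSL, SEMMNS, SEMOPM, SEMMNI, in that order. The value used above therefore also raises the per-call operation limit (SEMOPM) from 32 to 100, in addition to raising the array limit to 1024. A small sketch for labelling the fields follows; the function name describe_sem is made up for illustration.

```shell
# Label the four fields of a kernel.sem value (order: SEMMSL SEMMNS SEMOPM SEMMNI).
describe_sem() {
    # Deliberately unquoted: split the single argument into four positional parameters.
    set -- $1
    echo "SEMMSL (max semaphores per array)   = $1"
    echo "SEMMNS (max semaphores system wide) = $2"
    echo "SEMOPM (max ops per semop call)     = $3"
    echo "SEMMNI (max number of arrays)       = $4"
}

# Example with the value set above; on a live system, pass "$(cat /proc/sys/kernel/sem)" instead.
describe_sem "250 32000 100 1024"
```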
Write these settings to the configuration file so that they persist across reboots:
- echo "kernel.msgmni=16384" >> /etc/sysctl.conf
- echo "kernel.sem=250 32000 100 1024" >> /etc/sysctl.conf
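Appending with echo adds another line every time the command is run, so repeated runs leave duplicate or conflicting entries in the file. One possible alternative is a helper that replaces any existing assignment for the key before appending the new one. This is only a sketch: set_sysctl_entry is a hypothetical name, and it assumes the file uses plain "key = value" lines.

```shell
# Set "key = value" in a sysctl-style config file, dropping any earlier
# assignment of the same key so the entry appears exactly once.
set_sysctl_entry() {
    key=$1 value=$2 file=$3
    touch "$file" || return 1
    tmp=$(mktemp) || return 1
    # Keep every line whose name part (text before "=") is not exactly the key.
    awk -v k="$key" '{
        line = $0
        sub(/^[ \t]+/, "", line)      # ignore leading whitespace
        split(line, parts, "=")
        name = parts[1]
        sub(/[ \t]+$/, "", name)      # trim whitespace before "="
        if (name != k) print $0
    }' "$file" > "$tmp"
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    # Write back through the existing file so its owner and mode are preserved.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Usage (as root), followed by `sysctl -p` to load the file without rebooting:
#   set_sysctl_entry kernel.msgmni 16384 /etc/sysctl.conf
#   set_sysctl_entry kernel.sem "250 32000 100 1024" /etc/sysctl.conf
#   sysctl -p
```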
Check the settings again:
- $ ipcs -l
- ------ Shared Memory Limits --------
- max number of segments = 4096
- max seg size (kbytes) = 67108864
- max total shared memory (kbytes) = 17179869184
- min seg size (bytes) = 1
- ------ Semaphore Limits --------
- max number of arrays = 1024
- max semaphores per array = 250
- max semaphores system wide = 32000
- max ops per semop call = 100
- semaphore max value = 32767
- ------ Messages: Limits --------
- max queues system wide = 16384
- max size of message (bytes) = 65536
- default max size of queue (bytes) = 65536
The adjustment is now complete, and OMSA usually returns to normal. If it does not, restart the Data Engine:
- /etc/init.d/dataeng restart