CloudStack 4.0.2: a vRouter leaves the host in an abnormal state after a restart

Reposted from: http://www.linuxidc.com/Linux/2013-08/88474.htm

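I have been experimenting with CloudStack + KVM lately and found that after restarting the CloudStack service, the KVM host always ends up in the Alert state. The management-server log shows the following error: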

ERROR [agent.manager.AgentManagerImpl] (AgentManager-Handler-7:) Monitor ClusteredVirtualMachineManagerImpl$$EnhancerByCGLIB$$121cf44e says there is an error in the connect process for 1 due to null
java.lang.NullPointerException
        at com.cloud.vm.VirtualMachineManagerImpl.fullHostSync(VirtualMachineManagerImpl.java:1643)
        at com.cloud.vm.VirtualMachineManagerImpl.processConnect(VirtualMachineManagerImpl.java:2289)
        at com.cloud.agent.manager.AgentManagerImpl.notifyMonitorsOfConnection(AgentManagerImpl.java:605)
        at com.cloud.agent.manager.AgentManagerImpl.handleConnectedAgent(AgentManagerImpl.java:1157)
        at com.cloud.agent.manager.AgentManagerImpl.access$100(AgentManagerImpl.java:142)
        at com.cloud.agent.manager.AgentManagerImpl$AgentHandler.processRequest(AgentManagerImpl.java:1235)
        at com.cloud.agent.manager.AgentManagerImpl$AgentHandler.doTask(AgentManagerImpl.java:1374)
        at com.cloud.agent.manager.ClusteredAgentManagerImpl$ClusteredAgentHandler.doTask(ClusteredAgentManagerImpl.java:618)
        at com.cloud.utils.nio.Task.run(Task.java:83)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:679)

The agent log shows:

2013-08-09 11:27:18,746 INFO  [cloud.agent.Agent] (Agent-Handler-3:null) Reconnecting...
2013-08-09 11:27:18,747 INFO  [utils.nio.NioClient] (Agent-Selector:null) Connecting to 20.1.134.190:8250
2013-08-09 11:27:18,855 INFO  [utils.nio.NioClient] (Agent-Selector:null) SSL: Handshake done
2013-08-09 11:27:19,422 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Proccess agent startup answer, agent id = 1
2013-08-09 11:27:19,422 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Set agent id 1
2013-08-09 11:27:19,423 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Startup Response Received: agent id = 1
2013-08-09 11:27:19,539 WARN  [cloud.agent.Agent] (UgentTask-5:null) Unable to send request: null
2013-08-09 11:27:23,856 INFO  [cloud.agent.Agent] (Agent-Handler-3:null) Connected to the server
2013-08-09 11:27:24,481 INFO  [cloud.agent.Agent] (Agent-Handler-3:null) Lost connection to the server. Dealing with the remaining commands...
2013-08-09 11:27:29,483 INFO  [cloud.agent.Agent] (Agent-Handler-3:null) Reconnecting...
2013-08-09 11:27:29,484 INFO  [utils.nio.NioClient] (Agent-Selector:null) Connecting to 20.1.134.190:8250
2013-08-09 11:27:29,580 INFO  [utils.nio.NioClient] (Agent-Selector:null) SSL: Handshake done
2013-08-09 11:27:30,223 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Proccess agent startup answer, agent id = 1
2013-08-09 11:27:30,224 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Set agent id 1
2013-08-09 11:27:30,225 INFO  [cloud.agent.Agent] (Agent-Handler-2:null) Startup Response Received: agent id = 1
2013-08-09 11:27:30,350 WARN  [cloud.agent.Agent] (UgentTask-5:null) Unable to send request: null
2013-08-09 11:27:34,581 INFO  [cloud.agent.Agent] (Agent-Handler-3:null) Connected to the server
2013-08-09 11:27:35,310 INFO  [cloud.agent.Agent] (Agent-Handler-3:null) Lost connection to the server. Dealing with the remaining commands...

Restarting the agent and libvirtd services did not clear the error, and rebooting the host made no difference either.

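The logs show that the exception is thrown when management-server, having connected to cloud-agent, refreshes the VM states. At that point every VM was in the Stopped state except the vRouter, which was still marked Running, and that turned out to be the source of the problem. For some unknown reason, virsh list on the host does not show the vRouter, yet management-server believes it is Running and tries to sync its state. That lookup finds no vRouter, so the exception is thrown. This looks like a bug that needs to be fixed.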
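
The mismatch described above can be checked mechanically. The following is a minimal sketch, not part of the original troubleshooting: it assumes you have exported (instance name, state) pairs from the vm_instance table and captured the output of `virsh list --name` on the host. The function names and the sample data are hypothetical.

```python
def parse_virsh_names(virsh_output):
    """Parse the text printed by `virsh list --name` (one domain name per line)."""
    return {line.strip() for line in virsh_output.splitlines() if line.strip()}


def find_stale_running(db_rows, libvirt_domains):
    """Return names the database marks Running but libvirt does not list."""
    return sorted(
        name
        for name, state in db_rows
        if state == "Running" and name not in libvirt_domains
    )


# Hypothetical sample data mirroring the situation in this post:
# the database still marks the router Running, while virsh lists no domains.
sample_db_rows = [
    ("r-4-VM", "Running"),
    ("i-2-3-VM", "Stopped"),
    ("v-1-VM", "Stopped"),
]
sample_virsh_output = "\n"

print(find_stale_running(sample_db_rows, parse_virsh_names(sample_virsh_output)))
```

Any name this reports is a VM that management-server will try to sync on connect even though the hypervisor no longer knows about it.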

Solution: delete the vRouter. Before it can be deleted, its state must first be set to Stopped in the database by running this SQL: update vm_instance set state = 'Stopped' where vm_type = 'DomainRouter';
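
Note that the statement above touches every DomainRouter row, whatever its current state. A more cautious variant, shown here only as a sketch and not as the author's statement, restricts the update to rows that are currently Running. The example runs against an in-memory SQLite table that is a hypothetical, trimmed-down stand-in for vm_instance; the sample rows are invented, and on a real CloudStack database you would run the SQL in MySQL after taking a backup.

```python
import sqlite3

# Assumption: narrowing the WHERE clause to Running rows is an illustrative
# variation on the statement from the post, not part of the original fix.
NARROWED_UPDATE = (
    "UPDATE vm_instance SET state = 'Stopped' "
    "WHERE vm_type = 'DomainRouter' AND state = 'Running'"
)


def build_demo_db():
    """Create a hypothetical stand-in for vm_instance with three sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vm_instance ("
        "id INTEGER PRIMARY KEY, name TEXT, vm_type TEXT, state TEXT)"
    )
    conn.executemany(
        "INSERT INTO vm_instance (name, vm_type, state) VALUES (?, ?, ?)",
        [
            ("r-4-VM", "DomainRouter", "Running"),
            ("r-2-VM", "DomainRouter", "Destroyed"),
            ("i-2-3-VM", "User", "Stopped"),
        ],
    )
    conn.commit()
    return conn


def stop_running_routers(conn):
    """Apply the narrowed update and return the number of rows changed."""
    cur = conn.execute(NARROWED_UPDATE)
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    demo = build_demo_db()
    print(stop_running_routers(demo))
```

With the sample rows, only the router marked Running is changed; the Destroyed router and the user VM are left alone.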

