Reproducing the problem:
/opt/cloudera/cm/schema/scm_prepare_database.sh -hhadoop01 --scm-host hadoop01 mysql scm scm 123456
After the script runs, the scm database contains no tables and no data.
Root-cause analysis
The first run reported an access error for the scm user, so I figured that granting scm access from hadoop01.xx.com would be enough.
Then I ran it again:
/opt/cloudera/cm/schema/scm_prepare_database.sh -hhadoop01 --scm-host hadoop01 mysql scm scm 123456
This time it completed without reporting any error, yet the scm database was still completely empty: not a single table had been initialized.
Ruled-out cause 1: MySQL connection or version problems
This was puzzling: the script syntax was correct, and if the database were unreachable the script should have complained, but it did not. I checked anyway. mysql -uroot -p123456 logged me into MySQL without trouble, so connectivity was fine. As for the version, MySQL 5.6 is listed as supported on the official site, and 5.6.46 was the most stable 5.6 release available from mysql.com when I downloaded it. Conclusion: this cause was ruled out.
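The connectivity check can be made more targeted than a bare login. A sketch using the same credentials as the command above (root/123456 and scm/123456 come from this walkthrough, not from any default); these need a live MySQL server, so they are illustrative only:

```shell
# Is the server reachable, and which version is it?
mysql -uroot -p123456 -e "SELECT VERSION();"

# Can the scm user log in via the same host name the script will use,
# and does the scm database exist at all?
mysql -uscm -p123456 -h hadoop01 -e "SHOW DATABASES LIKE 'scm';"

# If scm_prepare_database.sh really did its job, this lists the Cloudera
# Manager tables; an empty result is exactly the symptom described above.
mysql -uscm -p123456 -e "SHOW TABLES FROM scm;"
```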
Ruled-out cause 2: checking the official documentation
I kept turning over how the script could finish without an exception yet leave the database empty. Experience with this kind of problem says: go read the official docs. And I have to say, Cloudera's documentation is very well written:
https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/prepare_cm_database.html
My parameters matched what the documentation describes; checked and confirmed, nothing wrong there.
Possible cause 3: script logs and Linux system-level logs
By now I was a bit stuck: everything checked out, so what on earth was the cause?
The one thought I kept coming back to was that the script's error output was too sparse, and I could find no option to make it more verbose. Then it occurred to me that the system is CentOS 7, so the systemd journal was available. I dumped it:
[root@hadoop01 ~]# journalctl -xe
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Withdrawing address record for 192.168.122.1 on virbr0.
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Withdrawing workstation service for virbr0.
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Withdrawing address record for fe80::44a6:91bd:cc0a:c335 on ens192.
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Withdrawing address record for fe80::3c7e:ac7c:5452:edc6 on ens192.
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Host name conflict, retrying with hadoop01-8040
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Registering new address record for 192.168.122.1 on virbr0.IPv4.
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Registering new address record for fe80::44a6:91bd:cc0a:c335 on ens192.*.
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Registering new address record for fe80::3c7e:ac7c:5452:edc6 on ens192.*.
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Registering new address record for fe80::cc18:552d:c34f:830b on ens192.*.
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Registering new address record for 192.168.12.101 on ens192.IPv4.
Oct 28 09:19:23 hadoop01 avahi-daemon[11856]: Registering HINFO record with values 'X86_64'/'LINUX'.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing address record for 192.168.122.1 on virbr0.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing address record for fe80::44a6:91bd:cc0a:c335 on ens192.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing address record for fe80::3c7e:ac7c:5452:edc6 on ens192.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Registering new address record for 192.168.122.1 on virbr0.IPv4.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Registering new address record for fe80::44a6:91bd:cc0a:c335 on ens192.*.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Registering new address record for fe80::cc18:552d:c34f:830b on ens192.*.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Server startup complete. Host name is hadoop01-8040.local. Local service cookie is 1979383840
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing address record for fe80::cc18:552d:c34f:830b on ens192.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing address record for 192.168.12.101 on ens192.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing workstation service for ens192.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing workstation service for lo.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing workstation service for virbr0-nic.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing address record for 192.168.122.1 on virbr0.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing workstation service for virbr0.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Withdrawing address record for fe80::44a6:91bd:cc0a:c335 on ens192.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Host name conflict, retrying with hadoop01-8041
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Registering new address record for 192.168.122.1 on virbr0.IPv4.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Registering new address record for fe80::44a6:91bd:cc0a:c335 on ens192.*.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Registering new address record for fe80::3c7e:ac7c:5452:edc6 on ens192.*.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Registering new address record for fe80::cc18:552d:c34f:830b on ens192.*.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Registering new address record for 192.168.12.101 on ens192.IPv4.
Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Registering HINFO record with values 'X86_64'/'LINUX'.
Oct 28 09:20:01 hadoop01 systemd[1]: Started Session 482 of user root.
-- Subject: Unit session-482.scope has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit session-482.scope has finished starting up.
--
-- The start-up result is done.
Oct 28 09:20:01 hadoop01 CROND[56675]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Oct 28 09:20:16 hadoop01 CommAmqpListene[56536]: [CCafException] AmqpComm@[56536]: CommAmqpListener: [CCafException] AmqpCommon::validateSt
lines 987-1029/1029 (END)
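Scrolling through the full journalctl -xe output is tedious; the conflict can be filtered out directly. On the live system that would be something like journalctl -u avahi-daemon --no-pager | grep -i "host name conflict" (assuming the stock avahi-daemon unit name on CentOS 7); the grep step is demonstrated here on a captured snippet of the journal above:

```shell
# Pipe journal lines through a case-insensitive grep for the conflict message.
# Shown on two sample lines taken from the output above:
printf '%s\n' \
  "Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Registering HINFO record with values 'X86_64'/'LINUX'." \
  "Oct 28 09:19:43 hadoop01 avahi-daemon[11856]: Host name conflict, retrying with hadoop01-8041" \
  | grep -i "host name conflict"
# prints only the "Host name conflict" line
```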
Suddenly I was delighted, because this line says it plainly:
Host name conflict, retrying with hadoop01-8040
A host-name conflict! I had done the initial setup by host name, so as a first test I switched to the IP address.
At the same time I handed the problem to the ops engineer (who is not exactly open about sharing information; asking him directly got me nowhere, but running history to review the command log suggested he had never run the relevant setup commands, so my guess is that when he virtualized the host, the VM's IP configuration and host-name mapping were done wrong).
Sure enough, after rerunning with the IP, the scm database finally had tables and data. I had to laugh: it had been a host-environment problem all along.
So I went back and rechecked the whole environment.
Once ops resolved the host-name conflict, I reran the script with the host name, and this time everything worked.
Summary:
Next time, before installing a cluster, verify the host environment thoroughly first. Any lingering problem there means twice the effort for half the result.
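That lesson can be turned into a small pre-flight check. A minimal sketch: the check_mapping helper is hypothetical, not part of any Cloudera tooling, and hadoop01/192.168.12.101 are taken from the journal output above. It is demonstrated against a throwaway file rather than the real /etc/hosts:

```shell
# Hypothetical pre-flight helper: verify that a host name and IP are mapped
# together in a hosts file before passing the name to scm_prepare_database.sh.
check_mapping() {
    host="$1"; ip="$2"; hosts_file="$3"
    if grep -qE "^[^#]*${ip}[[:space:]].*\b${host}\b" "$hosts_file"; then
        echo "OK: $host -> $ip"
    else
        echo "MISSING: add '$ip $host' to $hosts_file"
    fi
}

# Demonstrated on a throwaway file instead of the real /etc/hosts:
tmp=$(mktemp)
echo "192.168.12.101 hadoop01" > "$tmp"
check_mapping hadoop01 192.168.12.101 "$tmp"   # prints "OK: hadoop01 -> 192.168.12.101"
check_mapping hadoop02 192.168.12.102 "$tmp"   # prints a MISSING line
rm -f "$tmp"
```

Running the same check against /etc/hosts on every cluster node before the install would have caught this mismatch up front.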
Doing technical work calls for an open mindset: share what you know and learn from one another.