Oracle 12.2 RAC permission problem: Linux-x86_64 Error: 13: Permission denied

1. Background of the failure

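For reasons unknown, someone on site ran chmod 777 /oracle on node rac2, which changed the permissions of everything owned by oracle and grid. After noticing that the database was misbehaving, they restarted it and found that it would no longer start. According to the on-site description, only the permissions on rac2 were changed; rac1 was not touched.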

2. Approaches

There are two ways to fix this:

2.1. Use the official Oracle method

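Once the cluster software has been installed correctly, two files are generated under $GRID_HOME/crs/utl: crsconfig_dirs and crsconfig_fileperms. They record the permissions of the core files and directories, which makes recovery straightforward.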

Run as root:

for 11.2

#cd <GRID_HOME>/crs/install/

#./rootcrs.pl -init

for 12c and later

#cd <GRID_HOME>/crs/install/

#./rootcrs.sh -init
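
The version split above could be captured in a small helper. This is only a sketch: the function name rootcrs_script_for is hypothetical, it assumes the version is given as a dotted number such as 11.2 or 12.2, and it only prints the script name without running anything. Versions older than 11 are rejected because the text above does not cover them.

```shell
# Hypothetical helper: print the init script name for a Grid Infrastructure
# version, following the "11.2 vs. 12c and later" split described above.
rootcrs_script_for() {
    local major=${1%%.*}          # "12.2" -> "12"
    case "$major" in
        ''|*[!0-9]*)
            echo "unrecognized version: $1" >&2
            return 1
            ;;
    esac
    if [ "$major" -ge 12 ]; then
        echo "rootcrs.sh"
    elif [ "$major" -eq 11 ]; then
        echo "rootcrs.pl"
    else
        echo "version not covered: $1" >&2
        return 1
    fi
}
```

For example, rootcrs_script_for 12.2 prints rootcrs.sh, which would then be run with -init from <GRID_HOME>/crs/install/ as root.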

2.2. Use the operating system ACL commands getfacl and setfacl

Capture the ownership and permissions of the /oracle directory from the healthy rac1 node, then copy them to rac2 and restore from them.

This is the method used in this case. Below, we reproduce the scenario on a test RAC environment and walk through the recovery.

3. Test recovery steps

3.1. First, back up the directory permissions on rac2
root@rac2[/soft/backup]#getfacl -pR /u01/app >/soft/backup/backup.txt
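
Before relying on a dump like this, it may be worth checking that it looks complete. The following is a minimal sketch with hypothetical function names. It assumes the usual getfacl layout in which every entry starts with a "# file: " header line, and it assumes that getfacl -R skips symbolic links found inside the tree, so the two counts should be close rather than guaranteed equal.

```shell
# Count the entries recorded in a getfacl dump (one "# file: " header each).
acl_dump_entries() {
    grep -c '^# file: ' "$1"
}

# Count files and directories under a tree, leaving out symbolic links
# (assumption: this approximates what a recursive getfacl walk records).
tree_entries() {
    find "$1" ! -type l | wc -l | tr -d ' '
}

# Possible use on rac2:
#   acl_dump_entries /soft/backup/backup.txt
#   tree_entries /u01/app
```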

3.2. Change the permissions on rac2 to 777
root@rac2[/soft/backup]#chmod -R 777 /u01/app

3.3. Stop the database instance on rac2, then start it again to see whether it comes up
grid@rac2[/home/grid]$srvctl stop instance -d test -n rac2
grid@rac2[/home/grid]$crsctl stat res -t
ora.test.db
      1        ONLINE  ONLINE       rac1                     Open,HOME=/u01/app/o
                                                             racle/product/12.1.0
                                                             ,STABLE
      2        OFFLINE OFFLINE                               Instance Shutdown,ST
                                                             ABLE
3.4. Start it again; as expected, it fails
grid@rac2[/home/grid]$srvctl start instance -d test -n rac2
PRCR-1013 : Failed to start resource ora.test.db
PRCR-1064 : Failed to start resource ora.test.db on node rac2
CRS-5017: The resource action "ora.test.db start" encountered the following error: 
ORA-00205: error in identifying control file, check alert log for more info
. For details refer to "(:CLSN00107:)" in "/u01/app/grid/grid/diag/crs/rac2/crs/trace/crsd_oraagent_oraclerac.trc".

CRS-2674: Start of 'ora.test.db' on 'rac2' failed
grid@rac2[/home/grid]$

3.5. Check the alert log
grid@rac2[/home/grid]$locate alert_test2.log
/u01/app/oracle/diag/rdbms/test/test2/trace/alert_test2.log
2019-12-19T16:30:30.255541+08:00
Error attempting to elevate LMS1's priority: no further priority changes will be attempted for this process
.....
Decreasing number of high priority LMS from 2 to 0
2019-12-19T16:33:31.842137+08:00
WARNING: failed to register ASMB0 with ASM instance
2019-12-19T16:33:31.842550+08:00
Errors in file /u01/app/oracle/diag/rdbms/test/test2/trace/test2_asmb_25121.trc:
ORA-01034: ORACLE not available
ORA-27121: unable to determine size of shared memory segment
Linux-x86_64 Error: 13: Permission denied
Additional information: 6761
Additional information: 5996550
Stopping background process RBAL
2019-12-19T16:33:32.844553+08:00
WARNING: ASMB0 exiting with error
2019-12-19T16:33:34.845548+08:00
Starting background process ASMB
2019-12-19T16:33:34.862397+08:00
ASMB started with pid=44, OS id=29615
2019-12-19T16:36:35.670841+08:00
WARNING: failed to register ASMB0 with ASM instance
WARNING: ASMB0 exiting with error
2019-12-19T16:36:35.700823+08:00
Starting background process ASMB
2019-12-19T16:36:35.717018+08:00
ASMB started with pid=44, OS id=14051
2019-12-19T16:39:36.844672+08:00
WARNING: failed to register ASMB0 with ASM instance
WARNING: ASMB0 exiting with error
2019-12-19T16:39:36.848684+08:00
ORA-00210: cannot open the specified control file
ORA-00202: control file: '+DATA/test/control02.ctl'
ORA-17503: ksfdopn:2 Failed to open file +DATA/test/control02.ctl
ORA-15001: diskgroup "DATA" does not exist or is not mounted
ORA-01034: ORACLE not available
ORA-27121: unable to determine size of shared memory segment
Linux-x86_64 Error: 13: Permission denied
Additional information: 6761
Additional information: 5996550
ORA-00210: cannot open the specified control file
ORA-00202: control file: '+DATA/test/control01.ctl'
ORA-17503: ksfdopn:2 Failed to open file +DATA/test/control01.ctl
ORA-15001: diskgroup "DATA" does not exist or is not mounted

ORA-01034: ORACLE not available
ORA-27121: unable to determine size of shared memory segment
Linux-x86_64 Error: 13: Permission denied
Additional information: 6761
Additional information: 5996550
2019-12-19T16:39:36.850963+08:00
ORA-205 signalled during: ALTER DATABASE MOUNT /* db agent *//* {2:18183:4569} */...
2019-12-19T16:39:39.036289+08:00
License high water mark = 2
2019-12-19T16:39:39.036741+08:00
USER (ospid: 18565): terminating the instance
2019-12-19T16:39:40.040061+08:00
Instance terminated by USER, pid = 18565
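The log is full of errors like these, yet if you check, the disk group is mounted and the control files exist. The telling line is Linux-x86_64 Error: 13: Permission denied, which usually points at the permissions of <ORACLE_HOME>/bin/oracle. That is clearly the case here, because we changed the permissions of every file under /u01/app, and chmod 777 also strips the setuid/setgid bits that the oracle binary normally carries.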
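
The effect on a setuid/setgid binary can be reproduced safely on a throwaway file. This is a minimal sketch assuming GNU coreutils (mktemp and stat -c); mode 6751 corresponds to the -rwsr-s--x string that a healthy oracle binary shows in step 3.10, and the variable names are arbitrary.

```shell
# Reproduce on a scratch file what chmod 777 does to a setuid/setgid binary.
demo=$(mktemp)
chmod 6751 "$demo"                 # -rwsr-s--x, like a healthy oracle binary
before=$(stat -c '%a' "$demo")
chmod 777 "$demo"                  # a numeric mode clears setuid/setgid on a regular file
after=$(stat -c '%a' "$demo")
echo "before=$before after=$after" # prints: before=6751 after=777
rm -f "$demo"
```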

3.6. Capture the permissions from rac1
Only /u01/app on rac2 was changed; the permissions on rac1 are intact, so we use rac1's permissions to restore rac2.
root@rac1[/root]#getfacl -pR /u01/app >/soft/backup.txt

3.7. Copy the file to rac2
root@rac1[/root]#getfacl -pR /u01/app >/soft/backup.txt
root@rac1[/root]#scp /soft/backup.txt rac2:/soft/backup_rac1.txt
backup.txt                                                                                      100%   24MB  77.3MB/s   00:00    
root@rac1[/root]#

3.8. Replace the host name, database instance name, and ASM instance name in /soft/backup_rac1.txt
sed -i 's/rac1/rac2/g' /soft/backup_rac1.txt
sed -i 's/test1_/test2_/g' /soft/backup_rac1.txt
sed -i 's/ASM1/ASM2/g' /soft/backup_rac1.txt
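Note that the instance name of every database has to be replaced. We only have one database, test, so only its instance name needs changing. The list of databases can be obtained with the following command: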
grid@rac1[/home/grid]$srvctl config database
joyce
test
The command returns two databases. joyce is a RAC One Node database that exists only on rac1 and not on rac2, so nothing needs to be changed for it.
root@rac2[/root]#sed -i 's/rac1/rac2/g' /soft/backup_rac1.txt
root@rac2[/root]#sed -i 's/test1_/test2_/g' /soft/backup_rac1.txt
root@rac2[/root]#sed -i 's/ASM1/ASM2/g' /soft/backup_rac1.txt
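
As a more cautious variant of the three substitutions above, the rewrite can be limited to the "# file:" header lines, so that owner, group, and permission lines in the dump stay exactly as captured. This is a sketch, not the procedure used in this test: the function name rewrite_acl_dump is hypothetical, and unlike sed -i it writes to a new file so the original dump is kept.

```shell
# Hypothetical helper: apply the node-specific renames only to "# file:" lines
# of a getfacl dump and write the result to a separate file.
rewrite_acl_dump() {
    local src=$1 dst=$2
    sed -e '/^# file: /{' \
        -e 's/rac1/rac2/g' \
        -e 's/test1_/test2_/g' \
        -e 's/ASM1/ASM2/g' \
        -e '}' "$src" > "$dst"
}

# Possible use on rac2 (the output file name is only an example):
#   rewrite_acl_dump /soft/backup_rac1.txt /soft/backup_rac1_for_rac2.txt
```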

3.9. Restore rac2 from the modified permissions file
setfacl --restore=/soft/backup_rac1.txt
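When this command reaches a file that does not exist, it reports an error, skips that entry, and carries on with the remaining files instead of aborting, so errors about missing files are nothing to worry about.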
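
To see in advance which entries will be skipped, the paths named in the dump can be checked against the local file system. This is a minimal sketch with a hypothetical function name. It assumes plain path names: getfacl may escape unusual characters in "# file:" lines, and this sketch does not decode them.

```shell
# Print every path named in a getfacl dump that does not exist on this host.
missing_acl_paths() {
    sed -n 's/^# file: //p' "$1" | while IFS= read -r path; do
        [ -e "$path" ] || [ -L "$path" ] || printf '%s\n' "$path"
    done
}

# Possible use on rac2 before running setfacl --restore:
#   missing_acl_paths /soft/backup_rac1.txt | head
```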

3.10. Check the permissions of key files
oraclerac@rac2[/home/oraclerac]$ll $ORACLE_HOME/bin/oracle
-rwsr-s--x 1 oraclerac oinstall 408857200 Dec 19 12:38 /u01/app/oracle/product/12.1.0/bin/oracle
oraclerac@rac2[/home/oraclerac]$
The permissions of the oracle binary have been restored.
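
The same check can be scripted rather than read off the ll output. This is a small sketch with a hypothetical function name; it only tests that the file is executable and has both the setuid and setgid bits, which matches the -rwsr-s--x shown above, and it does not check the owner or group.

```shell
# Succeeds only if the file is executable and has both setuid and setgid set.
has_suid_sgid() {
    [ -x "$1" ] && [ -u "$1" ] && [ -g "$1" ]
}

# Possible use as the database software owner:
#   has_suid_sgid "$ORACLE_HOME/bin/oracle" && echo "setuid/setgid present"
```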

3.11. Try starting the test instance again
grid@rac2[/home/grid]$srvctl start instance -d test -n rac2
grid@rac2[/home/grid]$crsctl stat res -t
ora.test.db
      1        ONLINE  ONLINE       rac1                     Open,HOME=/u01/app/o
                                                             racle/product/12.1.0
                                                             ,STABLE
      2        ONLINE  ONLINE       rac2                     Open,HOME=/u01/app/o
                                                             racle/product/12.1.0
                                                             ,STABLE

--------------------------------------------------------------------------------
The instance now starts.

3.12. Next, test the official Oracle method
Stop the instance on rac2:
grid@rac2[/home/grid]$srvctl stop instance -d test -n rac2

3.13. Change the permissions of /u01/app on rac2
root@rac2[/root]#chmod -R 777 /u01/app

3.14. Verify that the permissions have changed
grid@rac2[/home/grid]$ll $ORACLE_HOME/bin/oracle
-rwxrwxrwx 1 grid oinstall 373556656 Nov 13 22:21 /u01/app/grid/product/bin/oracle

3.15. At this point the instance still fails to start.

3.16. Go to the <GRID_HOME>/crs/install directory and run the script
cd <GRID_HOME>/crs/install/
root@rac2[/u01/app/grid/product/crs]#cd /u01/app/grid/product/crs/install
root@rac2[/u01/app/grid/product/crs/install]#./rootcrs.sh -init
/u01/app/grid/product/perl/bin/perl: symbol lookup error: /root/perl5/lib/perl5/auto/XML/Parser/Expat/Expat.so: undefined symbol: Perl_xs_apiversion_bootcheck
The command '/u01/app/grid/product/perl/bin/perl -I/u01/app/grid/product/perl/lib -I/u01/app/grid/product/crs/install /u01/app/grid/product/crs/install/rootcrs.pl -init' execution failed
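Unfortunately, it fails. The error is a Perl version problem: judging by the message, the perl bundled in the Grid home loaded an XS module (Expat.so) from /root/perl5 that was built for a different Perl. Sorting this out is fairly troublesome, so the getfacl/setfacl approach from section 2.2 is the recommended one.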
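
The library path in the error, /root/perl5/lib/perl5, is where a per-user Perl module directory for root typically lives. One thing that may be worth checking is whether root's environment (for example PERL5LIB) makes the Grid home's perl pick up those modules. This was not verified in this test, so the sketch below rests on that assumption. The helper name is hypothetical, and it relies on env -u from GNU coreutils to run a command with the Perl library variables removed.

```shell
# Hypothetical helper: run a command with user-level Perl library settings
# removed from the environment, so a bundled perl only sees its own modules.
run_without_user_perl_libs() {
    env -u PERL5LIB -u PERL_LOCAL_LIB_ROOT "$@"
}

# Possible use as root on the affected node (untested assumption):
#   cd <GRID_HOME>/crs/install/
#   run_without_user_perl_libs ./rootcrs.sh -init
```

If PERL5LIB is not set in root's environment, this changes nothing and the cause lies elsewhere.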
