Recovering Oracle 11g RAC After All ASM Disks Are Lost

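Fault description

(1) A storage failure caused the ASM disks to be lost.

(2) Because the OCR and the voting disk were lost, all Clusterware services have stopped; only OHAS is still online.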


Procedure

I. Restore the OCR and the voting disk

(1) Stop CRS on all RAC nodes.

[root@node1 ~]# crsctl stop has -f

[root@node2 ~]# crsctl stop has -f

(2) On one node, start CRS in exclusive mode with the -nocrs option. This also starts the ASM instance.

[root@node1 ~]# crsctl start crs -excl -nocrs

(3) The ASM disks have already been created; check their status.

[grid@node1 ~]$ sqlplus / as sysasm

SQL> select group_number group#, disk_number disk#, OS_MB, state, path, header_status from v$asm_disk order by 1,2;

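(4) Create three disk groups. OCR_VOTE is for CRS and holds the OCR, the voting disk and the ASM instance's SPFILE. The other two are for the Oracle database: ASM_DATA holds the datafiles, control files, redo logs and spfile; ASM_FRA holds the archived logs. Note that most of the later transcripts refer to disk groups named SYSTEMDG, DATADG and ARCLOGDG instead.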

SQL> create diskgroup OCR_VOTE external redundancy

  2  disk 'ORCL:OCR_VOTE1'  -- the disk path (PATH column) reported by the query in the previous step

  3  disk 'ORCL:OCR_VOTE2'

  4  ATTRIBUTE 'compatible.rdbms' = '11.2','compatible.asm' = '11.2';

Diskgroup created.

SQL> create diskgroup ASM_DATA external redundancy

  2  disk 'ORCL:ASM_DATA1'

  3  disk 'ORCL:ASM_DATA2'

  4  disk 'ORCL:ASM_DATA3'

  5  ATTRIBUTE 'compatible.rdbms' = '11.2','compatible.asm' = '11.2';

Diskgroup created.

SQL> create diskgroup ASM_FRA external redundancy

  2  disk 'ORCL:ASM_FRA1'

  3  disk 'ORCL:ASM_FRA2'

  4  ATTRIBUTE 'compatible.rdbms' = '11.2','compatible.asm' = '11.2';

Diskgroup created.
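The three CREATE DISKGROUP statements above differ only in the group name and the disk list, so they can be generated rather than typed. The following is a minimal sketch, not part of the original procedure: gen_create_diskgroup is a hypothetical helper name, and the disk paths passed to it are assumed to be whatever the PATH column of v$asm_disk reported in step (3).

```shell
# Print a CREATE DISKGROUP statement for an external-redundancy disk group.
# gen_create_diskgroup is a hypothetical helper, not an Oracle tool.
# Usage: gen_create_diskgroup <group_name> <disk_path>...
gen_create_diskgroup() {
    local name=$1
    shift
    if [ "$#" -eq 0 ]; then
        echo "gen_create_diskgroup: at least one disk path is required" >&2
        return 1
    fi
    printf 'create diskgroup %s external redundancy\n' "$name"
    local disk
    for disk in "$@"; do
        # One DISK clause per path, mirroring the statements in the transcript.
        printf "  disk '%s'\n" "$disk"
    done
    printf "  attribute 'compatible.rdbms' = '11.2', 'compatible.asm' = '11.2';\n"
}
```

The output can be reviewed first and then pasted into a sqlplus / as sysasm session; the function only prints text and never connects to ASM itself.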

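(5) Prepare to restore the OCR and the voting disk. /etc/oracle/ocr.loc records the OCR location; change the value of ocrconfig_loc so that the OCR is restored into the new disk group.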

[root@rac1 ~]# more /etc/oracle/ocr.loc

ocrconfig_loc=+DATA

local_only=FALSE

[root@rac1 ~]# vi /etc/oracle/ocr.loc

ocrconfig_loc=+SYSTEMDG

local_only=FALSE
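The same edit has to be repeated on every node, so it may be worth scripting instead of using vi. The sketch below is an assumption-laden illustration: set_ocr_location is a hypothetical helper, it keeps a .bak copy next to the file, and it assumes the new location contains no "|" or "&" characters (true for a plain disk group name such as +SYSTEMDG).

```shell
# Point the ocrconfig_loc line of an ocr.loc-style file at a new location.
# set_ocr_location is a hypothetical helper; a backup is kept as <file>.bak.
# Usage: set_ocr_location <file> <new_location>
set_ocr_location() {
    local file=$1 new_loc=$2
    if ! grep -q '^ocrconfig_loc=' "$file"; then
        echo "set_ocr_location: no ocrconfig_loc line in $file" >&2
        return 1
    fi
    # Keep the original so the change can be reverted.
    cp -p "$file" "$file.bak" || return 1
    # Rewrite from the backup into the original file; other lines are untouched.
    sed "s|^ocrconfig_loc=.*|ocrconfig_loc=$new_loc|" "$file.bak" > "$file"
}
```

Run as root it would be called as set_ocr_location /etc/oracle/ocr.loc +SYSTEMDG, with the disk group name taken from the transcript above.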

(6) Restore the OCR.

[root@rac1 ~]# ocrconfig -restore /u01/app/11.2.0/grid/cdata/rac-cluster/backup00.ocr

[root@rac1 ~]#

[root@rac1 ~]# ocrcheck

Status of Oracle Cluster Registry is as follows :

         Version                  :          3

         Total space (kbytes)     :     262120

         Used space (kbytes)      :       2840

         Available space (kbytes) :     259280

         ID                       :   59415097

         Device/File Name         :  +SYSTEMDG

                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded
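The two things worth confirming in the ocrcheck output above are that the OCR now lives in the intended disk group and that the registry integrity check succeeded. As a rough sketch, a small filter can check both; verify_ocr is a hypothetical helper that only reads text in the format shown in this transcript, which is an assumption about other releases.

```shell
# Read ocrcheck output on stdin and succeed only if the expected location is
# listed as a Device/File Name and the cluster registry integrity check passed.
# verify_ocr is a hypothetical helper, not part of Oracle Clusterware.
# Usage: ocrcheck | verify_ocr <expected_location>
verify_ocr() {
    local expected=$1
    awk -F: -v want="$expected" '
        /Device\/File Name/ {
            gsub(/[[:space:]]/, "", $2)      # strip padding around the name
            if ($2 == want) found = 1
        }
        /Cluster registry integrity check succeeded/ { ok = 1 }
        END { exit !(found && ok) }
    '
}
```

For example, ocrcheck | verify_ocr +SYSTEMDG && echo "OCR restored" would print the message only when both conditions hold.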

(7) Recreate the voting disk.

[root@rac1 init]# crsctl replace votedisk +OCR_VOTE

Successful addition of voting disk 8ebb7a63accb4fa8bfa7ab65df7a8c8a.

Successfully replaced voting disk group with +OCR_VOTE.

CRS-4266: Voting file(s) successfully replaced

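(8) Once both the OCR and the voting disk have been restored, restart CRS in normal mode (if the exclusive-mode stack is still running, stop it first with crsctl stop crs -f).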

[root@rac1 ~]# crsctl start crs

CRS-4123: Oracle High Availability Services has been started.

[root@rac1 ~]# crsctl check crs    # if any of the services below are not online, wait a while or reboot the system

CRS-4638: Oracle High Availability Services is online

CRS-4537: Cluster Ready Services is online

CRS-4529: Cluster Synchronization Services is online

CRS-4533: Event Manager is online
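Since the services can take a while to come online, polling is usually more convenient than rerunning the check by hand. The sketch below is one possible way to do that; crs_all_online and wait_for_crs are hypothetical helper names, and the test for "is online" at the end of each CRS-nnnn line is based only on the output format shown above.

```shell
# Succeed only if stdin contains at least one CRS-nnnn status line and
# every such line ends in "is online".
crs_all_online() {
    awk '
        /^CRS-[0-9]+:/ { seen++; if ($0 !~ /is online[[:space:]]*$/) bad++ }
        END { exit !(seen > 0 && bad == 0) }
    '
}

# Run a status command up to <tries> times, pausing <delay> seconds after each
# failed attempt, until its output passes crs_all_online.
# Usage: wait_for_crs <tries> <delay_seconds> <command> [args...]
wait_for_crs() {
    local tries=$1 delay=$2 i=0
    shift 2
    while [ "$i" -lt "$tries" ]; do
        if "$@" 2>/dev/null | crs_all_online; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

A call such as wait_for_crs 30 10 crsctl check crs would retry for roughly five minutes before giving up; the retry count and delay are arbitrary illustration values.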


II. Update the configuration stored in the cluster registry

(1) Mount the new ASM disk groups.

[grid@rac1 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.3.0 Production on Sat Jul 6 00:16:05 2013

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

With the Real Application Clusters and Automatic Storage Management options 

SQL> 

SQL> select name,state from v$asm_diskgroup;

NAME                           STATE

------------------------------ -----------

SYSTEMDG                       MOUNTED

ARCLOGDG                       DISMOUNTED

DATADG                         DISMOUNTED

SQL> alter diskgroup ARCLOGDG,DATADG mount;

Diskgroup altered.

(2) Change the database's disk groups in the CRS configuration to DATADG and ARCLOGDG.

[root@rac1 ~]# srvctl modify database -d csdb -a "DATADG,ARCLOGDG" 

(3) Disable and remove the old disk group DATA.

[root@rac1 ~]# srvctl disable diskgroup -g DATA

[root@rac1 ~]# srvctl remove diskgroup -g DATA 

[root@rac1 rac-cluster]# crs_stat -t -v

Name           Type           R/RA   F/FT   Target    State     Host        

----------------------------------------------------------------------

ora....OGDG.dg ora....up.type 0/5    0/     ONLINE    ONLINE    rac1        

ora.DATADG.dg  ora....up.type 0/5    0/     ONLINE    ONLINE    rac1        

ora....ER.lsnr ora....er.type 0/5    0/     ONLINE    ONLINE    rac1        

ora....N1.lsnr ora....er.type 0/5    0/0    ONLINE    ONLINE    rac1        

ora.asm        ora.asm.type   0/5    0/     ONLINE    ONLINE    rac1        

ora.csdb.db    ora....se.type 0/2    0/1    ONLINE    OFFLINE               

ora.cvu        ora.cvu.type   0/5    0/0    ONLINE    ONLINE    rac1        

ora.gsd        ora.gsd.type   0/5    0/     OFFLINE   OFFLINE               

ora....network ora....rk.type 0/5    0/     ONLINE    ONLINE    rac1        

ora.oc4j       ora.oc4j.type  0/1    0/2    ONLINE    ONLINE    rac1        

ora.ons        ora.ons.type   0/3    0/     ONLINE    ONLINE    rac1        

ora....SM1.asm application    0/5    0/0    ONLINE    ONLINE    rac1        

ora....C1.lsnr application    0/5    0/0    ONLINE    ONLINE    rac1        

ora.rac1.gsd   application    0/5    0/0    OFFLINE   OFFLINE               

ora.rac1.ons   application    0/3    0/0    ONLINE    ONLINE    rac1        

ora.rac1.vip   ora....t1.type 0/0    0/0    ONLINE    ONLINE    rac1        

ora.rac2.vip   ora....t1.type 0/0    0/0    ONLINE    ONLINE    rac1        

ora.scan1.vip  ora....ip.type 0/0    0/0    ONLINE    ONLINE    rac1
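In a listing this long, the interesting rows are the resources whose target is ONLINE but whose state is not (here ora.csdb.db, which is expected because the database has not been restored yet). A small filter can pick them out; this is a sketch in which crs_not_online is a hypothetical helper and the column positions are assumed from the crs_stat -t -v layout shown above.

```shell
# From `crs_stat -t -v` output on stdin, print the names of resources whose
# Target column is ONLINE while the State column is anything else.
# Assumed columns: Name Type R/RA F/FT Target State Host.
# crs_not_online is a hypothetical helper, not an Oracle tool.
crs_not_online() {
    awk '$5 == "ONLINE" && $6 != "ONLINE" { print $1 }'
}
```

It would be used as crs_stat -t -v | crs_not_online; resources that are deliberately offline, such as ora.gsd with an OFFLINE target, are not reported.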

(4) Update the location of the database parameter file (spfile) recorded in the OCR.

[root@rac1 ~]# srvctl modify database -d csdb -p +DATADG/csdb/spfilecsdb.ora


III. Restore the database

(1) Check the paths and names of the backup files.

[root@rac1 ~]# su - oracle

[oracle@rac1 ~]$ 

[oracle@rac1 ~]$ cd /u01/app/oracle/backup

[oracle@rac1 backup]$ ll

total 221796

-rw-r----- 1 oracle asmadmin   5357568 Jul  5 15:19 arc_819991156_9.bk

-rw-r----- 1 oracle asmadmin      2560 Jul  5 15:19 arc_819991158_11.bk

-rw-r----- 1 oracle asmadmin 203104256 Jul  5 15:18 CSDB_819991120_5.bk

-rw-r----- 1 oracle asmadmin  18546688 Jul  5 15:19 ctl_file_0coe04jq_1_1_20130705.ctl

-rw-r----- 1 oracle asmadmin     98304 Jul  5 15:19 spfile_0doe04js_1_1_20130705

[oracle@rac1 backup]$

(2) Create a minimal parameter file so that the database can be started in NOMOUNT state to restore the spfile.

[oracle@rac1 ~]$ touch /u01/app/oracle/backup/init.ora

[oracle@rac1 ~]$ vi /u01/app/oracle/backup/init.ora

*.db_name='csdb'

*.remote_login_passwordfile='exclusive'
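The touch and vi steps above can also be done in one go with a here-document, which avoids typos in an already stressful recovery. This is only a sketch: write_min_pfile is a hypothetical helper, and it writes exactly the two parameters used in this article and nothing else.

```shell
# Write a two-line bootstrap pfile, just enough to reach NOMOUNT.
# write_min_pfile is a hypothetical helper name.
# Usage: write_min_pfile <path> <db_name>
write_min_pfile() {
    local path=$1 db_name=$2
    cat > "$path" <<EOF
*.db_name='$db_name'
*.remote_login_passwordfile='exclusive'
EOF
}
```

With the values from this article the call would be write_min_pfile /u01/app/oracle/backup/init.ora csdb.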

(3) Start the database in NOMOUNT state using the parameter file just created.

[oracle@rac1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Sat Jul 6 13:56:06 2013

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup nomount pfile='/u01/app/oracle/backup/init.ora';

ORACLE instance started. 

Total System Global Area  238034944 bytes

Fixed Size                  2227136 bytes

Variable Size             180356160 bytes

Database Buffers           50331648 bytes

Redo Buffers                5120000 bytes

SQL>

(4) Use RMAN to restore the SPFILE into the ASM disk group DATADG.

[oracle@rac1 ~]$ rman target /

Recovery Manager: Release 11.2.0.3.0 - Production on Sat Jul 6 13:59:26 2013

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: CSDB (not mounted) 

RMAN> restore spfile to '+DATADG/csdb/spfilecsdb.ora' from '/u01/app/oracle/backup/spfile_0doe04js_1_1_20130705';

Starting restore at 06-JUL-13

using channel ORA_DISK_1

channel ORA_DISK_1: restoring spfile from AUTOBACKUP /u01/app/oracle/backup/spfile_0doe04js_1_1_20130705

channel ORA_DISK_1: SPFILE restore from AUTOBACKUP complete

Finished restore at 06-JUL-13

(5) Start the database with the restored spfile, then change control_files, db_recovery_file_dest, log_archive_dest and any other parameters that still contain the old paths.

[oracle@rac1 ~]$ vi $ORACLE_HOME/dbs/initcsdb1.ora 

SPFILE='+DATADG/csdb/spfilecsdb.ora'

[oracle@rac1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Sat Jul 6 14:11:58 2013

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

With the Partitioning, Real Application Clusters, OLAP, Data Mining

and Real Application Testing options

SQL> startup nomount force

ORACLE instance started.

Total System Global Area 1653518336 bytes

Fixed Size                  2228904 bytes

Variable Size            1073745240 bytes

Database Buffers          570425344 bytes

Redo Buffers                7118848 bytes

SQL>

SQL> show parameter control_files

NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

control_files                        string      +DATA/csdb/control01.ctl, +DAT

                                                 A/csdb/control02.ctl

SQL>

SQL> alter system set control_files='+DATADG/csdb/control01.ctl','+DATADG/csdb/control02.ctl' scope=spfile;

System altered. 

SQL> alter system set db_recovery_file_dest='+DATADG' scope=spfile;

System altered.

SQL> alter system set log_archive_dest_1='LOCATION=+ARCLOGDG' scope=spfile;

System altered.

SQL> startup force nomount;

ORACLE instance started.

Total System Global Area 1653518336 bytes

Fixed Size                  2228904 bytes

Variable Size            1073745240 bytes

Database Buffers          570425344 bytes

Redo Buffers                7118848 bytes

(6) Find the database's DBID.

[oracle@rac1 ~]$ strings /u01/app/oracle/backup/CSDB_819991120_5.bk | grep MAXVALUE,

  The output resembles the example below; the string of digits is the DBID.

...

MAXVALUE, MAXVALUE!

3042905279, MAXVALUE,

3042905279, MAXVALUE,

...
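Picking the number out of the grep output by eye is easy to get wrong, so the filtering can be scripted. The sketch below follows the same heuristic as the command above (a run of digits immediately followed by ", MAXVALUE") and prints the value that occurs most often; extract_dbid is a hypothetical helper, and the heuristic itself is taken from this article rather than from any documented backup piece format.

```shell
# Read `strings <backup piece>` output on stdin and print the number that most
# often appears at the start of a line as "<digits>, MAXVALUE".
# extract_dbid is a hypothetical helper implementing the article's heuristic.
extract_dbid() {
    sed -n -E 's/^([0-9]+), MAXVALUE.*/\1/p' |
        sort | uniq -c | sort -rn |
        awk 'NR == 1 { print $2 }'
}
```

Usage would be strings /u01/app/oracle/backup/CSDB_819991120_5.bk | extract_dbid, and the result is the value passed to set dbid in RMAN. If no line matches, nothing is printed.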

(7) Restore the control file into the new ASM disk group DATADG.

[oracle@rac1 ~]$ rman target /

Recovery Manager: Release 11.2.0.3.0 - Production on Sat Jul 6 14:28:28 2013 

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved. 

connected to target database: CSDB (not mounted)

RMAN> set dbid=3042905279

executing command: SET DBID 

RMAN> restore controlfile from '/u01/app/oracle/backup/ctl_file_0coe04jq_1_1_20130705.ctl'; 

Starting restore at 06-JUL-13

using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=18 instance=csdb1 device type=DISK 

channel ORA_DISK_1: restoring control file

channel ORA_DISK_1: restore complete, elapsed time: 00:00:01

output file name=+DATADG/csdb/control01.ctl

output file name=+DATADG/csdb/control02.ctl

Finished restore at 06-JUL-13

(8) In SQL*Plus, list the old datafile names.

[oracle@rac1 ~]$ sqlplus / as sysdba 

SQL*Plus: Release 11.2.0.3.0 Production on Sat Jul 6 14:41:12 2013

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,

Data Mining and Real Application Testing options 

SQL> alter database mount;

Database altered. 

SQL> col name format a50

SQL> select file#,name from v$datafile;

     FILE# NAME

---------- --------------------------------------------------

         1 +DATA/csdb/datafile/system.260.819979847

         2 +DATA/csdb/datafile/sysaux.261.819979871

         3 +DATA/csdb/datafile/undotbs1.262.819979889

         4 +DATA/csdb/datafile/undotbs2.264.819979905

         5 +DATA/csdb/datafile/users.265.819979913
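Each datafile in this list needs a matching SET NEWNAME line that swaps the old disk group for the new one. With more than a handful of files, generating those lines from the query output is less error-prone than typing them. The following is a minimal sketch; gen_set_newname is a hypothetical helper and it assumes the two-column FILE# / NAME layout shown above.

```shell
# Read "file# name" rows (v$datafile output) on stdin and print one RMAN
# SET NEWNAME command per datafile, replacing the leading +<old_dg>/ with
# +<new_dg>/. Header and separator lines are skipped automatically.
# gen_set_newname is a hypothetical helper, not an RMAN feature.
# Usage: gen_set_newname <old_diskgroup> <new_diskgroup>
gen_set_newname() {
    local old_dg=$1 new_dg=$2
    awk -v old="+$old_dg/" -v new="+$new_dg/" '
        $1 ~ /^[0-9]+$/ && index($2, old) == 1 {
            rest = substr($2, length(old) + 1)   # path after the disk group
            printf "set newname for datafile %d to \047%s%s\047;\n", $1, new, rest
        }
    '
}
```

Saving the query output to a file and running gen_set_newname DATA DATADG on it yields lines that can be pasted inside an RMAN run { ... } block ahead of restore database.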

(9) Restore and recover the database with RMAN.

RMAN> run{

2> set newname for datafile 1 to '+DATADG/csdb/datafile/system.260.819979847';

3> set newname for datafile 2 to '+DATADG/csdb/datafile/sysaux.261.819979871';

4> set newname for datafile 3 to '+DATADG/csdb/datafile/undotbs1.262.819979889';

5> set newname for datafile 4 to '+DATADG/csdb/datafile/undotbs2.264.819979905';

6> set newname for datafile 5 to '+DATADG/csdb/datafile/users.265.819979913';

7> restore database;

8> switch datafile all;

9> recover database;

10> }


executing command: SET NEWNAME

released channel: ORA_DISK_1 

executing command: SET NEWNAME

executing command: SET NEWNAME

executing command: SET NEWNAME

executing command: SET NEWNAME

Starting restore at 07-JUL-13

Starting implicit crosscheck backup at 07-JUL-13

allocated channel: ORA_DISK_1

Crosschecked 10 objects

Finished implicit crosscheck backup at 07-JUL-13

Starting implicit crosscheck copy at 07-JUL-13

using channel ORA_DISK_1

Finished implicit crosscheck copy at 07-JUL-13

searching for all files in the recovery area

cataloging files...

no files cataloged 

using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore

channel ORA_DISK_1: specifying datafile(s) to restore from backup set

channel ORA_DISK_1: restoring datafile 00002 to +DATADG/csdb/datafile/sysaux.261.819979871

channel ORA_DISK_1: restoring datafile 00003 to +DATADG/csdb/datafile/undotbs1.262.819979889

channel ORA_DISK_1: reading from backup piece /u01/app/oracle/backup/CSDB_819991120_6.bk

channel ORA_DISK_1: piece handle=/u01/app/oracle/backup/CSDB_819991120_6.bk tag=ORCL_HOT_DB_BK

channel ORA_DISK_1: restored backup piece 1

channel ORA_DISK_1: restore complete, elapsed time: 00:00:46

channel ORA_DISK_1: starting datafile backup set restore

channel ORA_DISK_1: specifying datafile(s) to restore from backup set

channel ORA_DISK_1: restoring datafile 00001 to +DATADG/csdb/datafile/system.260.819979847

channel ORA_DISK_1: restoring datafile 00004 to +DATADG/csdb/datafile/undotbs2.264.819979905

channel ORA_DISK_1: restoring datafile 00005 to +DATADG/csdb/datafile/users.265.819979913

channel ORA_DISK_1: reading from backup piece /u01/app/oracle/backup/CSDB_819991120_5.bk

channel ORA_DISK_1: piece handle=/u01/app/oracle/backup/CSDB_819991120_5.bk tag=ORCL_HOT_DB_BK

channel ORA_DISK_1: restored backup piece 1

channel ORA_DISK_1: restore complete, elapsed time: 00:00:35

Finished restore at 07-JUL-13

datafile 1 switched to datafile copy

input datafile copy RECID=6 STAMP=820112831 file name=+DATADG/csdb/datafile/system.284.820112797

datafile 2 switched to datafile copy

input datafile copy RECID=7 STAMP=820112831 file name=+DATADG/csdb/datafile/sysaux.282.820112751

datafile 3 switched to datafile copy

input datafile copy RECID=8 STAMP=820112831 file name=+DATADG/csdb/datafile/undotbs1.283.820112751

datafile 4 switched to datafile copy

input datafile copy RECID=9 STAMP=820112831 file name=+DATADG/csdb/datafile/undotbs2.285.820112797

datafile 5 switched to datafile copy

input datafile copy RECID=10 STAMP=820112831 file name=+DATADG/csdb/datafile/users.286.820112797

Starting recover at 07-JUL-13

using channel ORA_DISK_1

starting media recovery

channel ORA_DISK_1: starting archived log restore to default destination

channel ORA_DISK_1: restoring archived log

archived log thread=1 sequence=15

channel ORA_DISK_1: reading from backup piece /u01/app/oracle/backup/arc_819991156_9.bk

channel ORA_DISK_1: piece handle=/u01/app/oracle/backup/arc_819991156_9.bk tag=TAG20130705T151916

channel ORA_DISK_1: restored backup piece 1

channel ORA_DISK_1: restore complete, elapsed time: 00:00:01

archived log file name=+ARCLOGDG/csdb/archivelog/2013_07_07/thread_1_seq_15.256.820112833 thread=1 sequence=15

channel ORA_DISK_1: starting archived log restore to default destination

channel ORA_DISK_1: restoring archived log

archived log thread=2 sequence=2

channel ORA_DISK_1: restoring archived log

archived log thread=1 sequence=16

channel ORA_DISK_1: reading from backup piece /u01/app/oracle/backup/arc_819991156_10.bk

channel ORA_DISK_1: piece handle=/u01/app/oracle/backup/arc_819991156_10.bk tag=TAG20130705T151916

channel ORA_DISK_1: restored backup piece 1

channel ORA_DISK_1: restore complete, elapsed time: 00:00:01

archived log file name=+ARCLOGDG/csdb/archivelog/2013_07_07/thread_2_seq_2.257.820112835 thread=2 sequence=2

channel default: deleting archived log(s)

archived log file name=+ARCLOGDG/csdb/archivelog/2013_07_07/thread_1_seq_15.256.820112833 RECID=5 STAMP=820112832

archived log file name=+ARCLOGDG/csdb/archivelog/2013_07_07/thread_1_seq_16.258.820112835 thread=1 sequence=16

channel default: deleting archived log(s)

archived log file name=+ARCLOGDG/csdb/archivelog/2013_07_07/thread_2_seq_2.257.820112835 RECID=7 STAMP=820112834

channel ORA_DISK_1: starting archived log restore to default destination

channel ORA_DISK_1: restoring archived log

archived log thread=2 sequence=3

channel ORA_DISK_1: reading from backup piece /u01/app/oracle/backup/arc_819991158_11.bk

channel ORA_DISK_1: piece handle=/u01/app/oracle/backup/arc_819991158_11.bk tag=TAG20130705T151916

channel ORA_DISK_1: restored backup piece 1

channel ORA_DISK_1: restore complete, elapsed time: 00:00:01

archived log file name=+ARCLOGDG/csdb/archivelog/2013_07_07/thread_2_seq_3.257.820112837 thread=2 sequence=3

channel default: deleting archived log(s)

archived log file name=+ARCLOGDG/csdb/archivelog/2013_07_07/thread_2_seq_3.257.820112837 RECID=8 STAMP=820112835

unable to find archived log

archived log thread=2 sequence=4

RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================

RMAN-03002: failure of recover command at 07/07/2013 01:07:16

RMAN-06054: media recovery requesting unknown archived log for thread 2 with sequence 4 and starting SCN of 323980

The RMAN-06054 error is expected here: recovery has applied all the archived logs available in the backup and stops at the first one it cannot find, so the database can be opened with RESETLOGS in step (11).

(10) Update the redo log locations.

SQL> select member from v$logfile;

MEMBER

--------------------------------------------------------------------------------

+DATA/csdb/redo01.log

+DATA/csdb/redo02.log

+DATA/csdb/redo03.log

+DATA/csdb/redo04.log

SQL> alter database rename file '+DATA/csdb/redo01.log' to '+DATADG/csdb/redo01.log';

Database altered.

SQL> alter database rename file '+DATA/csdb/redo02.log' to '+DATADG/csdb/redo02.log';

Database altered.

SQL> alter database rename file '+DATA/csdb/redo03.log' to '+DATADG/csdb/redo03.log';

Database altered.

SQL> alter database rename file '+DATA/csdb/redo04.log' to '+DATADG/csdb/redo04.log';

Database altered.

(11) Open the database.

SQL> alter database open resetlogs;

Database altered.

(12) Move the TEMP tablespace's tempfile to the new disk group.

SQL> select name from v$tempfile;

NAME

--------------------------------------------------------------------------------

+DATA/csdb/tempfile/temp.263.819979895

SQL> alter tablespace temp add tempfile '+DATADG';

Tablespace altered.

SQL> alter tablespace temp drop tempfile '+DATA/csdb/tempfile/temp.263.819979895';

Tablespace altered.

IV. Finish the recovery

(1) Update the OCR location on the other RAC nodes.

[root@rac2 ~]# vi /etc/oracle/ocr.loc

ocrconfig_loc=+SYSTEMDG

local_only=FALSE

(2) Restart CRS on the node where the recovery was performed.

[root@rac1 ~]# crsctl stop crs

[root@rac1 ~]# crsctl start has

(3) Start CRS on the other nodes.

[root@rac2 ~]# crsctl start crs

