Oracle/백업&복구

2012.02.08 백업&복구 5번째(undo tablespace 장애 복구하기) ; undo 장애처리의 핵심설명, 예제2개

에몽이ㅋ 2012. 2. 12. 17:35
http://gyh214.tistory.com/77 undo segment 관련 참고글

undo segment 운영방법에는 2가지 모드가 있습니다.
1. AUM(Automatic Undo Management)
2. MUM(Manual)


undo tablespace장애 복구의 핵심
1) DB셋팅을 Manual로 바꾸고 복구하기
; AUM 모드로 운영하게 되면, undo tbs가 장애가 생기는 즉시, DB가 뻗어버리고, open조차 되지 않는 현상이 발생하는데 반해,
MUM 모드로 운영하게 되면, oracle에서 undo 쪽에 문제가 발생해도, DBA가 해결하겠지~ 하면서 별 신경을 쓰지 않습니다.
( 물론 장애가 발생한 상태로 DB를 운영하게 되면 DB가 뻗겠죠, MUM으로 설정하면 OPEN까지는 갈 수 있습니다. )
 

** undo_management는 OPEN상태로 바로 바뀌지 않기 때문에 manual로 바꾸려면, parameter를 변경하고 재시작해야합니다.
undo_management = manual

--> undo tbs가 어떤 셋팅을 하고 있는지 보기
SQL> show parameter undo
SQL> show parameter undo

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
undo_management                      string      AUTO
undo_retention                       integer     900
undo_tablespace                      string      UNDOTBS1

2) undo tbs drop 실패할 시
undo tbs안의 어떤 undo segment가 사용중임을 확인할 줄 알고, 강제로 offline시키는 법 알기

; undo 장애로 DB가 다운되고 난 후 MUM모드로 open해서, 새로운 undo tbs 생성 후 디폴트로 지정하고, 예전에 있던 undo tbs를 삭제하려고 하면 안되는 경우가 대부분입니다.
그 이유는, undo tbs안에 보통 10개의 undo segment가 있는데 DB가 다운되기 전 사용자들이 사용하던 undo 데이터들이 undo segment에 아직 들어가 있다고 oracle에서 인식하기 때문입니다. 이 때는 어떤 undo segment사용 중인지를 확인하고, 강제로 offline을 시킨 후 DB를 다시 open시키고 예전 undo를 drop해야합니다.

어떤 undo segment가 사용되고 있는지 확인하기
set line 130
col rollback_seg for a15
col username for a20
select s.sid, s.serial#, s.username, r.name rollback_seg
from v$session s, v$transaction t, v$rollname r
where s.taddr=t.addr
and t.xidusn = r.usn

SQL> l
  1  select s.sid, s.serial#, s.username, r.name rollback_seg
  2  from v$session s, v$transaction t, v$rollname r
  3  where s.taddr=t.addr
  4* and t.xidusn = r.usn
SQL> /

       SID    SERIAL# USERNAME             ROLLBACK_SEG
---------- ---------- -------------------- ---------------
       145         76 SCOTT                _SYSSMU11$  <--- 현재 사용중인 undo segment입니다.


위 쿼리를 사용하시던지 OPEN상태에서는 select segment_name, owner, tablespace_name, status from dba_rollback_segs
이 쿼리를 사용해도 됩니다.

undo segment강제로 offline시키는 parameter
* DB shutdown하신 후 parameter에서 해당 파라미터를 추가한 후 OPEN하시면 해당 undo segment가 offline이 됩니다.

_offline_rollback_segments=( undo segment 이름 )
예제 : _offline_rollback_segments=(_SYSSMU11$)

***********************************************************************************
*********************************************************************************** 
*********************************************************************************** 
UNDO 관련장애 시나리오

UNDO TBS 장애 복구하기 1 : 운영중인 undo tablespace 장애 복구하기
상황 : DB운영 중 undo tablespace 가 장애가 발생했다는 긴급연락을 받고, 새로운 undo tablespace를 생성해서 디폴트로 변경했습니다.
그리고 예전의 undo tablespace 를 삭제를 하기원합니다.

문제 : 예전 undo tablespace에 기존 transaction이 있어서, rollback segment가 active인 상태여서 삭제가 되지 않음

1. 상황확인하기
  1  select a.tablespace_name, a.file_name, b.status
  2  from dba_data_files a, dba_tablespaces b
  3  where a.tablespace_name = b.tablespace_name
  4* order by 1,2
SQL> /

TABLESPACE_NAME                FILE_NAME                                     STATUS
------------------------------ --------------------------------------------- ---------
EXAMPLE                        /home/oracle/oradata/testdb/example01.dbf     ONLINE
SYSAUX                         /home/oracle/oradata/testdb/sysaux01.dbf      ONLINE
SYSTEM                         /home/oracle/oradata/testdb/system01.dbf      ONLINE
UNDOTBS1                       /home/oracle/oradata/testdb/undotbs01.dbf     ONLINE
USERS                          /home/oracle/oradata/testdb/users01.dbf       ONLINE
------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------
SQL> show parameter undo

NAME                                 TYPE       VALUE
------------------------------------ ---------- ------------------------------
undo_management                      string     AUTO
undo_retention                       integer    900
undo_tablespace                      string     UNDOTBS1
------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------
SQL> l
  1  select s.sid, s.serial#, s.username, r.name rollback_seg
  2  from v$session s, v$transaction t, v$rollname r
  3  where s.taddr=t.addr
  4* and t.xidusn = r.usn
SQL> /

       SID    SERIAL# USERNAME             ROLLBACK_SEG
---------- ---------- -------------------- ---------------
       145         76 SCOTT                _SYSSMU11$
------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------

2. 장애발생
SQL> !rm -v /home/oracle/oradata/testdb/undotbs01.dbf
removed `/home/oracle/oradata/testdb/undotbs01.dbf'

SQL> !rm -v /home/oracle/oradata/testdb/undotbs01.dbf
rm: cannot remove `/home/oracle/oradata/testdb/undotbs01.dbf': 그런 파일이나 디렉토리가 없음


3. 긴급연락 받은 후 빠른 조치(undo tbs생성 후 디폴트 undo tbs변경)
SQL> create undo tablespace undo_temp
  2  datafile '/home/oracle/oradata/testdb/undo_temp01.dbf' size 10M;

Tablespace created.

SQL> alter system set undo_tablespace=undo_temp;

System altered.
(해당 DB는 pfile로 운영중이라서, SCOPE=MEMOEY입니다.
Sun Feb 12 17:52:30 KST 2012
ALTER SYSTEM SET undo_tablespace='UNDO_TEMP' SCOPE=MEMORY; <--- alert log file)

4. 기존에 있던 undo tbs 삭제시도
SQL> drop tablespace undotbs1 including contents and datafiles;
drop tablespace undotbs1 including contents and datafiles
*
ERROR at line 1:
ORA-30013: undo tablespace 'UNDOTBS1' is currently in use  <-- drop tbs안됨

5. 예전 undo TBS를 지우기 위해 DB 끈 후,
파라미터파일을 변경하고, mount단계에서 예전 undo tbs를 offline drop하기

**파라미터파일변경사항
undo_tablespace=새로 생성한tbs, undo_management=manual, <--- 변경
_offline_rollback_segments=( active rollback segment name ) <--- 라인추가

SQL> shutdown immediate
ORA-01116: error in opening database file 2
ORA-01110: data file 2: '/home/oracle/oradata/testdb/undotbs01.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
Additional information: 3
SQL> shutdown abort
ORACLE instance shut down.

SQL> !vi $ORACLE_HOME/dbs/inittestdb.ora
*.undo_management='manual'   << 라인변경
*.undo_tablespace='UNDO_temp'   << 라인변경
_offline_rollback_segments=(_SYSSMU11$)  << 라인추가

SQL> startup
ORACLE instance started.

Total System Global Area  167772160 bytes
Fixed Size                  1272600 bytes
Variable Size              96470248 bytes
Database Buffers           67108864 bytes
Redo Buffers                2920448 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 2 - see DBWR trace file
ORA-01110: data file 2: '/home/oracle/oradata/testdb/undotbs01.dbf'

SQL> alter database datafile '/home/oracle/oradata/testdb/undotbs01.dbf' offline drop;

Database altered.

SQL> alter database open;

Database altered.

SQL> ed
Wrote file afiedt.buf

  1  select tablespace_name, file_name, online_status
  2  from dba_data_files
  3* order by 1,2
SQL> /

TABLESPACE_NAME                FILE_NAME                                     ONLINE_
------------------------------ --------------------------------------------- -------
EXAMPLE                        /home/oracle/oradata/testdb/example01.dbf     ONLINE
SYSAUX                         /home/oracle/oradata/testdb/sysaux01.dbf      ONLINE
SYSTEM                         /home/oracle/oradata/testdb/system01.dbf      SYSTEM
UNDOTBS1                       /home/oracle/oradata/testdb/undotbs01.dbf     RECOVER
UNDO_TEMP                      /home/oracle/oradata/testdb/undo_temp01.dbf   ONLINE
USERS                          /home/oracle/oradata/testdb/users01.dbf       ONLINE

6 rows selected.

SQL> show parameter undo

NAME                                 TYPE       VALUE
------------------------------------ ---------- ------------------------------
undo_management                      string     MANUAL
undo_retention                       integer    900
undo_tablespace                      string     UNDO_temp

SQL> drop tablespace undotbs1 including contents and datafiles;  << 드랍합니다

Tablespace dropped.

6. UNDO 설정을 예전처럼해주기 위해 shutdown 한 후 parameter 변경하기
(undo_management 변경, 히든파라미터 삭제)

SQL> shutdown
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> !vi $ORACLE_HOME/dbs/inittestdb.ora
undo_management='AUTO'
_offline_rollback_segs 라인삭제(or 주석처리)

SQL> startup
ORACLE instance started.

Total System Global Area  167772160 bytes
Fixed Size                  1272600 bytes
Variable Size              96470248 bytes
Database Buffers           67108864 bytes
Redo Buffers                2920448 bytes
Database mounted.
Database opened.
장애처리 1 완료!


UNDO TBS 장애 복구하기 2 :
No archive log mode에서 undo tbs의 datafile이 삭제되고, DB가 강제로 shutdown상태


1. 현재 상황확인 및 장애발생 되어 DB 강제로 꺼짐
SQL> show parameter undo

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
undo_management                      string      AUTO
undo_retention                       integer     900
undo_tablespace                      string      UNDO_TEMP

SQL> archive log list
Database log mode              No Archive Mode
Automatic archival             Disabled
Archive destination            /backup/arc
Oldest online log sequence     336
Current log sequence           338

SQL> !rm /home/oracle/oradata/testdb/undo_temp01.dbf

SQL>  !rm /home/oracle/oradata/testdb/undo_temp01.dbf
rm: cannot remove `/home/oracle/oradata/testdb/undo_temp01.dbf': 그런 파일이나 디렉토리가 없음

SQL> create table scott.undo_test1(no number) tablespace example;

Table created.

SQL> insert into scott.undo_test1 values(1);

1 row created.

SQL> alter system switch logfile;

System altered.

SQL> /

System altered.

SQL> /
alter system switch logfile
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel  << Instance Crash



2. startup 시도
SQL> startup
ORA-24324: service handle not initialized
ORA-01041: internal error. hostdef extension doesn't exist
SQL> quit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
[oracle@server15 ~]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Sun Feb 12 19:03:19 2012

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area  167772160 bytes
Fixed Size                  1272600 bytes
Variable Size              96470248 bytes
Database Buffers           67108864 bytes
Redo Buffers                2920448 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 6 - see DBWR trace file
ORA-01110: data file 6: '/home/oracle/oradata/testdb/undo_temp01.dbf'


3. 복구를 위해 셧다운 후 파라미터파일에 manual모드로 변경
shutdown 한 후
vi $ORACLE_HOME/dbs/inittestdb.ora  << 파라미터파일
undo_management='manual'

4. MOUNT단계까지 올라간 후 장애난 undo tbs 를 offline drop한 후 open하고
(undo_management가 auto면 OPEN이 불가능)

  --> 이후에 offline할 rollback segment들을 확인한 후
        셧다운 후 파라미터에 적어서 강제적으로 offline한 후 다시 OPEN

SQL> startup
ORACLE instance started.

Total System Global Area  167772160 bytes
Fixed Size                  1272600 bytes
Variable Size              96470248 bytes
Database Buffers           67108864 bytes
Redo Buffers                2920448 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 6 - see DBWR trace file
ORA-01110: data file 6: '/home/oracle/oradata/testdb/undo_temp01.dbf'


SQL> alter database datafile '/home/oracle/oradata/testdb/undo_temp01.dbf' offline drop;

Database altered.

SQL> alter database open;

Database altered.

SQL> select name, status from v$datafile;

NAME                                          STATUS
--------------------------------------------- -------
/home/oracle/oradata/testdb/system01.dbf      SYSTEM
/home/oracle/oradata/testdb/sysaux01.dbf      ONLINE
/home/oracle/oradata/testdb/users01.dbf       ONLINE
/home/oracle/oradata/testdb/example01.dbf     ONLINE
/home/oracle/oradata/testdb/undo_temp01.dbf   RECOVER

SQL> select segment_name, owner, tablespace_name, status from dba_rollback_segs;

SEGMENT_NAME                   OWNER  TABLESPACE_NAME                STATUS
------------------------------ ------ ------------------------------ ----------------
SYSTEM                         SYS    SYSTEM                         ONLINE
_SYSSMU14$                     PUBLIC UNDO_TEMP                      NEEDS RECOVERY
_SYSSMU15$                     PUBLIC UNDO_TEMP                      NEEDS RECOVERY
_SYSSMU16$                     PUBLIC UNDO_TEMP                      NEEDS RECOVERY
_SYSSMU17$                     PUBLIC UNDO_TEMP                      NEEDS RECOVERY
_SYSSMU18$                     PUBLIC UNDO_TEMP                      NEEDS RECOVERY
_SYSSMU19$                     PUBLIC UNDO_TEMP                      NEEDS RECOVERY
_SYSSMU20$                     PUBLIC UNDO_TEMP                      NEEDS RECOVERY
_SYSSMU21$                     PUBLIC UNDO_TEMP                      NEEDS RECOVERY
_SYSSMU22$                     PUBLIC UNDO_TEMP                      NEEDS RECOVERY

SQL> shutdown
Database closed.
Database dismounted.
ORACLE instance shut down.

SQL> !vi $ORACLE_HOME/dbs/inittestdb.ora
_offline_rollback_segments=(_SYSSMU14$, _SYSSMU15$, _SYSSMU16$, _SYSSMU17$, _SYSSMU18$, _SYSSMU19$, _SYSSMU20$, _SYSSMU21$, _SYSSMU22$)  << 라인추가


5. OPEN 후 새로운 undo tablespace를 만들어주고, 기존의 undo tablespace는 삭제
SQL> startup
ORACLE instance started.

Total System Global Area  167772160 bytes
Fixed Size                  1272600 bytes
Variable Size              96470248 bytes
Database Buffers           67108864 bytes
Redo Buffers                2920448 bytes
Database mounted.
Database opened.

SQL> show parameter undo

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
undo_management                      string      MANUAL
undo_retention                       integer     900
undo_tablespace                      string      UNDO_TEMP  <<장애있는 undo

SQL> create undo tablespace undo
  2  datafile '/home/oracle/oradata/testdb/undo01.dbf' size 10M autoextend on maxsize 100M;

Tablespace created.

SQL> drop tablespace undo_temp including contents and datafiles;

Tablespace dropped.


6. 셧다운 후 새 undo 를 디폴트로 지정하고, 원래 설정으로 돌아간 후 startup
(파라미터파일수정 undo_tablespace, undo_management 라인 변경
_offline_rollback_segments라인 주석처리or 삭제)

 !vi $ORACLE_HOME/dbs/inittestdb.ora
*.undo_management='auto'
*.undo_tablespace='UNDO'
#_offline_rollback_segments=(_SYSSMU14$, _SYSSMU15$, _SYSSMU16$, _SYSSMU17$, _SYSSMU18$, _SYSSMU19$,
 _SYSSMU20$, _SYSSMU21$, _SYSSMU22$) <<주석처리됨

 


7. 확인
SQL> startup
ORACLE instance started.

Total System Global Area  167772160 bytes
Fixed Size                  1272600 bytes
Variable Size              96470248 bytes
Database Buffers           67108864 bytes
Redo Buffers                2920448 bytes
Database mounted.
Database opened.
SQL> show parameter undo

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
undo_management                      string      AUTO   <<
undo_retention                       integer     900
undo_tablespace                      string      UNDO  <<

SQL> select tablespace_name, file_name, online_status from dba_data_files;

TABLESPACE_NAME                FILE_NAME                                     ONLINE_
------------------------------ --------------------------------------------- -------
EXAMPLE                        /home/oracle/oradata/testdb/example01.dbf     ONLINE
USERS                          /home/oracle/oradata/testdb/users01.dbf       ONLINE
SYSAUX                         /home/oracle/oradata/testdb/sysaux01.dbf      ONLINE
UNDO                           /home/oracle/oradata/testdb/undo01.dbf        ONLINE
SYSTEM                         /home/oracle/oradata/testdb/system01.dbf      SYSTEM
장애처리 2 완료!