Linux SysAdmin and Oracle/MySQL DBA: Oracle database recovery procedures

This blog is the summary of the article 'Recovering Oracle' at http://examples.oreilly.com/9781565926424/oracle.html, It's very good article regarding how Oracle perform recovery internally.

----
How Oracle select/update statement works: - from spotlight on Oracle Getting started pdf document from Quest software:

The numbers indicate the order of the flow of information.

The numbered labels in the Oracle architecture diagram correspond to the following activities:

1. The client program (for example, SQL*PLUS, Oracle Power Objects, or some other tool)

sends a SELECT statement to the server process.

2. The server process looks in the shared pool for a matching SQL statement. If none is

found, the server process parses the SQL and inserts the SQL statement into the

shared pool.

3. The server process looks in the buffer cache for the data blocks required. If found, the

data block must be moved on to the most recently used end of the Least Recently

Used (LRU) list.

4. If the block cannot be found in the buffer cache the server process must fetch it from the

disk file. This requires a disk I/O.

5. The server process returns the rows retrieved to the client process. This may involve some

network or communications delay.

6. When the client issues the UPDATE statement the process of parsing the SQL and

retrieving the rows to be updated must occur. The update statement then changes the

relevant blocks in shared memory and updates entries in the rollback segment buffers.

7. The update statement also makes an entry in the redo log buffer that records the

transaction details.

8. The database-writer background process copies modified blocks from the buffer cache to

the database files. The Oracle session performing the update does not have to wait for

this to occur.

9. When the COMMIT statement is issued the log writer process must copy the contents of

the redo log buffer to the redo log file. The COMMIT statement does not return control to

the Oracle session issuing the commit until this write is complete.

10. If running in ARCHIVELOG mode, the archiver process copies full redo logs to the

archive destination. A redo log is not eligible for re-use until it has been archived.

11. At regular intervals, or when a redo log switch occurs, Oracle performs a checkpoint. A

checkpoint requires all modified blocks in the buffer cache to be written to disk. A redo

log file cannot be re-used until the checkpoint completes.

Best Practise:

Controlfile:

multiplex them in different places
enable rman auto backup settings
enable rman daily backup to trace file
cannot use a controlfile older than database datafile, in this case, you need to restore both auto controlfile backup and database files from rman backup

Need media recovery if you restore a backup control file (even if all datafiles are current). Refer to http://docs.oracle.com/cd/B19306_01/backup.102/b14192/intro004.htm

only when datafile and redo log are okay, then you can recreate controlfile from 'backup to trace' script.

redo logs

multiplex members in each redo groups
non-active redo log damage is not critical, can be easily repaired.
active or current redo log group damage will cause data lost

An active log group is usually the log group that Oracle just finished writing to. However, until a checkpoint occurs, this group is still needed for media recovery

You can run command 'select group#, status from v$log;' and 'select group#,member from v$logfile;' on mounted and closed database to find out status of redo log groups. As well as command 'select * from v$log_history;'

SPFILE/PFILE

enable daily rman pfile backup

sql "CREATE PFILE=''/u01/app/backup/spfile.backup'' FROM SPFILE";

COMMAN RECOVERY LINKS

Database Recovery and use previous incarnation for rman recovery:

http://www.datadisk.co.uk/html_docs/oracle/db_recovery.htm

Recovering non-critical files

Recovering non-critical files would include temporary tablespaces, redo log files, index tablespaces, read-only tablespace, password file. - http://www.datadisk.co.uk/html_dos/oracle/recover_noncritical_files.htm

Recovering critical files
http://www.datadisk.co.uk/html_docs/oracle/recover_critical_files.htm

RMAN commands summary:
http://www.datadisk.co.uk/html_docs/oracle/rman.htm

Recover from user error as well as flashback architecture:
http://www.datadisk.co.uk/html_docs/oracle/recover_user_errors.htm
http://www.datadisk.co.uk/html_docs/oracle/flashback.htm

Part I - controlfile damage - mount will fail
1. startup mount and controlfile
mounting a database (without opening it) reads the control files but does not open the data files. If the control files are mirrored, Oracle attempts to open each of the control files that are listed in the initORACLE_SID.ora pfile or spfile. If any of them is damaged, the mount fails.

Identify which controlfile is damaged

make backup copy of the whole database first, including all the control files, online redo logs and datafiles or at least all controlfiles and online redo logs
check message in $ORACLE_BASE/ORACLE_SID/admin/bdump/alert_ORACLE_SID.log to find out which controlfile is damaged.
check pfile/spfile control_files line to find out name of controlfiles, list each of them and compare their size and modification time.

1. if the file that Oracle is complaining about is just missing, it's easy. Just shutdown database, copy a good controlfile to overwrite corrupted one, then startup mount again.

2. if damaged file is not missing but corrupted, solution is to one at a time, try copying every version of each control file to all the other locations, then attempt to mount database again, excluding the one that Oracle has already complaining about since it's obviously damaged.

3. if all control files are damaged, we can either

recreate it from control file creation script if online redo logs are okay, so we won't have data lost, provide we perform backup control file to trace regularly, see below for details

The data files can be older versions that were restored from backup, since they will be rolled forward by the media recovery.

However, the online redo logs must be current and intact before running "create controlfile" script.

restore controlfile from rman controlfile autobackup

2 rman control file auto backup (backup format that contains 'DBID')
You need to configure controlfile autobackup on so that rman will backup controlfile automatically as format c-DBID-timestamp-qq(QQ is the hex sequence that starts with 00 and has a maximum of FF, Sequence of autobackup, first autobackup of the day would be 01, second is 02, etc etc). You should manually configure controlfile autobackup format although the default value looks like the same as the one we need, otherwise, when rman does backup, the backuped controlfile won't be the format we want , which is c-DBID-timestamp-xx format.

rman target /
rman> show all;
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default
rman> CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F';
rman> show all [removed '# default']
CONFIGURE CONTROLFILE AUTOBACKUP ON;
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F';

when you run rman> backup database plus archivelog; , it will backup controlfile as the format we want as follows:

Starting Control File and SPFILE Autobackup at 04-APR-12
piece handle=/home/oracle/app/oracle/product/11.2.0/dbhome_1/dbs/c-1294051154-20120404-05 comment=NONE
Finished Control File and SPFILE Autobackup at 04-APR-12

note: above c-1294051154-20120404-05 file is backup pieces, not purely controlfile, use 'strings c-1294051154-20120404-05 | head -10' to verify it.

You can also use "backup current controlfile [format '/tmp/controlfile']" in rman to just backup conrolfile to binary format.

3. backup control file to trace (script to be used to recreate control file)
In order to recreate controlfile, you have to backup it first by using one of the following commands:

Binary format backup:
rman> backup current controlfile [format '/tmp/controlfile'];

Text format script backup:

sql> alter database backup controlfile to trace;
sql> alter database backup controlfile to trace as '/some/path/filename';
sql> alter database backup controlfile to trace as '/some/path/filename' reuse;

Specify REUSE to indicate that existing control files can be reused and overwritten.

You should perform this regularly, either in daily rman backup or separated cronjob as follows:

export ORACLE_SID=livedb
week=`date +%w`
bkp_loc=/tmp

rman target / << END
sql "alter database backup controlfile to trace as ''${bkp_loc}/controlbkp_${ORACLE_SID}_${week}.sql'' reuse";
END

note: red color '' are 2 single quotes. reuse means it's be able to overwrite existing file name.

If you only used 'alter database backup controlfile to trace', then

for 10g, the trace file *.trc will be put in user_dump_dest parameter;
for 11g, query v$diag_info view to get 'diag trace' path , it will be put there.

Refer to http://docs.oracle.com/cd/E14072_01/server.112/e10592/statements_1004.htm (search 'backup controlfile to trace' statement.

4. controlfile script location
firstly, find out where the script is if you have done automatic backup.

sql> alter database backup control file to trace [as '/tmp/ctl.sql' reuse];

$ cd $ORACLE_HOME/dbs; grep user_dump_dest *
let's say the udump directory is /db/Oracle/admin/crash/udump, then run
$ cd /db/Oracle/admin/crash/udump ; grep 'CREATE CONTROLFILE' * |awk -F: '{print $1}'| xargs ls -ltr

4.1 noresetlog case (good redo logs)
modify backuped controlfile to trace as follows:

STARTUP NOMOUNT
CREATE CONTROLFILE REUSE DATABASE "ORCL" NORESETLOGS ARCHIVELOG
MAXLOGFILES 16
MAXLOGMEMBERS 3
MAXDATAFILES 100
MAXINSTANCES 8
MAXLOGHISTORY 292
LOGFILE
GROUP 1 '/home/oracle/app/oracle/oradata/orcl/redo01.log' SIZE 50M BLOCKSIZE 512,
GROUP 2 '/home/oracle/app/oracle/oradata/orcl/redo02.log' SIZE 50M BLOCKSIZE 512,
GROUP 3 '/home/oracle/app/oracle/oradata/orcl/redo03.log' SIZE 50M BLOCKSIZE 512
DATAFILE
'/home/oracle/app/oracle/oradata/orcl/system01.dbf',
'/home/oracle/app/oracle/oradata/orcl/sysaux01.dbf',
'/home/oracle/app/oracle/oradata/orcl/undotbs01.dbf',
'/home/oracle/app/oracle/oradata/orcl/users01.dbf',
'/home/oracle/app/oracle/oradata/orcl/example01.dbf'
CHARACTER SET AL32UTF8
;
VARIABLE RECNO NUMBER;
EXECUTE :RECNO := SYS.DBMS_BACKUP_RESTORE.SETCONFIG('CONTROLFILE AUTOBACKUP','ON');
RECOVER DATABASE
ALTER SYSTEM ARCHIVE LOG ALL;
ALTER DATABASE OPEN;
ALTER TABLESPACE TEMP ADD TEMPFILE '/home/oracle/app/oracle/oradata/orcl/temp01.dbf'
SIZE 30408704 REUSE AUTOEXTEND ON NEXT 655360 MAXSIZE 32767M;

4.2 with resetlog case as follows: (redo log damaged)

STARTUP NOMOUNT
CREATE CONTROLFILE REUSE DATABASE "ORCL" RESETLOGS ARCHIVELOG
MAXLOGFILES 16
MAXLOGMEMBERS 3
MAXDATAFILES 100
MAXINSTANCES 8
MAXLOGHISTORY 292
LOGFILE
GROUP 1 '/home/oracle/app/oracle/oradata/orcl/redo01.log' SIZE 50M BLOCKSIZE 512,
GROUP 2 '/home/oracle/app/oracle/oradata/orcl/redo02.log' SIZE 50M BLOCKSIZE 512,
GROUP 3 '/home/oracle/app/oracle/oradata/orcl/redo03.log' SIZE 50M BLOCKSIZE 512
DATAFILE
'/home/oracle/app/oracle/oradata/orcl/system01.dbf',
'/home/oracle/app/oracle/oradata/orcl/sysaux01.dbf',
'/home/oracle/app/oracle/oradata/orcl/undotbs01.dbf',
'/home/oracle/app/oracle/oradata/orcl/users01.dbf',
'/home/oracle/app/oracle/oradata/orcl/example01.dbf'
CHARACTER SET AL32UTF8
;
VARIABLE RECNO NUMBER;
EXECUTE :RECNO := SYS.DBMS_BACKUP_RESTORE.SETCONFIG('CONTROLFILE AUTOBACKUP','ON');
RECOVER DATABASE USING BACKUP CONTROLFILE
ALTER DATABASE OPEN RESETLOGS;
ALTER TABLESPACE TEMP ADD TEMPFILE '/home/oracle/app/oracle/oradata/orcl/temp01.dbf'
SIZE 30408704 REUSE AUTOEXTEND ON NEXT 655360 MAXSIZE 32767M;

note: Specify RESETLOGS if you want Oracle to ignore the contents of the files listed in the LOGFILE clause

4.3 When to use above script to recreate controlfile?
Use this create controlfile command to re-create a control file only if:

All copies of your existing control files have been lost through media failure.
You want to change the name of the database.( use 'set database')
You want to change the maximum number of redo log file groups, redo log file
members, archived redo log files, datafiles, or instances that can concurrently
have the database mounted and open.

4.4 how to recreate it?

$ sqlplus / as sysdba @/tmp/ctl.sql

5. pfile/spfile parameters : control_files and CONTROL_FILE_RECORD_KEEP_TIME
CONTROL_FILE_RECORD_KEEP_TIME specifies the minimum number of days before a reusable record in the control file can be reused. In the event a new record needs to be added to a reusable section and the oldest record has not aged enough, the record section expands. If this parameter is set to 0, then reusable sections never expand, and records are reused as needed.

6. References:
1. controlfile and datafile header structure:

controlfile structure:
select * from V$CONTROLFILE_RECORD_SECTION

datafile header:
select * from v$datafile_header
2. create controlfile script usage -

Part II - redo logs are damaged
Missing member of any online log group
If the redo logs are mirrored, one or more of the mirrored copies are lost, but at least one good copy of each online redo log remains, Oracle will open the database without any errors displayed to the terminal. The only error will be a message like the following one in the alert log:
Errors in file /db/Oracle/admin/crash/bdump/crash_lgwr_10302.trc:
ORA-00313: open failed for members of log group 2 of thread 1

All members of any online log group are corrupted
However, if all members of any online log group are corrupted, Oracle will complain and the database will not open. The error might look something like this:
ORA-00327: log 2 of thread 1, physical size less than needed
ORA-00312: online log 2 thread 1: '/db/Oracle/b/oradata/crash/redocrash02.log'
ORA-00312: online log 2 thread 1: '/db/Oracle/a/oradata/crash/redocrash03.log'

Missing all members of any online log group
A similar problem is if all members of an online log group are missing. Oracle will complain and the database will not open. The error looks something like this:
ORA-00313: open failed for members of log group 2 of thread 1
ORA-00312: online log 2 thread 1: '/db/Oracle/b/oradata/crash/redocrash02.log'
ORA-00312: online log 2 thread 1: '/db/Oracle/a/oradata/crash/redocrash03.log'

1. controlfile and redo logs
The "create controlfile" will be useful only if all the data files and online redo logs are in place. The data files can be older versions that were restored from backup, since they will be rolled forward by the media recovery. However, the online redo logs must be current and intact for the "create controlfile" script to work.

Note: If redo log file lost, consider to use _ALLOW_RESETLOGS_CORRUPTION = TRUE in
spfile/pfile for creating controlfile to work.
ALLOW_RESETLOGS_CORRUPTION=TRUE allows database to open without consistency checks.
This may result in a corrupted database

2. If both controlfile and redo log are damaged:
Firstly, assume we have done rman full database backup:
rman> backup database plus archivelog;

then all controlfiles and all the current redo log group members are damaged, we can do this:

sql> shutdown abort;
rman target /
rman> startup nomount;
rman> restore controlfile from autobackup;
rman> restore database
note: If you recreate controlfile by using script, then it doesn't find rman backup from flash recovery area by Oracle itself, you might need to run " catalog start with 'rman latest backupset path' "

rman> recover database until cancel using backup controlfile;
or
rman> recover database until scn SCNnumber; (prompted from above)
rman> alter database open resetlogs;

2. identify damaged datafile or redo log file
sql > select name from v$datafile;
sql > select group#, member from v$logfile;

The main thing to look for is a missing file or a zero length file. And, each redo log file within a log group should have the same modification time.

Something else to look for is one or more files that have a modification time that is newer than the newest online redo log file. If a data file meets any one of these conditions, it must be restored from backup.

The "create controlfile" script requires that all online redo logs be present. If even only one log group is completely damaged, it will not be useful after recreating controlfile.

3. run controlfile creation script with readonly tablespace

modify backup controlfile to trace result as follows:
---------------
# The following commands will create a new controlfile and use it
# to open the database.
# Data used by the recovery manager will be lost. Additional logs may
# be required for media recovery of offline data files. Use this
# only if the current version of all online logs are available.

STARTUP NOMOUNT
CREATE CONTROLFILE REUSE DATABASE "CRASH" NORESETLOGS ARCHIVELOG
MAXLOGFILES 32 # maximum logfile groups
MAXLOGMEMBERS 2
MAXDATAFILES 30
MAXINSTANCES 8
MAXLOGHISTORY 843
LOGFILE
GROUP 1 '/db/a/oradata/crash/redocrash01.log' SIZE 500K,
GROUP 2 '/db/b/oradata/crash/redocrash02.log' SIZE 500K,
GROUP 3 '/db/c/oradata/crash/redocrash03.log' SIZE 500K
DATAFILE
'/db/a/oradata/crash/system01.dbf',
'/db/a/oradata/crash/rbs01.dbf',
'/db/a/oradata/crash/temp01.dbf',
'/db/a/oradata/crash/tools01.dbf',
'/db/a/oradata/crash/users01.dbf'
;
# Recovery is required if any of the data files are restored backups,
# or if the last shutdown was not normal or immediate.
RECOVER DATABASE
# All logs need archiving and a log switch is needed.
ALTER SYSTEM ARCHIVE LOG ALL;
# Database can now be opened normally.
ALTER DATABASE OPEN;
# Files in read only tablespaces are now named.
ALTER DATABASE RENAME FILE 'MISSING00006' TO '/dba/oradata/crash/test01.dbf';
# Online the files in read only tablespaces.
ALTER TABLESPACE "TEST" ONLINE;

and temp tablespace creation script;
-----------------
sqlplus / as sysdba
sql> startup nomount; or alter datbase close;
sql> @ctl.sql

4 read-only data files
To find out if the control file is valid and has been copied to all of the correct locations, attempt to start up the database with the mount option.

Oracle does not allow read-only data files to be online during a 'recover database using backup controlfile' action. Therefore, if there are any read-only data files, take them offline by running:
alter database data file 'filename' offline;

To find out if there are any read-only data files, issue the following command on the mounted, closed database:
sql > select enabled, name from v$datafile;

5. recover the database using the backup control file.
Try to recover database normally first as it won't hurt anyway.If it doesn't work, then run
sql> recover database using backup controlfile

Oracle will request all archived redo logs since the time of the oldest restored data file
The most efficient way to roll through the archived redo logs is to have all of them sitting uncompressed in the directory that it suggests as the location of the first file apply online redo log.

Once Oracle rolls through all the archived redo logs, it may prompt for the online redo log. It does this by prompting for an archived redo log with a number that is higher than the most recent archived redo log available.

You can use 'strings redo1.log | head -3' to find out sequence number of redo log when recovering the database using a backup control file, it must be opened with the resetlogs option
sql> alter database open resetlogs;

6. recover non-critical(system) tablespace (not system, sysaux, undo tablespaces)

rman> alter tablespace tablespace_name1 offline;

rman> recover tablespace tablespace_name1 ;

rman > alter tablespace tablespace_name1 online;

7. alter database open
Mounting the database only checks the presence and consistency of the control files. If that works, opening the database is the next step. Doing so will check the presence and consistency of all data files, online redo log files, and any rollback segments.

There is only one copy of each data file, unlike online redo logs and control files that can be mirrored. So, statistically speaking, it's easier to lose one data file than to lose all mirrored copies of a log group or all mirrored copies of the control file.

8. damaged redo log groups:
If all members of a log group are damaged, there is a great potential for data loss. The entire database may have to be restored, depending on the status of the log group that was damaged, and the results of some attempts at fixing it.

Part III - data file damaged.
1. error message when data files are damaged.
Damaged rollback segment

If a rollback segment is damaged, the error will be like the following one:ORA-01545: rollback segment 'USERS_RS' specified not availableCannot open database if all rollback segments are not available.

Missing data file

ORA-01157: cannot identify data file 1 - file not foundORA-01110: data file 1: '/db/Oracle/a/oradata/crash/system01.dbf'
Corrupted data file

A corrupted data file can generate a number of different errors. For instance, it may mimic a missing data file:ORA-01157: cannot identify data file 1 - file not foundORA-01110: data file 1: '/db/Oracle/a/oradata/crash/system01.dbf'It may also completely confuse Oracle:ORA-00600: internal error code, arguments: [kfhcfh1_01], [0], [], [], [], [], [], []A corrupted data file may also cause a "failed verification check" error:ORA-01122: database file 1 failed verification checkORA-01110: data file 1: '/db/Oracle/a/oradata/crash/system01.dbf'
ORA-01200: actual file size of 1279 is smaller than correct size of 40960 blocks

9. damanaged undo datafile:
Since Oracle has to open the data files that contain this rollback segment before it can verify that the rollback segment is available, this error will not occur unless a data file has been taken offline. If Oracle encounters a damaged data file (whether or not it contains a rollback segment), it will complain about that data file and abort the attempt to open the database.

Remember that a rollback segment is a special part of a tablespace that stores rollback information. Rollback information is needed in order to undo (or rollback) an uncommitted transaction. Since a crashed database will almost always contain uncommitted transactions, recovering a database with a damaged rollback segment is a little tricky.

As previously mentioned, a damaged data file may be taken offline, but Oracle will not open the database without the rollback segment.

The strategy for dealing with this is to make Oracle believe that the rollback segment doesn't exist. That will allow the database to be brought online. However, there will be transactions that need to be rolled back that require this rollback segment. Since Oracle believes this rollback segment is no longer available, these rollbacks cannot occur. This means that the database may be online, but portions of it will not be available.

Above method is for temporary solution, the proper solution is to restore then recover undo tablespace. then open database 'startup mount' again.

11. media recovery
If any data files are restored from backup, the recover command will be needed. This command uses the archived and online redo logs to "redo" any transactions that have occurred since the time that the backup of a data file was taken. You can recover a complete database, a tablespace, or a data file by issuing the commands recover database, recover tablespace tablespace_name and recover data file data file_name, respectively

sql> startup mount;
sql> recover datafile '/db/Oracle/a/oradata/crash/datafile01.dbf'
or recover [database|tablespace|datafile] '/db/Oracle/a/oradata/crash/datafile01.dbf'

After receiving the recover command, Oracle prompts for the name and location of the first archived redo log that it needs. If that log, and all logs that have been made sinse that log, are online, uncompressed, and in their original location, enter the word AUTO. This tells Oracle to assume that all files that it needs are online. It can therefore automatically roll through each log that it needs.

At some point, it may ask for an archived redo log that is not available. This could mean some of the archived redo logs or online redo logs are damaged. If the file cannot be located or restored, enter CANCEL.

12. system tablespace is damaged
If the damaged file is part of the SYSTEM tablespace, an offline recovery is required.

All other missing data files can be recovered with the database online. Unfortunately, Oracle only complains that the data file is missing -- without saying what kind of data file it is.

You can find out by using this on the mounted, closed database:
sql>select name from v$datafile where status = 'SYSTEM' ;

Unlike other tablespaces, the SYSTEM tablespace must be available in order to open the database. Therefore, if any members of the system tablespace are damaged, they must be restored now. Before doing this, make sure that the database is not open. (It is okay if it is mounted.) To make sure, run the following command on the mounted, closed database:

sql> select status from v$instance;
mounted

Only when it's mounted state, not open, then restore the damaged files from the most recent backup available, after restored, recover tablespace system;
sql> recover tablespace system;

13. Damaged Non-System Data File?
To open a database with a damaged, non-system data file, take the data file offline.
sql> alter database datafile 'filename' offline;

If database is not at archivelog mode, The only thing Oracle does allow is to drop the data file entirely. This means, of course, that the tablespace that contains this file will have to be rebuilt from scratch. This is but one of the many reasons why a production instance should not be operating in no archive log mode.

sql> alter database datafile 'filename' offline drop;

14. bring datafile back on line
First find out which data files were taken offline.
sql>select name from v$datafile where status = 'OFFLINE';

Once the names of the data files that need to be restored are determined, restore them from the latest available backup. Once they are restored, recovery within Oracle can be accomplished in three different ways. These ways vary greatly in complexity and flexibility. Examine the following three media recovery methods and choose whichever one is best for you.

Datafile recovery:
sql> recover datafile 'datafile_name';
sql> alter database datafile 'datafile_name' online;

tablespace recovery:
sql> select file_name, tablespace_name from dba_data_files;
sql> alter tablespace tablespace_name1 offline;
sql> recover tablespace tablespace_name1 ;
sql> alter tablespace tablespace_name2 offline;
sql> recover tablespace tablespace_name2 ;

database recovery:
once database fiels are restored, then
sql> alter database close;
sql> recover database
auto
sql> alter databse open;

To make sure that all tablespaces and data files have been returned to their proper status,
sql> select name, status from v$datafile
sql> select member, status from v$logfile

15. Is There a Damaged Log Group?
When we refer to a damaged log group, we mean that all members of a log group are damaged. If at least one member of a mirrored log group is intact, Oracle opens the database and simply put an error message in the alert log. However, if all members of a log group are damaged, the database will not open, and the error will look something like this:

The first thing that must be determined is the status of the damaged log group. The
three possibilities are current, active, and inactive

sql> select group#, status from v$log;
current log means oracle is currently writing to it.
Active means just finished writing to it, but still needed for media recovery, this is shot period of time you can see this status.
Inactive means oracle will not use it in anyway.

Is the current online log damaged or not?
If the current online log group is damaged, there would be a message like the following when attempting to open the database:
ORA-00313: open failed for members of log group 2 of thread 1
ORA-00312: online log 2 thread 1: '/db/Oracle/b/oradata/crash/redocrash02.log'
ORA-00312: online log 2 thread 1: '/db/Oracle/a/oradata/crash/redocrash03.log'

This is the worst kind of failure to have because there will definitely be data loss. That is because the current online log is required to restart even a fully functioning database. The current control file knows about the current online log and will attempt to use it. The only way around that is to restore an older version of the control file. Unfortunately, you can't restore only the control file because the data files would then be more recent than the control file. The only remaining option is to restore the entire database.

Is an Active online redo log damaged?
Remember that an ACTIVE log is one that is still needed for recovery. The reason that it is still needed is because a checkpoint has not flushed all changes from shared memory to disk. Once that happens, this log will no longer be needed.

The way to attempt to recover from active redo log files damage is to perform a checkpoint. If it is successful, the database should open successfully. To perform a checkpoint, issue the following command on the mounted, closed database:
sql> alter system checkpoint local;

no matter above command is successful or not, try to open database; if not, you have to recover entire database.

Is an inactive online redo log damaged?
In comparison, this one should be a breeze. An INACTIVE log is not needed by Oracle. If it is not needed, simply drop it and add another in its place.

sql> select member from v$logfile where GROUP# = '2 ;
sql> alter database drop logfile group 2 ;
sql> alter database add logfile group 2 ('/logs1redolog01.dbf', '/logs2redolog01.dbf', '/logs4redolog01.dbf') size 500K ;

16. damanaged undo segments which caused some datafile is damaged.

Because of the unique nature of damaged rollback segments, there are two choices for recovery. The first is to get the database open sooner, but that may leave it only partially functional for a longer period of time. The second choice takes a little longer to open the database, but once it is open it will not have data files that are needed for this rollback segment. Which is more important: getting even a partially functional database open as soon as possible, or not opening the database until all rollback segments are available? The latter is more prudent, but the former may be more appropriate to the environment.

a. uncomment in spfile for undo tablespace, then open it temporarily
Simply delete that part of the line in spfile/pfile for undo tablespace part. First, shut down Oracle completely (this includes un-mounting it as well). Then copy and comment the rollback segment line in the initORACLE_SID.ora file

b. recover undo tablespace first then 'startup mount'

The first thing that must be determined is which tablespace the damaged rollback segment is in. Unfortunately, there is no fixed view that contains this information. That means that it will have to be discovered through common sense and deduction. First, remember that this error is not displayed unless a data file has been taken offline. To get a complete list of files that were taken offline, run the following command on a mounted, closed database:

sql> select TS#, name from v$datafile where status = 'OFFLINE' ;
Then find out the name of the tablespace that contains this data file:
sql> select name from v$tablespace where TS# = '5' ;

How do we know which one contains the rollback segment? Unfortunately, there is no way to be sure while the database is closed. That is why it is very helpful to put the rollback segments in dedicated tablespaces that have names that easily identify them as such. It's even more helpful if the data files are named something helpful as well. For example, create a separate tablespace called ROLLBACK_DATA, and call its data files rollback01.dbf, rollback02.dbf, etc. That way, anyone that finds himself in this scenario will know exactly which data files contain rollback data.

17. alter database open resetlogs
This command causes Oracle to open the database after clearing all contents of the online redo log files. Since there is no way to undo this step, it is a good idea to make copies of the online redo log files now.

sql> select member from v$logfile; # to find out name of redo logs
sql> alter database open resetlogs;

18. References:
http://www.orafaq.com/wiki/Oracle_database_Backup_and_Recovery_FAQ

19. RMAN commands:
http://docs.oracle.com/cd/B28359_01/backup.111/b28270/rcmconfb.htm#i1015217
configure backup optimisation on;

When you turn on backup optimization, all backup commands will skip backups of any file if it has not changed and if it has already been backed up to the allocated device type. It includes read only tablespaces.

select property_name, property_value from database_properties where property_name like '%DEFAULT%'

DEFAULT_TEMP_TABLESPACE TEMP
DEFAULT_PERMANENT_TABLESPACE USERS
DEFAULT_EDITION ORA$BASE
DEFAULT_TBS_TYPE SMALLFILE

Linux SysAdmin and Oracle/MySQL DBA

Oracle database recovery procedures

Total Pageviews

Bookmark

Search This Blog

Followers

keywords

All Posts

About me