Storage Performance - IOPS


Jephe Wu - http://linuxtechres.blogspot.com

Objective: understanding IOPS


When we talk about the performance of a system, we usually mean CPU, memory, network bandwidth and disk I/O (IOPS).

What's IOPS?
IOPS - I/O operations per second, sometimes also expressed as transactions per second (tps) - http://en.wikipedia.org/wiki/IOPS

Based on http://vmtoday.com/2009/12/storage-basics-part-ii-iops/,

IOPS = 1000/(seek latency + rotational latency)    (latencies in milliseconds)

How to calculate?
IOPS = 1/(seek time + rotational latency of a single disk)    (times in seconds)
One I/O can be either a read or a write. The size of one I/O is determined by the application, not by the storage itself.
Normally the I/O unit size is 4K or 8K, but it can be up to 64K.


The default db_block_size for Oracle on Linux is 8K

Total IOPS = IOPS per drive * total number of drives
e.g. (the following example is from https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/5/pdf/Configuration_Example_-_Oracle_HA_on_Cluster_Suite/Red_Hat_Enterprise_Linux-5-Configuration_Example_-_Oracle_HA_on_Cluster_Suite-en-US.pdf)


There are 3 main factors in calculating IOPS or I/O's Per Second:
• Rotational Speed – AKA spindle speed (RPM)
• Average Latency – Time for the sector being accessed to rotate under an r/w head
• Average Seek – Time it takes for the hard drive's r/w head to position itself over the track to be read or written.

4 x 1TB 10k RPM SAS drives (RAID 0)
Avg. Latency = 3 ms
Avg. Seek = 4.45 ms (rounded to 4.5 ms below)
1/(0.003 + 0.0045) ≈ 133 IOPS per drive
Total IOPS = 4 * 133 IOPS = 532 IOPS
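The arithmetic above can be reproduced with a short shell sketch (the drive parameters are the example values from this post, not measurements):

```shell
# per-drive IOPS from average rotational latency and seek time (in seconds),
# then total IOPS for a 4-drive RAID 0 set -- values from the example above
seek=0.0045
latency=0.003
drives=4
per_disk=$(awk -v s="$seek" -v l="$latency" 'BEGIN { printf "%d", 1/(s+l) }')
total=$(( per_disk * drives ))
echo "per-drive IOPS: $per_disk, total IOPS: $total"
```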

A high-performance disk is capable of about 50 to 70 random, or up to 160 sequential, I/O operations per second.

How to choose better Disk to achieve high IOPS

  • 15k RPM (spindle)
  •  NCQ (Native Command Queuing) - SATA  or Tagged Queuing (FCP/SAS)


In general, a 15K 72GB drive has better performance than a 10K 300GB drive.


How to know whether I/O wait (wa in vmstat) is acceptable?
A rough rule of thumb: start investigating once the vmstat iowait percentage approaches 100/(number of CPU cores), since a single process fully blocked on I/O accounts for about 1/N of total CPU time on an N-core machine.
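That threshold is easy to compute; here is a sketch for a hypothetical 8-core box (on a live server, replace the hard-coded core count with `getconf _NPROCESSORS_ONLN`):

```shell
# iowait rule-of-thumb threshold: 100 / number-of-cores, in percent
cores=8   # hypothetical value; live box: cores=$(getconf _NPROCESSORS_ONLN)
threshold=$(awk -v c="$cores" 'BEGIN { printf "%.1f", 100/c }')
echo "start investigating iowait above ~${threshold}%"
```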

How to Check Filesystem Block Size on Linux? -
http://www.planetmy.com/blog/how-to-check-filesystem-block-size-on-linux/
Basically the options are tune2fs, dumpe2fs, blockdev, or creating a real file and checking it with du -sm.
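A quick alternative, as a sketch (GNU coreutils assumed), is `stat -f`, which reports the filesystem block size without needing root:

```shell
# print the fundamental block size of the filesystem holding /
bs=$(stat -f -c %S /)
echo "filesystem block size: $bs bytes"
```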



Checking Linux Server Hardware Information


Jephe Wu - http://linuxtechres.blogspot.com


Objective: get most of the hardware information without physically checking/opening the server
Environment: CentOS 6.1 64bit


Steps:
# check server hardware info
dmidecode
biosdecode
lshw



root@serv01:~/ # which lshw
/usr/sbin/lshw
root@serv01:~/ # rpm -qf /usr/sbin/lshw
lshw-B.02.12.01-1.el5


# check release info and architecture
lsb_release -a
uname -a


# list initrd.img file content 
lsinitrd /boot/initrd.img

[root@db01 boot]# which lsinitrd
/sbin/lsinitrd
[root@db01 boot]# rpm -qf /sbin/lsinitrd
dracut-004-53.el6.noarch


# check cpu
lscpu
cat /proc/cpuinfo

[root@db01 ~]# which lscpu
/usr/bin/lscpu
[root@db01 ~]# rpm -qf /usr/bin/lscpu
util-linux-ng-2.17.2-12.4.el6.x86_64

# check storage
cat /proc/partitions
df -h
multipath -ll
lspci -v
lssd
lssg
lsblk

[root@db01 ~]# which lssd
/usr/bin/lssd
[root@db01 ~]# rpm -qf /usr/bin/lssd
fibreutils-3.2-3.x86_64
[root@db01 ~]# which lssg
/usr/bin/lssg
[root@db01 ~]# rpm -qf /usr/bin/lssg
fibreutils-3.2-3.x86_64


# check additional PCI cards: which physical slot they occupy, VGA card memory, etc.
lspci -vvv
.....
Physical Slot: 1
.....


# check relationships between files and processes
lsof | grep filename


# drivers and modules info
lsmod




# check memory
cat /proc/meminfo
free [-m|-g]


# check boot info
dmesg
less /var/log/messages


# check usb and pcmcia info
lsusb [-v]
lspcmcia

# others
lshal
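The commands above can be wrapped into a small survey script. This is only a sketch — tool availability varies by distro, so missing tools are reported and skipped rather than treated as errors:

```shell
# print the first few lines from each hardware-info tool that is installed
count=0
for tool in lscpu lsblk lspci dmidecode lsusb; do
  count=$((count+1))
  if command -v "$tool" >/dev/null 2>&1; then
    echo "== $tool =="
    "$tool" 2>/dev/null | head -5
  else
    echo "== $tool: not installed =="
  fi
done
echo "== uname -a =="
uname -a
```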

Basic Oracle DBA Task List

Jephe Wu - http://linuxtechres.blogspot.com

Objective: some basic Oracle DBA daily task scripts and performance checking


Performance issues: most (more than 90%) of performance issues are caused by the application. Fewer (less than 10%) are caused by resource limitations, OS/RDBMS configuration and other reasons.

  • CPU bottleneck 
According to page http://www.dba-oracle.com/t_high_cpu.htm

It is not a cause for concern when the user + system CPU values approach 100 percent. This just means that the CPUs are working to their full potential. The only metric that identifies a CPU bottleneck is when the run queue (r value) exceeds the number of CPUs on the server.
  • IO bottleneck
The wa value of the vmstat command indicates what percentage of CPU time is spent waiting for database I/O, but it does not always indicate an I/O bottleneck.
  • Memory bottleneck 
According to page http://www.dba-oracle.com/t_tuning_cpu_usage_vmstat.htm

Oracle parameters such as sort_area_size can greatly increase the amount of RAM allocated to each connected user's PGA.

A UNIX process may page-in when the program is starting or is accessing parts of its code that it has not used before.

With UNIX virtual memory we are always anticipating running out of RAM, and a page-out is a way of being ready for a subsequent page-in.

In sum, page-out (po) operations are a normal part of virtual memory operation, but sustained page-in (pi) operations indicate that the server has excessive RAM demands, because Linux only moves the least-used parts of memory out to swap.

  • Cronjob scripts 


  1. use vmstat 10 to collect data; the following is the script:



-bash-3.2# cat /home/oracle/bin/vmstat.sh 
#!/bin/sh
DATE=`date +%Y%m%d`
# append a timestamp to every line read from vmstat
function vm {
  while read line
  do
    printf "%s" "$line"
    date '+ %m-%d-%Y %H:%M:%S'
  done
}
# delete logs older than 30 days, then sample vmstat every 10 seconds
find /var/adm/logs/vmstat.log* -type f -mtime +30 -exec rm -f {} \;
vmstat 10 | vm >> /var/adm/logs/vmstat.log.$DATE

then add a cronjob as follows to restart vmstat.sh at midnight, so each day's data goes into a new dated logfile:
0 0 * * * sleep 1;kill `ps -ef | grep -v grep | grep vmstat.sh | awk '{print $2}'`;sleep 3;nohup /home/oracle/bin/vmstat.sh &
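The vm filter above can be exercised on its own with canned input (the input line here is a made-up vmstat header, just for illustration):

```shell
# append a timestamp to each input line, as the vmstat.sh filter does
vm() {
  while read line; do
    printf "%s" "$line"
    date '+ %m-%d-%Y %H:%M:%S'
  done
}
stamped=$(echo "r  b   swpd   free" | vm)
echo "$stamped"
```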

2. most used CPU and wait time


#!/bin/sh
export ORACLE_SID=PROD
export ORACLE_BASE=/db01/oracle
export ORACLE_HOME=/db01/oracle/product/11.2.0
PATH=/usr/local/bin:/usr/bin:/etc:/usr/sbin:/usr/ucb:/home/oracle/bin:/usr/bin/X11:/sbin:/db01/oracle/product/11.2.0/bin
LD_LIBRARY_PATH=/db01/oracle/product/11.2.0/lib
export PATH LD_LIBRARY_PATH

DATE=`date +%Y%m%d`
LOGFILE=/var/adm/logs/oracle_perf.log.$DATE

find /var/adm/logs/oracle_perf.log.* -type f -mtime +30 -exec rm -f {} \;
sqlplus -S "/ as sysdba" @/home/oracle/bin/oracle_perf.sql >> $LOGFILE


oracle@prod1 (PROD)> cat /home/oracle/bin/oracle_perf.sql 
set line 32000
select to_char(sysdate, 'YYYYMMDD hh24:mi:ss') tm from dual;

-- sessions logged on within 4 hours and active in the last half hour: CPU and DB time
SELECT s.sid, s.serial#, p.spid as "OS PID", s.username, s.module, st.value/100 as "DB Time (sec)"
, stcpu.value/100 as "CPU Time (sec)", round(stcpu.value / st.value * 100,2) as "% CPU"
FROM v$sesstat st, v$statname sn, v$session s, v$sesstat stcpu, v$statname sncpu, v$process p
WHERE sn.name = 'DB time' -- CPU
AND st.statistic# = sn.statistic#
AND st.sid = s.sid
AND  sncpu.name = 'CPU used by this session' -- CPU
AND stcpu.statistic# = sncpu.statistic#
AND stcpu.sid = st.sid
AND s.paddr = p.addr
AND s.last_call_et < 1800 -- active within last 1/2 hour
AND s.logon_time > (SYSDATE - 240/1440) -- sessions logged on within 4 hours
AND st.value/100 > 30 order by st.value;

-- 'db file sequential read' time waited, for sessions logged on within 4 hours and active in the last half hour, sorted by time waited
SELECT s.sid, s.serial#, p.spid as "OS PID", s.username, s.module, se.time_waited
FROM v$session_event se, v$session s, v$process p
WHERE se.event = 'db file sequential read'
AND s.last_call_et < 1800 -- active within last 1/2 hour
AND s.logon_time > (SYSDATE - 240/1440) -- sessions logged on within 4 hours
AND se.sid = s.sid
AND s.paddr = p.addr
ORDER BY se.time_waited;
exit;

3. long queries

-- long queries: active, not background, running for more than 60 seconds

select s.username, s.type, s.sid, s.serial#, s.last_call_et seconds_running, q.sql_text
from v$session s
join v$sql q on s.sql_id = q.sql_id
where s.status = 'ACTIVE'
and s.type <> 'BACKGROUND'
and s.last_call_et > 60
order by s.sid, s.serial#;




4. checking blocking sessions





SQL> select process,sid, blocking_session from v$session where blocking_session is not null;


select
  object_name, 
  object_type, 
  session_id, 
  type,                 -- type of system/user lock
  lmode,                -- lock mode in which session holds lock
  request, 
  block, 
  ctime                 -- time since current mode was granted
from
  v$locked_object, all_objects, v$lock
where
  v$locked_object.object_id = all_objects.object_id AND
  v$lock.id1 = all_objects.object_id AND
  v$lock.sid = v$locked_object.session_id
order by
  session_id, ctime desc, object_name
/
SQL> select serial# from v$session where sid=123;

   SERIAL#
----------
       380

SQL> alter system kill session '123,380';   -- the argument is 'sid,serial#'

5. long-running operations (v$session_longops shows operations that run for more than 6 seconds)



sqlplus / as sysdba
sql> select * from v$session_longops;


select * from (
  select opname, target, sofar, totalwork,
         units, elapsed_seconds, message
  from v$session_longops order by start_time desc)
where rownum <=1;
http://www.gplivna.eu/papers/v$session_longops.htm

The following finds long-running operations that are still in progress, based on http://stackoverflow.com/questions/622289/how-to-check-oracle-database-for-long-running-queries?answertab=votes#tab-top

COLUMN percent FORMAT 999.99
SELECT sid, to_char(start_time,'hh24:mi:ss') stime,
       message, (sofar/totalwork)*100 percent
FROM v$session_longops
WHERE sofar/totalwork < 1
/
6. Oracle performance tuning: find sessions generating lots of redo


Methods: refer to How to Find Sessions Generating Lots of Redo or Archive Logs [ID 167492.1]

1. Query V$SESS_IO. This view contains the column BLOCK_CHANGES, which indicates how many blocks have been changed by the session. High values indicate a session generating lots of redo.


SELECT s.sid, s.serial#, s.username, s.program,
       i.block_changes
FROM v$session s, v$sess_io i
WHERE s.sid = i.sid          -- optionally add: AND i.block_changes > 1000
ORDER BY 5 desc, 1, 2, 3, 4;


Run the query multiple times and examine the delta between each occurrence of BLOCK_CHANGES. Large deltas indicate high redo generation by the session.
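That delta comparison can be sketched in shell, given two snapshot files of "sid block_changes" pairs captured some interval apart (the file names and values below are made up for illustration):

```shell
# two snapshots of sid/block_changes pairs, taken some interval apart
cat > snap1.txt <<'EOF'
101 5000
102 120000
EOF
cat > snap2.txt <<'EOF'
101 5100
102 380000
EOF
# first pass stores snapshot 1 in an array; second pass prints per-sid deltas
awk 'NR==FNR { a[$1]=$2; next } { printf "sid %s delta %d\n", $1, $2-a[$1] }' \
    snap1.txt snap2.txt > deltas.txt
cat deltas.txt
```

Here sid 102 stands out with a delta of 260000 block changes between snapshots.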


2) Query V$TRANSACTION. This view contains information about the amount of
   undo blocks and undo records accessed by the transaction (as found in the 
   USED_UBLK and USED_UREC columns).
  The query you can use is:

SELECT s.sid, s.serial#, s.username, s.program, 
 t.used_ublk, t.used_urec
FROM v$session s, v$transaction t
 WHERE s.taddr = t.addr
ORDER BY 5 desc, 6 desc, 1, 2, 3, 4;

   Run the query multiple times and examine the delta between each occurrence
   of USED_UBLK and USED_UREC. Large deltas indicate high redo generation by 
   the session.





Move or rename Oracle database datafiles to another directory

Jephe Wu - http://linuxtechres.blogspot.com

Objective: move/rename Oracle database datafiles to another directory
Environment: Oracle 11g R2 11.2.0.1.0 64bit on Oracle Linux 5.7 64bit


Steps:

1. shutdown Oracle DB and start it in mounted state
sqlplus / as sysdba
sql> shutdown immediate;
sql> startup mount;
sql> exit;

or
rman target /
rman> shutdown immediate;
rman> startup mount;

2. check the existing datafiles
sqlplus / as sysdba
sql> set line 32000
sql> select file#,name from v$datafile;
  1* select file#,name from v$datafile

     FILE# NAME

1 /home/oracle/devdb2/system01.dbf
2 /home/oracle/devdb2/sysaux01.dbf
3 /home/oracle/devdb2/undotbs01.dbf
4 /home/oracle/devdb2/users01.dbf


SQL> select file#,name from v$tempfile;

     FILE# NAME

2 /home/oracle/app/apache/oradata/devdb/temp01.dbf

SQL> select member from v$logfile;

MEMBER
/home/oracle/app/oracle/oradata/devdb/redo03.log
/home/oracle/app/oracle/oradata/devdb/redo02.log
/home/oracle/app/oracle/oradata/devdb/redo01.log

3. create new datafile directory 
$ mkdir /home/oracle/devdb3

4. use rman copy file to new directory

RMAN> copy datafile 1 to '/home/oracle/devdb3/system01.dbf';

Starting backup at 04-JUN-12
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=18 device type=DISK
channel ORA_DISK_1: starting datafile copy
input datafile file number=00001 name=/home/oracle/devdb2/system01.dbf
output file name=/home/oracle/devdb3/system01.dbf tag=TAG20120604T120342 RECID=1 STAMP=785073858
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:35
Finished backup at 04-JUN-12

RMAN> copy datafile 2 to '/home/oracle/devdb3/sysaux01.dbf';

Starting backup at 04-JUN-12
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile copy
input datafile file number=00002 name=/home/oracle/devdb2/sysaux01.dbf
output file name=/home/oracle/devdb3/sysaux01.dbf tag=TAG20120604T120506 RECID=2 STAMP=785073935
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:35
Finished backup at 04-JUN-12

RMAN> copy datafile 3 to '/home/oracle/devdb3/undotbs01.dbf';

Starting backup at 04-JUN-12
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile copy
input datafile file number=00003 name=/home/oracle/devdb2/undotbs01.dbf
output file name=/home/oracle/devdb3/undotbs01.dbf tag=TAG20120604T120611 RECID=3 STAMP=785073977
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:07
Finished backup at 04-JUN-12

RMAN> copy datafile 4 to '/home/oracle/devdb3/users01.dbf';

Starting backup at 04-JUN-12
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile copy
input datafile file number=00004 name=/home/oracle/devdb2/users01.dbf
output file name=/home/oracle/devdb3/users01.dbf tag=TAG20120604T120641 RECID=4 STAMP=785074001
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01
Finished backup at 04-JUN-12

5. use sqlplus to alter database rename file in controlfile

SQL> alter database rename file '/home/oracle/devdb2/system01.dbf' to '/home/oracle/devdb3/system01.dbf';

Database altered.

SQL> alter database rename file '/home/oracle/devdb2/sysaux01.dbf' to '/home/oracle/devdb3/sysaux01.dbf';

Database altered.

SQL> alter database rename file '/home/oracle/devdb2/undotbs01.dbf' to '/home/oracle/devdb3/undotbs01.dbf';

Database altered.

SQL> alter database rename file '/home/oracle/devdb2/users01.dbf' to '/home/oracle/devdb3/users01.dbf';

Database altered.

SQL> alter database open;

Database altered.

SQL> select name from v$datafile;

NAME
--------------------------------------------------------------------------------
/home/oracle/devdb3/system01.dbf
/home/oracle/devdb3/sysaux01.dbf
/home/oracle/devdb3/undotbs01.dbf
/home/oracle/devdb3/users01.dbf
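Steps 4 and 5 above are repetitive; a sketch that generates the RMAN copy commands and the SQL rename statements for all four datafiles (paths taken from the example above — adjust for your own layout):

```shell
# emit the RMAN 'copy datafile' and SQL 'rename file' statements
old=/home/oracle/devdb2
new=/home/oracle/devdb3
: > rman_copy.txt
: > rename.sql
i=1
for f in system01.dbf sysaux01.dbf undotbs01.dbf users01.dbf; do
  echo "copy datafile $i to '$new/$f';" >> rman_copy.txt
  echo "alter database rename file '$old/$f' to '$new/$f';" >> rename.sql
  i=$((i+1))
done
cat rman_copy.txt rename.sql
```

Feed rman_copy.txt to rman (with the database mounted) and rename.sql to sqlplus.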

6. move temporary tablespace files
  1* select * from v$tempfile

     FILE# CREATION_CHANGE# CREATION_     TS#     RFILE# STATUS  ENABLED    BYTES     BLOCKS CREATE_BYTES BLOCK_SIZE NAME

2    1876753 04-MAY-12       3  2 ONLINE  READ WRITE 15728640 1920 10485760 8192

/home/oracle/app/apache/oradata/devdb/temp01.dbf

SQL> alter tablespace temp add tempfile '/home/oracle/tempfile/temp01.dbf' size 10m autoextend on next 10m;

Tablespace altered.

SQL> alter tablespace temp drop tempfile '/home/oracle/app/apache/oradata/devdb/temp01.dbf';

Tablespace altered.

SQL> select name from v$tempfile;

NAME
/home/oracle/tempfile/temp01.dbf


7. move/rename redo log file

SQL> select member from v$logfile;

MEMBER
/home/oracle/app/oracle/oradata/devdb/redo03.log
/home/oracle/app/oracle/oradata/devdb/redo02.log
/home/oracle/app/oracle/oradata/devdb/redo01.log

SQL> select * from v$log;

    GROUP#    THREAD#  SEQUENCE#      BYTES  BLOCKSIZE  MEMBERS ARC STATUS       FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME
---------- ---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- --------- ------------ ---------
1    1      25   52428800   512 1 NO  CURRENT     2026606 04-JUN-12 2.8147E+14
2    1      23   52428800   512 1 YES INACTIVE     2024705 04-JUN-12    2025593 04-JUN-12
3    1      24   52428800   512 1 YES INACTIVE     2025593 04-JUN-12    2026606 04-JUN-12

SQL> alter database drop logfile group 3;

Database altered.

SQL> alter database add logfile group 3 '^C
(the add command above was started and then cancelled with Ctrl-C before completion)

SQL> select group#,member from v$logfile;

    GROUP# MEMBER
2 /home/oracle/app/oracle/oradata/devdb/redo02.log
1 /home/oracle/app/oracle/oradata/devdb/redo01.log

SQL> alter database add logfile group 3 '/home/oracle/redolog/redo03.log' size 10m;

Database altered.

SQL> alter database drop logfile group 2;

Database altered.

SQL> alter database add logfile group  2 '/home/oracle/redolog/redo02.log' size 10m;

Database altered.

SQL> select * from v$log;

    GROUP#    THREAD#  SEQUENCE#      BYTES  BLOCKSIZE  MEMBERS ARC STATUS       FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME
---------- ---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- --------- ------------ ---------
1    1      25   52428800   512 1 NO  CURRENT     2026606 04-JUN-12 2.8147E+14
2    1       0   10485760   512 1 YES UNUSED   0  0
3    1       0   10485760   512 1 YES UNUSED   0  0

SQL> alter database drop logfile group 1;
alter database drop logfile group 1
*
ERROR at line 1:
ORA-01623: log 1 is current log for instance orcl (thread 1) - cannot drop
ORA-00312: online log 1 thread 1: '/home/oracle/app/oracle/oradata/devdb/redo01.log'


SQL> alter system switch logfile;

System altered.

SQL> select * from v$log;

    GROUP#    THREAD#  SEQUENCE#      BYTES  BLOCKSIZE  MEMBERS ARC STATUS       FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME
---------- ---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- --------- ------------ ---------
1    1      25   52428800   512 1 YES ACTIVE     2026606 04-JUN-12    2028373 04-JUN-12
2    1      26   10485760   512 1 NO  CURRENT     2028373 04-JUN-12 2.8147E+14
3    1       0   10485760   512 1 YES UNUSED   0  0



keep running 'alter system switch logfile' until group 1 is no longer CURRENT or ACTIVE, then drop group 1:

SQL> select * from v$log;

    GROUP#    THREAD#  SEQUENCE#      BYTES  BLOCKSIZE  MEMBERS ARC STATUS       FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME
---------- ---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- --------- ------------ ---------
1    1      28   52428800   512 1 YES INACTIVE     2028420 04-JUN-12    2028430 04-JUN-12
2    1      29   10485760   512 1 NO  CURRENT     2028430 04-JUN-12 2.8147E+14
3    1      27   10485760   512 1 YES INACTIVE     2028394 04-JUN-12    2028420 04-JUN-12


SQL> alter database drop logfile group 1;

Database altered.

SQL> alter database add logfile group 1 '/home/oracle/redolog/redo01.log' size 10m;

Database altered.

SQL> select member from v$logfile;

MEMBER

/home/oracle/redolog/redo03.log
/home/oracle/redolog/redo02.log
/home/oracle/redolog/redo01.log



How to setup physical standby Oracle database


Jephe Wu - http://linuxtechres.blogspot.com

Environment: Oracle Enterprise Linux 5.7 64bit; a 2-node RAC Oracle db 10.2.0.1 as the primary db; the physical standby host has only the Oracle database software installed.

Objective: create a physical standby Data Guard environment with a one-node standby db. We have to create it from scratch, with only the database software installed; an RMAN backup from the primary db will be used to restore the database on the physical standby.

Typical configuration and some facts:

A Data Guard configuration consists of one primary database and up to nine standby databases.

Physical standby (not logical standby): identical to the primary database on a block-for-block basis.
Redo apply (not SQL apply): redo shipped from the primary (LGWR/LNSn) is written into the standby redo logs on the physical standby server and applied there.
Maximum performance: the default mode; the highest level of data protection possible without affecting the performance of the primary database. A transaction can commit as soon as its redo is written to the local redo log; the redo is then shipped asynchronously to the physical standby database.

LGWR ASYNC Archival with Network Server (LNSn) Processes:
http://docs.oracle.com/cd/B19306_01/server.102/b14239/log_transport.htm#i1265779

realtime apply from the standby redo log directly on the physical standby db:
sql>alter database recover managed standby database using current logfile disconnect

The physical standby database is in mount or read-only state at all times, so it does not generate any redo; therefore the online redo logs are not used on the physical standby database (as long as it is acting as a standby database).

Even though the online redo logs are not used on a physical standby database, the status within v$log is still updated through the standby control file; the entries reflect actions on the primary database, e.g. log switches.
On standby db, run the following to check online redo log status

sqlplus / as sysdba
sql>set line 32767;
sql>select * from v$log;

Requirements:
Must be same:
a. software edition (Oracle enterprise Edition only for both primary and standby)
b. use same 'compatible' initialization parameter
c. primary must be archivelog mode, and primary db must turn on 'force logging' mode
d. set up symmetrically on primary and standby db, e.g. ASM and OMF on both primary and standby
e. The archival directories on the standby database should also be identical between sites, including size and structure.
f. LGWR ASYNC transport mode is recommended for all databases.

Terms:

  • a. db_file_name_convert/log_file_name_convert (the latter applies only to online redo logs, not to archived redo logs, whose location is managed via pfile/spfile)

When the Oracle instance starts, it reads the pfile/spfile to create the running environment (instance), then reads the controlfile to locate the datafiles and online redo log files. If db_file_name_convert or log_file_name_convert is set in the pfile/spfile, the instance internally converts the datafile and online redo log paths first, then reads the files at the converted locations.
If these parameters are not set, you can rename the files at OS level first, then run:
SQL> alter database rename file '<old Filespecification or file#>' to '<new Filespecification>';
  • The SYNC and ASYNC attributes 
When you specify the LGWR attribute but you do not specify either the SYNC or ASYNC attribute, the default is SYNC.

When you specify the ARCH attribute, only the SYNC attribute is valid. An error message is returned if you specify the ARCH and ASYNC attributes together.

The SYNC attribute specifies that network I/O is to be performed synchronously for the destination, which means that once the I/O is initiated, the log writer process waits for the I/O to complete before continuing. The SYNC attribute is a requirement for setting up a no-data-loss environment, because it ensures the redo records are successfully transmitted to the standby destination before continuing.

The ASYNC attribute specifies that network I/O is to be performed asynchronously for the destination.
----------------

1. The DMON (Data Guard broker monitor) process starts when you run: alter system set dg_broker_start=true

2. the standby redo log is only used when the database is running in the standby role.

3. All databases in the Data Guard configuration must use a password file, and the SYS password contained in this password file must be identical on all systems. This authentication can be performed even if Oracle Advanced Security is not installed, and provides some level of security when shipping redo.
  • Applying Redo Data to Physical Standby Databases
By default, the redo data is applied from archived redo log files. When performing Redo Apply, a physical standby database can use the real-time apply feature to apply redo directly from the standby redo log files as they are being written by the RFS process.

Note that log apply services cannot apply redo data to a physical standby database when it is opened in read-only mode.
  • Starting Redo Apply

To start log apply services on a physical standby database, ensure the physical standby database is started and mounted, then start Redo Apply with:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE;

You can specify that Redo Apply runs as a foreground session or as a background process, and enable it with real-time apply.

To start Redo Apply in the foreground, issue the following SQL statement:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE;
If you start a foreground session, control is not returned to the command prompt until recovery is canceled by another session.

To start Redo Apply in the background, include the DISCONNECT keyword on the SQL statement. For example:

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;
This statement starts a detached server process and immediately returns control to the user. While the managed recovery process is performing recovery in the background, the foreground process that issued the RECOVER statement can continue performing other tasks.

This does not disconnect the current SQL session.

To start real-time apply, include the USING CURRENT LOGFILE clause on the SQL statement. For example:

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE;
or
sql> alter database recover managed standby database using current logfile disconnect


  • Stopping Redo Apply


To stop Redo Apply, issue the following SQL statement in another window:

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
==============
the following is from Oracle support - Usage, Benefits and Limitations of Standby Redo Logs (SRL) [ID 219344.1]
======
step by step to create physical standby:

How to create dataguard configuration -
http://www.youtube.com/watch?v=5keahpkzLtM&feature=related

  • on primary: vi /etc/hosts to add FQDN
  • force logging:
sql> SELECT force_logging FROM v$database;
SQL> ALTER DATABASE FORCE LOGGING;
  • password file:
Every database in a Data Guard configuration must use a password file, and the password for the SYS user must be identical on every system for redo data transmission to succeed.
You can just copy the password file from primary to standby once created.

$cd $ORACLE_HOME/dbs/
$ORAPWD FILE=orapwlivedb1 PASSWORD=oracle ENTRIES=5

[Windows]
orapwd file=C:\oracle\product\10.2.0\db_1\database\PWDOrcl.ora password=oracle entries=10 [force=y].


ORAPWD FILE=filename [ENTRIES=numusers] 
   [FORCE={Y|N}] [IGNORECASE={Y|N}] [NOSYSDBA={Y|N}]
   
orapwd is a utility used to create a password file for an Oracle Database. 


Command arguments are summarized below.

FILE        Name to assign to the password file. See your operating system documentation for name requirements. You must supply a complete path; if you supply only a file name, the file is written to the current directory.
ENTRIES     (Optional) Maximum number of entries (user accounts) to permit in the file. Set it higher than currently needed so you won't have to recreate the password file later.
FORCE       (Optional) If y, permits overwriting an existing password file.
IGNORECASE  (Optional) If y, passwords are treated as case-insensitive.
NOSYSDBA    (Optional) For Data Vault installations. See the Data Vault installation guide for your platform for more information.


checking which users are in the password file:
select * from v$pwfile_users;

  • add standby redo logs for primary db and standby db:


Use the following sql to query the current redo log file size and the number of members.
select * from gv$log;
select * from v$log;
select * from v$standby_log;
SELECT GROUP#,THREAD#,SEQUENCE#,ARCHIVED,STATUS FROM V$STANDBY_LOG;
select * from v$logfile;
SELECT PROCESS, STATUS, THREAD#, SEQUENCE#, BLOCK#, BLOCKS FROM V$MANAGED_STANDBY;


Let's say a 2-node RAC is the primary: each thread has 2 logfile groups and each group contains 2 members, 4 logfile groups in total. You can use the formula

(maximum number of logfiles per thread + 1) * maximum number of threads

to calculate the number of standby redo logfile groups.
For the primary db itself, you can use 4+1 standby redo log groups.
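The formula, computed for this example (2 logfile groups per thread, 2 threads):

```shell
# standby redo log groups = (max logfiles per thread + 1) * max threads
logs_per_thread=2
threads=2
srl_groups=$(( (logs_per_thread + 1) * threads ))
echo "standby redo log groups needed: $srl_groups"
```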

sql> SELECT max(group#) from v$logfile; -- find the current maximum log group number

sql> alter database add standby logfile group 6 ('/database/livedb/logfile/redo06a.log','/database/livedb/logfile/redo06b.log') size 100m

some example:
----------
#############logfile group###########
alter database add standby logfile  group 12 ('+DATA','+RECOVERY') size 500m
Completed: alter database add standby logfile  group 12 ('+DATA','+RECOVERY') size 500m
alter database add standby logfile  group 14 ('+DATA','+RECOVERY') size 500m
Completed: alter database add standby logfile  group 14 ('+DATA','+RECOVERY') size 500m
alter database add standby logfile  group 15 ('+DATA','+RECOVERY') size 500m
Completed: alter database add standby logfile  group 15 ('+DATA','+RECOVERY') size 500m
alter database add standby logfile  group 17 ('+DATA','+RECOVERY') size 500m
Completed: alter database add standby logfile  group 17 ('+DATA','+RECOVERY') size 500m
alter database add standby logfile  group 19 ('+DATA','+RECOVERY') size 500m
Completed: alter database add standby logfile  group 19 ('+DATA','+RECOVERY') size 500m
alter database add standby logfile  group 20 ('+DATA','+RECOVERY') size 500m
------------

if it's group, use:
ALTER DATABASE ADD STANDBY LOGFILE GROUP 10
('/oracle/dbs/log1c.rdo','/oracle/dbs/log2c.rdo') SIZE 500M;

--------------
######without loggroup, just single logfile#########

ALTER DATABASE ADD STANDBY LOGFILE
'/u01/app/oracle/oradata/orcl/srl01.log'
SIZE 52428800
/
ALTER DATABASE ADD STANDBY LOGFILE
'/u01/app/oracle/oradata/orcl/srl02.log'
SIZE 52428800
/
------------------


If you encounter the error ORA-01156: recovery in progress may need access to files, it means you cannot add a standby redo log while the database is in managed recovery mode.
To fix it: exit managed recovery mode, add the log, then restart managed recovery:

alter database recover managed standby database cancel;
alter database add standby logfile group <n> (<filename>) size <size>;
alter database recover managed standby database using current logfile disconnect

sql> select * from v$standby_log;   -- check it again



  • Enable archive log mode:

Set the first log archive destination as follows (Windows example):
sql> alter system set log_archive_dest_1='LOCATION=c:\oracle\oradb\arch' scope=both;


  • modify pfile parameter for primary server

DB_NAME Specifies the database name. e.g livedb

DB_UNIQUE_NAME Specify a unique name for each database. Does not change even if DG roles change.
CONTROL_FILES Specifies the local path name for the control files on the primary database.
LOG_ARCHIVE_CONFIG Uses the DG_CONFIG attribute to list the DB_UNIQUE_NAME of the primary and standby databases.
LOG_ARCHIVE_DEST_1 Defaults to archive destination for the local archived redo log files.
LOG_ARCHIVE_DEST_2 Valid only for the primary role, this destination transmits redo data to the remote physical standby destination standby1.
REMOTE_LOGIN_PASSWORDFILE Must be EXCLUSIVE or SHARED if a remote login password file is used (default = EXCLUSIVE)
LOG_ARCHIVE_DEST_STATE_n Must be ENABLE (default)
-------------
sqlplus / as sysdba
sql> show parameter spfile;
sql> create pfile from spfile;
sql>exit;
cd $ORACLE_HOME/dbs
vi initlivedb1.ora
add settings according to the following page:
http://docs.oracle.com/cd/B19306_01/server.102/b14239/create_ps.htm#i63561

*.log_archive_dest_1='location=/recovery/livedb/archivelog VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=livedb'
*.log_archive_dest_2='SERVICE=drdb LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=drdb'
*.LOG_ARCHIVE_DEST_STATE_1='ENABLE'
*.LOG_ARCHIVE_DEST_STATE_2='ENABLE'
*.LOG_ARCHIVE_FORMAT='%t_%s_%r.arc'
*.LOG_ARCHIVE_MAX_PROCESSES=30
*.remote_login_passwordfile='exclusive'


*.FAL_CLIENT='livedb'
*.FAL_SERVER='drdb'
*.STANDBY_FILE_MANAGEMENT='AUTO'
Note: Set to AUTO so when datafiles are added to or dropped from the primary database, corresponding changes are made automatically to the standby database.
*.DB_FILE_NAME_CONVERT='drdb','livedb'
*.LOG_FILE_NAME_CONVERT='/database/drdb/','/database/livedb/','/recovery/drdb/','/recovery/livedb/'
----------

When you are working on the physical standby node, you need to configure at least the following in the pfile:
db_name=something
db_unique_name=something for this instance only
-------


Notes: about XDB
a. *.dispatchers is for shared server configuration. Dedicated server is always enabled by default; configure dispatchers only if you want shared server.
sql> alter system set dispatchers='(PROTOCOL=TCP) (SERVICE=drdbXDB)';
then
sql> shutdown immediate;
sql> startup;
$ lsnrctl stop
$ lsnrctl start
$ lsnrctl status # check if drdbXDB is there

b. how to disable XDB and XPT
XPT: since 10.2, a service named 'xx_XPT' (where xx is the normal service name) is registered with the listener. It is used in Data Guard environments; add __dg_broker_service_names='' to the spfile to disable XPT if you don't use Data Guard.
XDB: XML DB, which has enabled file transfer between a normal cooked file system and ASM since Oracle 10gR1.
To enable file transfer between them, do this:
sql> desc resource_view # make sure it returns output
sql> select dbms_xdb.gethttpport from dual;
sql> select dbms_xdb.getftpport from dual;
sql> exec dbms_xdb.sethttpport(8080);
sql> exec dbms_xdb.setftpport(2100);
ASM virtual folder is created in /sys/asm.

other methods for accessing ASM files are asmcmd (since Oracle 11g) and the dbms_file_transfer package. Google 'transferring files between ASM and OS file system' to find out more.


  • check archive logging is enabled on primary


sql> archive log list;
sql> shutdown immediate;
sql> startup mount;
sql> alter database archivelog;
sql> alter database open;
sql> archive log list;



for disabling it:
sql> shutdown immediate;
sql> startup mount;
sql> alter database noarchivelog;
sql> alter database open;
  • Prepare tnsnames.ora and listener.ora on primary db

append the following to $ORACLE_HOME/network/admin/tnsnames.ora

STANDBY1 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = standby1.example.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = standby1.example.com)
    )
  )

append the following to $ORACLE_HOME/network/admin/listener.ora

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = standby1.example.com)
      (ORACLE_HOME = /u01/app/oracle/product/11.2.0/dbhome_1)
      (SID_NAME = standby1)
    )
  )

note: use tnsping to check the connectivity

  • take rman backup on primary db:

rman target /
rman> show all;
rman> CONFIGURE CONTROLFILE AUTOBACKUP ON;
rman> backup database format '/tmp/bak/%U';


Note: you don't have to recover the database using archived logs; you can simply transfer this database backup, restore it on the standby server, then start 'ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE disconnect;' on the standby server.


or


rman> backup database plus archivelog;
or

RMAN>backup database format '/tmp/dbbkp/PRIMA_%U';
RMAN>backup archivelog all format '/tmp/dbbkp/PRIMA_ARC_%U';


  • create standby controlfile on primary:
RMAN> backup current controlfile for standby format '/tmp/ForStandbyCTRL.bck';
or
sql> ALTER DATABASE CREATE STANDBY CONTROLFILE AS '/tmp/standbyctl.ctl';






  • create drdb tnsname on primary:

---------------------
cd /u01
find . -name "tnsnames*"
vi the tnsnames.ora file found
add the drdb entry


  • Transfer backups to standby server

Copy the above backup files (datafiles, control file, pfile, etc.) to the secondary db at the exact same directory paths. Because rman uses the control file instead of a recovery catalog, the backupset catalog info is recorded in the control file.

  • startup with new parameters for primary db RAC:

sql> shutdown immediate; # shutdown both RAC databases first
sql> create pfile='/tmp/pfile' from spfile;
sql> create spfile from pfile='/tmp/pfile';
sql> startup;
ORACLE instance started.
Total System Global Area  440401920 bytes
Fixed Size    2021408 bytes
Variable Size  130025440 bytes
Database Buffers  306184192 bytes
Redo Buffers    2170880 bytes
ORA-01105: mount is incompatible with mounts by other instances
ORA-01677: standby file name convert parameters differ from other instance

Solution: shut down the other RAC instance first, then start the rac1 instance again;

sql> select * from v$database;
sql> select * from v$instance;

  • copy pfile to standby and build spfile
-----------------------------------------------
vi the pfile and modify it accordingly
sqlplus / as sysdba
sql> startup nomount;  # to test the pfile

*.log_archive_dest_1='location=/recovery/drdb/archivelog VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=drdb'
*.log_archive_dest_2='SERVICE=livedb LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=livedb'
*.LOG_ARCHIVE_DEST_STATE_1='ENABLE'
*.LOG_ARCHIVE_DEST_STATE_2='ENABLE'
*.LOG_ARCHIVE_FORMAT='%t_%s_%r.arc'
*.LOG_ARCHIVE_MAX_PROCESSES=30
*.LOG_FILE_NAME_CONVERT='/database/livedb/','/database/drdb/','/recovery/livedb/','/recovery/drdb/'
*.open_cursors=300
*.remote_login_passwordfile='exclusive'
*.STANDBY_FILE_MANAGEMENT='AUTO'


*.FAL_CLIENT='drdb'
*.FAL_SERVER='livedb'
*.db_unique_name='drdb'
*.DB_FILE_NAME_CONVERT='livedb','drdb'
*.db_name='livedb'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=drdbXDB)'

LOG_ARCHIVE_CONFIG='DG_CONFIG=(livedb,drdb)'
# or  alter system set LOG_ARCHIVE_CONFIG='DG_CONFIG=(livedb,drdb)'; after that.

Note: Specify the DG_CONFIG attribute on this parameter to list the DB_UNIQUE_NAME of the primary and standby databases in the Data Guard configuration; this enables the dynamic addition of a standby database to a Data Guard configuration that has a Real Application Clusters primary database running in either maximum protection or maximum availability mode. By default, the LOG_ARCHIVE_CONFIG parameter enables the database to send and receive redo; after a role transition, you may need to specify these settings again using the SEND, NOSEND, RECEIVE, or NORECEIVE keywords.

  • if the standby db is running on Windows, use the command below to create the service


winnt> oradim -new -sid standby1 -intpwd password -startmode manual

  • create password file on standby:

$ orapwd FILE=orapwlivedb1 PASSWORD=oracle ENTRIES=5

  • restore the control file for standby db
sqlplus / as sysdba
sql> startup nomount
sql> exit;

rman target /
rman> RESTORE STANDBY CONTROLFILE FROM '/tmp/ForStandbyCTRL.bck';

  • restore database for standby db

copy rman database backupset files, standby controlfile to standby db:

sqlplus / as sysdba
sql> startup mount;  # only reads the control file and checks integrity, without actually opening datafiles, redo logfiles etc.
sql> select name from v$datafile;
sql> select member from v$logfile;
sql> select * from v$log;
sql> select * from v$logfile;
sql> select name from v$tempfile;
# check that the filenames all have correct paths, then create the directories for all the paths above, plus the adump/bdump/udump directories referenced in the spfile.
All these directories must exist first. Check alertdrdb.log for alert messages about any missing directory.
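The directory creation can be scripted. A minimal sketch, assuming you have spooled the file paths from the queries above into a list (the sample paths below are placeholders standing in for real "select name from v$datafile;" output):

```shell
# Pre-create every directory that the standby datafiles/logfiles expect.
# Sample list standing in for the spooled query output:
printf '%s\n' /tmp/dgdemo/database/drdb/system01.dbf \
              /tmp/dgdemo/recovery/drdb/arch/1_1_1.arc > /tmp/datafile_paths.txt

while read -r f; do
  d=$(dirname "$f")               # strip the file name, keep the directory
  [ -d "$d" ] || mkdir -p "$d"    # create it only if it does not exist yet
done < /tmp/datafile_paths.txt
```

In a real run, replace the printf with the spooled output of the v$datafile / v$logfile / v$tempfile queries.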

rman target /
rman> catalog start with '/tmp/rman';
rman> restore database;
rman> recover database; # it may fail when it reaches the first missing redo log sequence; that is expected.

NOTE: The log files and the temp files are not copied. The log files are created on the standby database when
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT
runs for the first time.
The TEMP tablespace will be created without a temp file. This file will be created when the physical standby database is opened read-only.



Note:  in 11g, you can duplicate database through rman as follows:
------------
rman> duplicate target database for standby from active database spfile
set db_name='prmdb'
set db_unique_name='strdb'
set db_recovery_file_dest='e:\oracle\FRA'
set control_files='...,...'
set db_file_name_convert='....,','...'
set log_file_name_convert='....,','....'
----------



Some other rman usage, not necessarily related to dataguard:
RMAN> run {
  set until time "to_date('04-Aug-2004 00:00:00', 'DD-MON-YYYY HH24:MI:SS')";
  restore database;
  recover database [using backup control file until cancel] [noredo];
}


rman> alter database open resetlogs;

-------------
RMAN> run {
2> allocate channel dev1 type disk;
3> backup
4>   format '/app/oracle/archback/log_%t_%sp%p'
5>   (archivelog all delete input);
6> release channel dev1;
7> }


rman> LIST BACKUP OF ARCHIVELOG FROM TIME 'sysdate-1';
----------
use rman to duplicate database;
connect target sys/secure@origdb
connect catalog rman/rman@catdb
connect auxiliary /


run {
set newname for datafile 1 to '/ORADATA/u01/system01.dbf';
set newname for datafile 2 to '/ORADATA/u02/undotbs01.dbf';
set newname for datafile 3 to '/ORADATA/u03/users01.dbf';
set newname for datafile 4 to '/ORADATA/u03/indx01.dbf';
set newname for datafile 5 to '/ORADATA/u02/example01.dbf';


allocate auxiliary channel dupdb1 type disk;
set until sequence 2 thread 1;


duplicate target database to dupdb
logfile
  GROUP 1 ('/ORADATA/u02/redo01.log') SIZE 200k REUSE,
  GROUP 2 ('/ORADATA/u03/redo02.log') SIZE 200k REUSE;
}
-----------------------


  • create tnsname on standby db for primary instance and services:

-------------------------------
cd /u01
vi /etc/hosts to add primary RAC server vip ip address first

find . -name "tnsnames*"
vi the tnsnames.ora file found
add the livedb entry

note: remember to use vip domain name instead of server real ip address.

  • Enable flashback database after finishing configuration: for both primary and standby

on primary:
sqlplus / as sysdba
sql> shutdown immediate;
sql> startup mount;
sql> alter database flashback on;
sql> select flashback_on from v$database;
sql> alter database open;

on standby:
sqlplus / as sysdba
sql> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
sql> select flashback_on from v$database;
sql> alter database flashback on;
# gives an error if database flashback is already turned on.
sql> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE disconnect;


  • Starting Up a Physical Standby Database

----------------------
sql> startup; # starts the database, mounts the database as a physical standby database, and opens the database for read-only access.
sql> STARTUP MOUNT; # starts and mounts the database as a physical standby database, but does not open the database. Once mounted, the database can receive archived redo data from the primary database. You then have the option of either starting Redo Apply or real-time apply, or opening the database for read-only access.

For example:
Start and mount the physical standby database:
SQL> STARTUP MOUNT;
Start Redo Apply or real-time apply:
To start Redo Apply, issue the following statement:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;

To start real-time apply, issue the following statement:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE disconnect;

On the primary database, query the RECOVERY_MODE column in the V$ARCHIVE_DEST_STATUS view, which displays the standby database's operation as MANAGED_RECOVERY for Redo Apply and MANAGED REAL TIME APPLY for real-time apply.

Note: how to know which mode (realtime apply or not)
sql>select recovery_mode from v$archive_dest_status; #run on primary db



  • How to check if physical standby db is syncing with primary db properly:

use the following shell/sql scripts:

[oracle@ bin]$ more redolog_standby.sh
#!/bin/sh
ORACLE_BASE=/u01/app/oracle
export ORACLE_BASE
ORACLE_HOME=/u01/app/oracle/product/10.2.0/db_1
export ORACLE_HOME
ORACLE_SID=DRDB
export ORACLE_SID
PATH=/u01/app/oracle/product/10.2.0/db_1/bin:/u01/app/oracle/product/10.2.0/db_1/OPatch:/home/oracle/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:$ORACLE_HOME/bin
export PATH

sqlplus / as sysdba @/home/oracle/standby.sql > /home/oracle/standby.txt

tail -4  /home/oracle/standby.txt  | head -1 | sed -e 's#^ *##g' > /home/oracle/standby.txt.status

email -f jwu@domain.com -n standby -s "`cat /home/oracle/standby.txt.status` DB standby redo log" -r mail.domain.com  jwu@domain.com < /home/oracle/standby.txt
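The tail | head | sed pipeline in the script extracts the 4th line from the end of the spooled sqlplus report and strips its leading spaces (the status row used in the mail subject). A demo with a dummy 5-line spool:

```shell
# Dummy spool: line 2 plays the role of the status row the script wants.
printf 'heading\n   1  11104  11104  0\nrow2\nrow3\nrow4\n' > /tmp/standby_demo.txt

# tail -4 keeps the last 4 lines, head -1 takes the first of those
# (i.e. the 4th line from the end), sed strips leading spaces.
tail -4 /tmp/standby_demo.txt | head -1 | sed -e 's#^ *##g'
# prints: 1  11104  11104  0
```

If the layout of the sqlplus output changes (e.g. extra feedback lines), the `tail -4` offset must be adjusted to match.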
------------
$ more /home/oracle/standby.sql
SELECT * FROM (
SELECT sequence#, archived, applied,
      TO_CHAR(completion_time, 'RRRR/MM/DD HH24:MI') AS completed
 FROM sys.v$archived_log
  ORDER BY completed DESC)
WHERE ROWNUM <= 20;

SELECT * FROM (
SELECT sequence#, archived, applied,
      TO_CHAR(completion_time, 'RRRR/MM/DD HH24:MI') AS completed
 FROM sys.v$archived_log
  ORDER BY completed DESC)
where applied='NO';


SELECT ARCH.THREAD# "Thread", ARCH.SEQUENCE# "Last Sequence Received", APPL.SEQUENCE# "Last Sequence Applied", (ARCH.SEQUENCE# - APPL.SEQUENCE#) "Difference"
FROM
(SELECT THREAD# ,SEQUENCE# FROM V$ARCHIVED_LOG WHERE (THREAD#,FIRST_TIME ) IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$ARCHIVED_LOG GROUP BY THREAD#)) ARCH,
(SELECT THREAD# ,SEQUENCE# FROM V$LOG_HISTORY WHERE (THREAD#,FIRST_TIME ) IN (SELECT THREAD#,MAX(FIRST_TIME) FROM V$LOG_HISTORY GROUP BY THREAD#)) APPL WHERE ARCH.THREAD# = APPL.THREAD#
ORDER BY 1;
/* last sequence received and last sequence applied number should all be same */

SELECT THREAD#, LOW_SEQUENCE#, HIGH_SEQUENCE# FROM V$ARCHIVE_GAP;
/* should not have rows selected  */

exit;
-----------------------------
Note:
a. if V$ARCHIVE_GAP doesn't return any rows but a gap actually exists, refer to
Oracle support - V$ARCHIVE_GAP doesn't show a existing GAP [ID 974730.1] for another query.
b. run select * from v$log; on both primary and standby and check the sequence# column to make sure the 'CURRENT' numbers are the same.
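The "in sync" decision made by the third query above (last sequence received minus last sequence applied) boils down to a simple comparison. A shell sketch with hard-coded numbers — in practice the two values would come from V$ARCHIVED_LOG and V$LOG_HISTORY:

```shell
# Hypothetical values standing in for the query results:
received=11104   # last sequence received on the standby
applied=11102    # last sequence applied on the standby

if [ "$received" -eq "$applied" ]; then
  echo "in sync"
else
  echo "standby is $((received - applied)) log(s) behind"
fi
# prints: standby is 2 log(s) behind
```

Anything other than "in sync" for an extended period is worth investigating via the gap troubleshooting steps later in this post.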





How to manage dataguard/active dataguard


  • Shutting down a Physical Standby Database
----------------------
To shut down a physical standby database and stop Redo Apply, use the SQL*Plus SHUTDOWN statement. Control is not returned to the session that initiates a database shutdown until shutdown is complete.

If the primary database is up and running, defer the destination on the primary database and perform a log switch before shutting down the standby database.
sql> alter system switch logfile; # on primary

To stop Redo Apply before shutting down the database, use the following steps:

Issue the following query to find out if the standby database is performing Redo Apply or real-time apply. If the MRP0 or MRP process exists, then the standby database is applying redo.
SQL> SELECT PROCESS, STATUS FROM V$MANAGED_STANDBY;

If Redo Apply is running, cancel it as shown in the following example:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

Shut down the standby database.
SQL> SHUTDOWN IMMEDIATE;

  • open standby db for read-only access

-------------------
in 10g, having a physical standby database open for read-only access makes it unavailable for managed recovery operations
sql> startup;
or
sql> startup nomount
sql> alter database mount standby database;
sql> alter database open read only;

if the standby db is currently performing managed recovery:
sql> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
sql> alter database open read only;

changing standby db from readonly mode to managed recovery:
terminate all active user sessions first, then
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
note: the above command alone is enough; there is no need to explicitly exit read-only mode first.


  • Enabling Active Data Guard in 11g

a. first, disable redo apply on the standby db
dgmgrl> edit database drdb set state=apply-off;

or
alter database recover managed standby database cancel;


dgmgrl> show database verbose drdb

b. use sqlplus to 'alter database open';

c. reactivate redo apply
dgmgrl> edit database drdb set state=apply-on;
dgmgrl> show database drdb;


  • to disable active dataguard:


sql> shutdown immediate;
sql> startup mount;

dgmgrl> edit database stdby set state=apply-on;
or
sql> alter database recover managed standby database disconnect;



Snapshot standby database

  • standby database must be physical standby db
  • flashback logging must be enabled on both production and standby db


dgmgrl
connect /
show configuration

dgmgrl> convert database stdby to snapshot standby;
dgmgrl> show configuration
stdby - snapshot standby database

sql> select name, open_mode, switchover_status, database_role from v$database;
database_role
snapshot standby

Now you can do anything with the database. From now on, all redo generated on the production db is still shipped to the standby db, but it is not applied until the db is converted back into a physical standby.

once test finished, convert it back:

dgmgrl> convert database stdby to physical standby;

This terminates all changes made to the snapshot standby db; the MRP process is initiated and applies all the logs that were shipped during the snapshot standby state.


Troubleshooting dataguard/dgmgrl
  • Verify the Physical Standby Database Is Performing Properly
 Identify the existing archived redo log files:
SELECT SEQUENCE#, FIRST_TIME, NEXT_TIME FROM V$ARCHIVED_LOG ORDER BY SEQUENCE#;

force a log switch on the primary with 'ALTER SYSTEM SWITCH LOGFILE;'

Verify the new redo data was archived on the standby database.

Verify the new archived redo log files were applied:
SELECT SEQUENCE#,APPLIED FROM V$ARCHIVED_LOG ORDER BY SEQUENCE#;

  •  archive log gap list


select * from v$archive_gap;
select name from v$archived_log where sequence#=11104; 
  • shutdown immediate takes forever
Sometimes Oracle takes forever to shut down with the "immediate" option. As a workaround, shut down using these commands:

alter system checkpoint;
shutdown abort
startup restrict (only users granted restricted session system privileges can connect)
shutdown immediate


  • What if the standby db is not syncing with primary db?

identify problem:
a. the query above shows a non-zero 'Difference'.
b. run select * from v$log; on both primary and standby and check whether the sequence# of the 'CURRENT' log is the same.

troubleshooting:
a. check alert log for both primary and standby db
b. could be due to log_archive_config='DG_CONFIG=(livedb,drdb)' configuration.

You need to add this log_archive_config on both primary and secondary, set like this:

Parameters for primary:
*.log_archive_config='dg_config=(livedb,drdb)'
*.db_name='livedb'
*.db_unique_name='livedb'

Parameters for standby:
*.log_archive_config='dg_config=(livedb,drdb)'
*.db_name='livedb'
*.db_unique_name='drdb'

Otherwise, you might get errors like the following in the primary RAC servers' alert logs:
ORA-16057 DGID from server not in Data Guard configuration

You can refer to My Oracle Support(MOS) - ORA-16057 in Dataguard Configuration [ID 827954.1]
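The invariant above (same db_name on both sides, different db_unique_name) is easy to sanity-check before starting the instances. A sketch using throwaway pfile fragments that mimic the parameter lists above:

```shell
# Write two sample pfile fragments (stand-ins for the real init*.ora files).
cat > /tmp/primary_frag.ora <<'EOF'
*.db_name='livedb'
*.db_unique_name='livedb'
EOF
cat > /tmp/standby_frag.ora <<'EOF'
*.db_name='livedb'
*.db_unique_name='drdb'
EOF

# get <param> <file>: extract the quoted value of a *.param='value' line
get() { sed -n "s/^\*\.$1='\(.*\)'/\1/p" "$2"; }

if [ "$(get db_name /tmp/primary_frag.ora)" = "$(get db_name /tmp/standby_frag.ora)" ] &&
   [ "$(get db_unique_name /tmp/primary_frag.ora)" != "$(get db_unique_name /tmp/standby_frag.ora)" ]; then
  echo "pfile pair looks consistent"
else
  echo "pfile pair is inconsistent"
fi
# prints: pfile pair looks consistent
```

Point `get` at the real pfiles to run the same check against an actual configuration.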


dataguard dg_broker 

  • dgmgrl parameters


primary RAC:
*.dg_broker_config_file1='+DATA/livedb/dglivedb1.dat'
*.dg_broker_config_file2='+RECOVERY/livedb/dglivedb2.dat'
*.dg_broker_start=TRUE

standby:
*.dg_broker_config_file1='+DATA/livedb/dgdrdb1.dat'
*.dg_broker_config_file2='+RECOVERY/livedb/dgdrdb2.dat'
*.dg_broker_start=TRUE

you can use 'alter system set' for all 3 parameters above:



  • use dgmgrl

dgmgrl> show configuration
dgmgrl> disable configuration;
dgmgrl> enable configuration;
note: If you disable management of a configuration while connected to the standby database, you must connect to the primary database to reenable the configuration.

dgmgrl> remove configuration;
dgmgrl> remove database xxx;
Note: when you use either the REMOVE CONFIGURATION or REMOVE DATABASE command, you effectively delete the configuration or standby database profile from the broker configuration file, removing the ability of the Data Guard broker to manage the configuration or the standby database, respectively.

 dgmgrl> edit database stdby set state=apply-off  # temporarily stop Redo Apply on a physical standby
 dgmgrl> EDIT DATABASE North_Sales SET STATE=TRANSPORT-OFF; #stop the transmittal of redo data to the standby database
 DGMGRL> EDIT DATABASE North_Sales SET STATE=TRANSPORT-ON; # revert back

When you disable the broker configuration or any of its databases, you are disabling the broker's management of those objects and are effectively removing your ability to use DGMGRL to manage and monitor the disabled object.

References:

http://download.oracle.com/docs/cd/B19306_01/server.102/b14239/scenarios.htm#CIHIAADC

  • create dataguard configuration:  
http://docs.oracle.com/cd/B28359_01/server.111/b28295/cli.htm
http://docs.oracle.com/cd/B12037_01/server.101/b10822/cli.htm

DGMGRL> create configuration 'DRSolution' as primary database is 'livedb' connect identifier is 'livedb';
DGMGRL> add database 'drdb' as connect identifier is drdb maintained as physical;
Database "drdb" added
dgmgrl> enable database drdb # if necessary
dgmgrl> enable configuration
dgmgrl> show configuration
dgmgrl> show database verbose ocmdb;
DGMGRL> show database 'LIVEDB' 'InconsistentProperties';

dgmgrl> EDIT DATABASE 'livedb' SET PROPERTY 'LogArchiveTrace'='127';
dgmgrl> SWITCHOVER TO "drdb";


Warning: ORA-16607: one or more databases have failed

solution: check the alert log on the primary; after the dataguard configuration is enabled, the broker changes the spfile's log_archive_dest_2 definition.

You should add drdb definitions as follows in tnsnames.ora on the primary db:
----------
DRDB_XPT =
  (DESCRIPTION =
    (SDU=32767)
    (ADDRESS = (PROTOCOL = TCP)(HOST = oraslave.domain.com )(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = DRDB_XPT)
    )
  )

DRDB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = oraslave.domain.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = DRDB)
    )
  )

DRDB_DGMGRL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = oraslave.domain.com )(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = DRDB_DGMGRL)
    )
  )

[oracle@oraslave u01]$ dgmgrl
DGMGRL for Linux: Version 10.2.0.1.0 - 64bit Production

Copyright (c) 2000, 2005, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.
DGMGRL> connect /
Connected.
DGMGRL> show configuration;

Configuration
  Name:                DRSolution
  Enabled:             YES
  Protection Mode:     MaxPerformance
  Fast-Start Failover: DISABLED
  Databases:
    livedb - Primary database
    drdb   - Physical standby database

Current status for "DRSolution":
Error: ORA-16625: cannot reach the database

=> check log file for broker first /u01/app/oracle/product/10.2.0/db_1/admin/livedb/bdump/drclivedb2.log





how to rename the dgmgrl configuration files or change their location?
dgmgrl> disable configuration;
sql> alter system set dg_broker_start=false;
sql> alter system set dg_broker_config_file1='/database/livedb/dgconfig/drLIVEDB1.dat';
sql> alter system set dg_broker_config_file2='/recovery/livedb/dgconfig/drLIVEDB2.dat';
sql> exec dbms_file_transfer.copy_file('ASM_DIR','drLIVEDB1.dat','DG_DIR','drLIVEDB1.dat');
sql> exec dbms_file_transfer.copy_file('ASM_DIR2','drLIVEDB2.dat','DG_DIR','drLIVEDB2.dat');

or copy from the file system:
$ cp /tmp/drLIVEDB1.dat /database/livedb/dgconfig/
$ cp /tmp/drLIVEDB2.dat /recovery/livedb/dgconfig/

sql> alter system set dg_broker_start=true;
sql> show parameter dg_broker_start

dgmgrl> enable configuration;


dgmgrl failover or switch

  • performing a switchover from dgmgrl
To switch over to the standby database phyocm:
DGMGRL> switchover to phyocm;
  • performing a failover from dgmgrl

On the target standby database, issue the FAILOVER command to invoke a complete failover, specifying the name of the standby database that you want to change into the primary role:
DGMGRL> FAILOVER TO database-name;


  • performing fast-start failover
If there is more than one standby database in the configuration, you must explicitly set the FastStartFailoverTarget property on the primary database and target standby database to point to each other, to define which standby database will be the target of a fast-start failover. For example:
DGMGRL> EDIT DATABASE 'North_Sales' SET PROPERTY FastStartFailoverTarget = 'DR_Sales';
DGMGRL> EDIT DATABASE 'DR_Sales' SET PROPERTY FastStartFailoverTarget = 'North_Sales';
DGMGRL> ENABLE FAST_START FAILOVER;


  • Using Flashback Database After a Failover


After a failover occurs, the original primary database can no longer participate in the Data Guard configuration until it is repaired and established as a standby database in the new configuration. To do this, you can use the Flashback Database feature to recover the failed primary database to a point in time before the failover occurred, and then convert it into a physical or logical standby database in the new configuration. The following references describe this:

Flashing Back a Failed Primary Database into a Physical Standby Database -

http://docs.oracle.com/cd/B19306_01/server.102/b14239/scenarios.htm#i1049997
Flashback Database and Data Guard -
http://mdinh.wordpress.com/2007/06/27/flashback-database-and-data-guard/

failover(unplanned) and switchover(planned):
===============
A switchover is a planned event while a failover is unplanned. After a switchover (e.g. for a hardware or database upgrade), you can still switch back.

You can use EM or dgmgrl for switchover, and it's easier to use dgmgrl for failover because the primary db is already gone in case of failover.

References:
Creating a physical standby database
http://docs.oracle.com/cd/B19306_01/server.102/b14239/create_ps.htm

Steps to recreate a Physical Standby Controlfile [ID 459411.1]

Online Redo Logs on Physical Standby [ID 740675.1]
for how to add/drop redo log on standby

Physical standby db lags far behind the primary database:
http://docs.oracle.com/cd/B19306_01/server.102/b14239/scenarios.htm#CIHEGFEG
http://docs.oracle.com/cd/E11882_01/server.112/e10700/rman.htm#CIHIAADC


Oracle Internal - http://www.ordba.net/Articles/OInternals.htm#Part3

Data disk - http://www.datadisk.co.uk/

Dataguard Architecture - http://www.datadisk.co.uk/html_docs/oracle_dg/architecture.htm

switchover or failover - http://www.youtube.com/watch?v=86KDwCmfwRg


http://docs.oracle.com/cd/B28359_01/server.111/b28295/cli.htm


Commands:
SQL> alter database rename file '<old Filespecification or file#>' to '<new Filespecification>';
SQL> alter database add standby logfile group <Group No.> ('Filespecification(s)') size <size> reuse;
SQL> alter database recover managed standby database disconnect [using current logfile];  (real time apply)

sql> select * from v$archive_gap;  #on standby db
sql> select * from v$database;
sql> select * from v$instance;
sql> select * from v$archive_dest;


add standby logfile group:
SQL> alter database add standby logfile group 4 ('C:\ORACLE\ORADATA\STANDBY\STBY04.LOG') SIZE 5M;

You can also add any further members to each group:
SQL> alter database add standby logfile member 'C:\ORACLE\ORADATA\STANDBY\STBY14.LOG' to group 4;


SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE;
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE disconnect;
SQL>
note: 'using current logfile' means real-time apply, which requires standby redo logs to be created first.

sql> select recovery_mode from v$archive_dest_status; #run on primary db
RECOVERY_MODE
-----------------------
IDLE
MANAGED REAL TIME APPLY
IDLE
IDLE
IDLE
IDLE
IDLE
IDLE
IDLE
IDLE

note: if you see only 'MANAGED' and not 'MANAGED REAL TIME APPLY', check whether you are using a local spfile instead of the RAC spfile, which should sit in a shared common location.
If it shows 'UNKNOWN', run 'alter system switch logfile' to see whether it changes to 'MANAGED REAL TIME APPLY'.

sql> SELECT PROCESS, STATUS FROM V$MANAGED_STANDBY; # on the standby db, to check whether it is applying redo. In read-only mode no MRP0 process is visible; in physical standby real-time apply mode you will see something like:
MRP0  APPLYING_LOG
note: MRP stands for Managed Recovery Process

sql> select open_mode from v$database;  # on the standby db, to check whether the database is opened read-only ('READ ONLY'); normally it should be 'MOUNTED' for a physical standby db.


Oracle dataguard terms:
MRP - Managed Recovery Process
LSP - Logical Standby Process
SRL - Standby Redo Log
RFS - Remote File Server background process (receives archived redo logs from the primary db)
FAL - Fetch Archive Log
LNS - log-write network-server
ARCH - archiver process
extproc_connection_data - is a special service name that is used for external procedures.