Jephe Wu - http://linuxtechres.blogspot.com
Objective: showing step by step instruction for doing non-disruptive DR test and real world DR on existing logshipping SQL server 2012 production environment
Environment: SQL server 2012 prod and DR servers replicating through logshipping, both are running on Windows 2008 R2 SP1 64bit OS, production database name is LINUXTECHRES_LIVE, application server use LINUXTECHRES to connect to database.
High Level Concept
1. disable restore job and alert job on DR SQL server
2. run command 'RESTORE DATABASE LINUXTECHRES_LIVE WITH RECOVERY' on DR SQL server new query window to bring database to online mode (refresh it to change status to remove 'restoring' after run command)
3. find out which username is being used by application server to connect to production database
4. create same username with same sid on DR sql server
5. try to login with application username and password on DR sql server with sql authentication.
6. rebuild DR database to enable logshipping again after done DR test
Steps
1. Preparation work before the actual day of DR test
3. Create application server username with password on DR SQL server so that DR application server is able to connect to DR database
sql server login on prod sql server has unique sid associated with it, you have to create username with same sid on DR side to enable the database user to be able to connect to replicated database on DR instance. each sid is associated with an SQL server instance which is unique.
run command below on prod sql server to find out sid of application user.
SELECT name, [sid] FROM sys.server_principals WHERE [type] = 's';
e.g. app user LINUXTECHRES sid is 0x5B56330D7270CE4FB396226DA772ABCD
on prod sql server, expand prod database which is being logshipping to DR site, check database users to find out which user is used for application server to connect.
run command below on DR sql server query window to create application username:
CREATE LOGIN LINUXTECHRES WITH PASSWORD = 'strong password in plaintext', sid = 0x5B56330D7270CE4FB396226DA772ABCD, DEFAULT_DATABASE=[LINUXTECHRES_LIVE], DEFAULT_LANGUAGE=[us_english], CHECK_EXPIRATION=OFF, CHECK_POLICY=ON;
GO
4. try to login to DR sql server as app user linuxtechres with sql authentication
You should encounter error message like 'cannot open user default database, login failed', because the DR database is still in restoring state which cannot be accessed. Once we bring it online, you should be able to login to it.
5. Make sure the last backup transaction logfile on production server has already been copied and restored on DR site - optional
run this on prod to check last_backup file name -
select last_backup_file from msdb.dbo.log_shipping_monitor_primary
run this on DR to check last_copied_file and last_restored_file name -
select last_copied_file, last_restored_file from msdb.dbo.log_shipping_monitor_secondary
If the latest backup file has not yet been copied or restored on DR site, manually run file copy and restore job on DR server to bring all 3 filenames are same
6. break logshipping and bring DR database online
Disable restore and alert job on DR server [optional, to avoid job failure after bring DR database online ]
Run command below to bring DR database online
RESTORE DATABASE LINUXTECHRES_LIVE WITH RECOVERY
do not run 'backup log with norecovery' on primary prod db, which will be doing tail log backup, as it will change prod database status to 'restoring', We are doing non-disruptive DR test only, everything is still running independently on prod side.
7. You should be able to connect to database now
8. rebuild logshipping after DR test.
copy the latest full backup of database from prod to DR server, then run command below in SSMS query window.
RESTORE DATABASE LINUXTECHRES_LIVE FROM DISK = 'D:/backup/prod_full_backup_linuxtechres.bak' WITH NORECOVERY, REPLACE;
or you can use SSMS, right clicking database LINUXTECHRES_LIVE, choose 'task', 'restore database' in SQL server 2012, if the logshipping DR has been running long time, the screen might need long time to appear. Then, source for restore, from device, add a file which is copied from production daily full backup, tick it to choose it. then it will do restore headonly to give brief detail, if it says timeout, logout SSMS , login and try again.
Go to options, choose 'overwrite the existing database', for 'Recovery state', choose the second options which is 'Leave the database non-operational, and do not roll back uncommitted transactions, Additional transaction logs can be restored (RESTORE WITH NORECOVERY), then click 'OK'. After finishing it, the restore job will start it automatically and database will automatically try to find the transaction log file to recovery, you can right click 'restore job' , choose history to see the progress.
8. If Real DR is happening
Before above step 6, you should make sure you have restored the latest backup file on prod site, so manually run copy/restore job on DR side to bring all these 3 filename are same.
Then you should also do tail log backup on prod side if possible to reduce data lost. run this:
USE master
GO
BACKUP LOG LINUXTECRES_LIVE
TO DISK = 'C:\backups\prod_tail_log.bak'
WITH NORECOVERY
then copy above prod_tail_log.bak to DR sql server then restore it manually like this:
RESTORE LOG LINUXTECHRES_LIVE
FROM DISK = 'path_to_prod_tail_log.bak'
WITH RECOVERY
References
1. http://www.sqlskills.com/blogs/glenn/how-to-avoid-orphaned-database-users-with-sql-server-authentication/