How to use hardware RAID10 to clone Linux between 2 Dell R720 servers


 Jephe Wu - http://linuxtechres.blogspot.com

Environment: 2 Dell R720 servers (db01 and db02). db01 needs to be cloned to db02, wiping everything on db02's hard disks, which were previously used for another purpose.
db01 has 6 SAS 300G 15k rpm hard disks configured as RAID10.
IP addresses for the 2 servers: db01: 192.168.0.1, db02: 192.168.0.2

Objective: clone server db01 to db02 and change the IP address.



Steps:

1. Cloning hard disk

While db01 is running, remove hard disks ID 1,3,5 and insert any 3 hard disks from db02 in their place (db02 must be powered off when you pull its disks). db01 will then automatically start rebuilding RAID10; you can observe the progress on the Dell OpenManage page https://192.168.0.1:1311 for db01.

Put hard disks ID 1,3,5 from db01 into slots 0,2,4 on db02, which is the top row of disk slots, and keep slots 1,3,5 empty for now. Then power on db02.

On the boot screen, it will say all original disks have been removed. Press F to import the foreign config.
You have to press it quickly; otherwise it will continue and fail to boot Linux because no boot device is found.

After you press F, it continues to boot the Linux OS, which is the same as the one on db01.

2. Configure network on db02 after OS boot up

As the hard disks come from db01, after the OS boots up everything is the same as on db01. You need to change the following:

a. hostname. 
run commands below:
# hostname db02
# vi /etc/sysconfig/network   # change the HOSTNAME line to db02
# service network restart
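
If you prefer not to edit the file by hand, the HOSTNAME line can be rewritten with sed. This is only a sketch: set_hostname_line is a hypothetical helper name, and it assumes a RHEL-style /etc/sysconfig/network that already contains a HOSTNAME= line.

```shell
# Sketch: rewrite the HOSTNAME= line of a RHEL-style network file.
# set_hostname_line is a hypothetical helper, not a system command.
set_hostname_line() {
    file="$1"
    new_name="$2"
    # Keep a backup next to the original, then rewrite only the HOSTNAME= line.
    cp "$file" "$file.bak"
    sed "s/^HOSTNAME=.*/HOSTNAME=$new_name/" "$file.bak" > "$file"
}

# Example usage (as root), using the clone's name from this article:
# set_hostname_line /etc/sysconfig/network db02
```

The original file is left as /etc/sysconfig/network.bak so the change can be reverted.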


b. ip address
# cd /etc/sysconfig/network-scripts
# dmesg | grep eth  # list the MAC address of every NIC and the network card module name (igb)

[root@db01 ~]# dmesg | grep eth
igb 0000:01:00.0: eth0: (PCIe:5.0Gb/s:Width x4) bc:20:5b:ed:95:28
igb 0000:01:00.0: eth0: PBA No: G10565-011
igb 0000:01:00.1: eth1: (PCIe:5.0Gb/s:Width x4) bc:20:5b:ed:95:29
igb 0000:01:00.1: eth1: PBA No: G10565-011
igb 0000:01:00.2: eth2: (PCIe:5.0Gb/s:Width x4) bc:20:5b:ed:95:2a
igb 0000:01:00.2: eth2: PBA No: G10565-011
igb 0000:01:00.3: eth3: (PCIe:5.0Gb/s:Width x4) bc:20:5b:ed:95:2b
igb 0000:01:00.3: eth3: PBA No: G10565-011
igb 0000:44:00.0: eth4: (PCIe:5.0Gb/s:Width x4) a0:26:9f:01:c8:74
igb 0000:44:00.0: eth4: PBA No: G13158-000
igb 0000:44:00.1: eth5: (PCIe:5.0Gb/s:Width x4) a0:26:9f:01:c8:75
igb 0000:44:00.1: eth5: PBA No: G13158-000
igb 0000:44:00.2: eth6: (PCIe:5.0Gb/s:Width x4) a0:26:9f:01:c8:76
igb 0000:44:00.2: eth6: PBA No: G13158-000
igb 0000:44:00.3: eth7: (PCIe:5.0Gb/s:Width x4) a0:26:9f:01:c8:77
igb 0000:44:00.3: eth7: PBA No: G13158-000
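
The output above mixes MAC lines with "PBA No" lines. A small awk filter can reduce it to one "interface MAC" pair per NIC. This is a sketch that assumes the igb line format shown above; list_nic_macs is a hypothetical helper name.

```shell
# Sketch: print "interface MAC" pairs from igb lines in dmesg-style input.
# list_nic_macs is a hypothetical helper; it reads the lines from stdin.
list_nic_macs() {
    awk '{
        # A MAC address is the last field: 17 characters, 6 colon-separated parts.
        n = split($NF, parts, ":")
        if (n == 6 && length($NF) == 17) {
            # Find the "ethN:" field and strip its trailing colon.
            for (i = 1; i < NF; i++) {
                if ($i ~ /^eth[0-9]+:$/) {
                    name = $i
                    sub(/:$/, "", name)
                    print name, $NF
                }
            }
        }
    }'
}

# Example usage:
# dmesg | grep eth | list_nic_macs
```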

Use vi to edit ifcfg-em1/2/3/4 and ifcfg-p2p1/2/3/4 (if any), and update the MAC address and IP address lines to match db02.
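
The same edit can be scripted per file. The sketch below assumes the ifcfg file already has HWADDR= and IPADDR= lines; update_ifcfg is a hypothetical helper name and NEW_MAC is a placeholder for the MAC address you read from dmesg on db02.

```shell
# Sketch: replace the HWADDR= and IPADDR= lines in one ifcfg file.
# update_ifcfg is a hypothetical helper, not part of the network scripts.
update_ifcfg() {
    file="$1"
    mac="$2"
    ip="$3"
    # Keep a backup copy, then rewrite only the two lines that differ on db02.
    # If you are unsure whether your network scripts skip *.bak files,
    # move the backup out of network-scripts afterwards.
    cp "$file" "$file.bak"
    sed -e "s/^HWADDR=.*/HWADDR=$mac/" \
        -e "s/^IPADDR=.*/IPADDR=$ip/" "$file.bak" > "$file"
}

# Example usage (as root); NEW_MAC is a placeholder for db02's real MAC:
# update_ifcfg /etc/sysconfig/network-scripts/ifcfg-em1 "$NEW_MAC" 192.168.0.2
```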

c. restart the network service so the changes take effect

Note: you don't have to reboot to make networking work. Reload the igb module and restart the network service:
# rmmod igb
# modprobe igb
Note: reloading igb will update /etc/udev/rules.d/70-persistent-net.rules
# service network restart
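
To check which MAC address each interface name was bound to, you can read the pairs back out of the rules file. This sketch assumes the usual rule layout with ATTR{address}=="..." followed by NAME="..." on the same line; show_persistent_nics is a hypothetical helper name.

```shell
# Sketch: list "name MAC" pairs recorded in the persistent-net rules file.
# show_persistent_nics is a hypothetical helper; the default path is the
# file mentioned above.
show_persistent_nics() {
    rules="${1:-/etc/udev/rules.d/70-persistent-net.rules}"
    # Print only lines that carry both an address and a NAME assignment.
    sed -n 's/.*ATTR{address}=="\([^"]*\)".*NAME="\([^"]*\)".*/\2 \1/p' "$rules"
}

# Example usage:
# show_persistent_nics
```

If a name still maps to a db01 MAC address, compare it against the dmesg output before restarting the network.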

3. Rebuild RAID10 on db02

You can now insert the remaining 3 hard disks into slots ID 1,3,5 on db02.

RAID10 will not start rebuilding by itself after you insert them.

Now check the OpenManage page at https://192.168.0.2:1311 (the new IP address for db02 after the change).

All the hard disks in slots ID 1,3,5 show as 'foreign' disks.

You need to clear the 'foreign' status as follows (refer to http://en.community.dell.com/support-forums/servers/f/906/t/19299553.aspx):


You will need to clear the foreign configuration and reconfigure each drive as a hot spare; the rebuild will then start.

To clear the foreign configuration: select the controller in OpenManage, go to the "Information/Configuration" tab, under "Controller Tasks" select "Foreign Configuration Operations", and click "Execute". On the next page, click "Clear".


After clearing 'foreign', the status becomes 'ready', but RAID10 is still not rebuilding.

To reconfigure as a hot spare: go to the "Physical Disk" view and, under "Available Task" for that drive, select "Assign Global Hot Spare". It will then start rebuilding RAID10.
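
The same two actions are also available from the OpenManage command line (omconfig). The sketch below only prints the commands so you can review them before running anything. print_rebuild_cmds is a hypothetical helper name, and the controller ID (0) and physical disk IDs in the usage line are assumptions; confirm the real IDs first with "omreport storage pdisk controller=0".

```shell
# Sketch: print the omconfig commands that clear the foreign configuration
# and assign each listed disk as a global hot spare. Nothing is executed.
# print_rebuild_cmds is a hypothetical helper name.
print_rebuild_cmds() {
    controller="$1"
    shift
    echo "omconfig storage controller action=clearforeignconfig controller=$controller"
    for pdisk in "$@"; do
        echo "omconfig storage pdisk action=assignglobalhotspare controller=$controller pdisk=$pdisk assign=yes"
    done
}

# Example usage; controller 0 and the pdisk IDs are assumptions:
# print_rebuild_cmds 0 0:1:1 0:1:3 0:1:5
```

Printing instead of executing is deliberate: clearing a foreign configuration discards the RAID metadata on those disks, so it is worth checking the IDs by eye first.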

4. Reboot db01 and db02 to confirm everything is okay.