DBA Sensation

January 31, 2013

SQL 2008 active-active cluster on Windows 2008 R2: shared storage migration

Filed under: 3. MS SQL Server — zhefeng @ 11:16 am

This is my POC doc from before we migrated our production SQL 2008 cluster to new storage. It has two parts: the first part simply builds the cluster on two ESX VM images using VMFS shared disks; the second part covers the storage migration.

====Part 1: build sql 2008 cluster

IP Address mapping:
10.165.36.78 vmsql2008cls01
10.165.36.79 vmsql2008cls02
10.165.36.80 vmsql2008cls
10.165.36.81 vmsql2k8clsvip1
10.165.36.82 vmsql2k8clsvip2
10.165.36.83 vmsql2008clsdtc

Initialize the LUNs and label them:
G: OLTP (OLTP data volume)
H: OLAP (OLAP data volume)
I: Quorum (Quorum Disk)
R: msdtc (MSDTC volume)

1. Enable the “Application Server” role in “Server Manager”
Check “.NET Framework 3.5.1”, “Distributed Transactions – Incoming Remote Transactions” and “Distributed Transactions – Outgoing Remote Transactions”

2. Add the “Failover Clustering” feature

3. Set up the cluster in Failover Cluster Manager; cluster name/IP: vmsql2008cls / 10.165.36.80

4. Set up the clustered MSDTC service
Open Failover Cluster Manager and, under “Services and applications”, add the “DTC” service with the name “vmsql2008clsdtc” and IP “10.165.36.83”

5. Install the first instance. From a command prompt run: Setup.exe /ACTION=PrepareFailoverCluster (note: you have to use this setup option and run it on BOTH nodes; a hedged command-line sketch follows the sub-steps below)

1). Pick the instance features and shared features (except Reporting Services)

2). Default instance id: MSSQLSERVER

3). Cluster Security Policy – “Use service SIDs”

4). Server Configuration – use the same account for all SQL Server services (“pgdev\sevice.pg.prod”) and keep all the default startup types (in particular “Manual” for the Database Engine)
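
For reference, the same preparation can be scripted instead of clicked through. This is only a minimal sketch assuming the standard SQL Server 2008 PrepareFailoverCluster command-line parameters; the password placeholder is obviously not a real value:

Setup.exe /ACTION=PrepareFailoverCluster /FEATURES=SQLENGINE /INSTANCENAME=MSSQLSERVER /INSTANCEID=MSSQLSERVER /SQLSVCACCOUNT="pgdev\sevice.pg.prod" /SQLSVCPASSWORD="<password>" /AGTSVCACCOUNT="pgdev\sevice.pg.prod" /AGTSVCPASSWORD="<password>"

As with the wizard, it has to be run on both nodes.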

6. Run setup to complete the cluster on ONLY the active node (the one that owns the storage)
From a command prompt run: setup.exe /ACTION=CompleteFailoverCluster (a hedged command-line sketch follows the sub-steps below)
1). SQL Server Network Name for first instance: vmsql2k8clsvip1
2). Pick the “OLTP” disk
3). ip address for virtual: 10.165.36.81
4). Done with the installation. Verify it in Failover Cluster Manager
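
The completion step also has a command-line form. Again just a sketch assuming the SQL Server 2008 CompleteFailoverCluster parameters; the cluster network name (“Cluster Network 1”), subnet mask, disk resource name and data directory below are placeholders that must be checked against the actual cluster before use:

setup.exe /ACTION=CompleteFailoverCluster /INSTANCENAME=MSSQLSERVER /FAILOVERCLUSTERNETWORKNAME=vmsql2k8clsvip1 /FAILOVERCLUSTERIPADDRESSES="IPv4;10.165.36.81;Cluster Network 1;255.255.255.0" /FAILOVERCLUSTERDISKS="OLTP" /INSTALLSQLDATADIR="G:\MSSQL" /SQLSYSADMINACCOUNTS="pgdev\sevice.pg.prod"

Run it only on the node that currently owns the “OLTP” disk.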

7. Install 2nd instance.
1). run Setup.exe /ACTION=PrepareFailoverCluster on BOTH nodes
2). run setup.exe /ACTION=CompleteFailoverCluster on ONLY the active node
3). virtual IP and network name:
10.165.36.82 vmsql2k8clsvip2
4). pick the “OLAP” disk
5). done with installation

====Part 2: migrating from old LUNs to new LUNs

8. Add the new disks on node 1 (V, W, X, Y) and plan their drive mappings:
V -> G
W -> H
X -> I
Y -> R

9. In Failover Cluster Manager, use “Add a disk” to add all the new disks under Storage

10. Move each new disk to its corresponding service or application, except the quorum disk (right-click the disk, “more actions” -> “move this resource to another service or application”)
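
If you prefer the command line, cluster.exe can do the same move; this is only a sketch, and the disk resource and group names are placeholders that must match what Failover Cluster Manager actually shows (repeat for each new disk):

cluster res "Cluster Disk 5" /move:"SQL Server (MSSQLSERVER)"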

11. Stop your clustered application(s) (the virtual SQL instances in this case) and copy the data from the old drives to the new drives, including the MSDTC drive (see the robocopy sketch after this list)
1). In Failover Cluster Management, take all SQL instances and the MSDTC service offline. This stops the SQL services and releases any open handles on the SQL data files, so the data can be copied to the new drives in the following order:
G -> V
H -> W
R -> Y
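
One way to do the copy while preserving NTFS permissions and timestamps is robocopy, which ships with Windows Server 2008 R2; the excluded folders and log paths below are only examples:

robocopy G:\ V:\ /E /COPYALL /DCOPY:T /R:1 /W:1 /XD "System Volume Information" "$RECYCLE.BIN" /LOG:C:\temp\copy_G_to_V.log
robocopy H:\ W:\ /E /COPYALL /DCOPY:T /R:1 /W:1 /XD "System Volume Information" "$RECYCLE.BIN" /LOG:C:\temp\copy_H_to_W.log
robocopy R:\ Y:\ /E /COPYALL /DCOPY:T /R:1 /W:1 /XD "System Volume Information" "$RECYCLE.BIN" /LOG:C:\temp\copy_R_to_Y.log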

2). When the data copy is complete, change the drive letter of each existing data drive to a temporary letter, then set the drive letter of each new drive to the original letter of its old drive. (If the original data drive was G:, the new drive must become G:, otherwise SQL will not be able to find its files and will not start.)
So set the old drives to temporary letters first:
G -> O
H -> P
R -> Q

12. Then map the new drives to the old drive letters (see the diskpart sketch after this list):
V -> G
W -> H
Y -> R
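
Both the temporary re-lettering in step 11.2) and the final mapping here can be done in Disk Management or scripted with diskpart. A minimal sketch for the OLTP pair only (old G becomes O, new V becomes G); repeat the same pattern for the OLAP and MSDTC pairs:

diskpart
select volume G
remove letter=G
assign letter=O
select volume V
remove letter=V
assign letter=G
exit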

13. Change the disk dependency for each application and service. For example, for the first instance’s “SQL Server” resource, right-click -> “Properties” -> Dependencies -> change it to “Cluster Disk 5” (the new disk)
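
The dependency change can also be made with cluster.exe. Only a sketch: the resource name “SQL Server (MSSQLSERVER)” and disk name “Cluster Disk 5” are examples and must match the names shown in Failover Cluster Manager (remove the dependency on the old disk first):

cluster res "SQL Server (MSSQLSERVER)" /adddep:"Cluster Disk 5"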

14. Delete the MSDTC resource and recreate it
1). Delete the “MSDTC” resource in the DTC service
2). Add the “MSDTC” resource back
Right-click the DTC service icon on the left -> “Add a resource” -> “More resources” -> “2 - Add Distributed Transaction Coordinator”
3). Add dependencies for the new DTC resource
Right-click “Microsoft Distributed Transaction Coordinator” -> “Properties” -> “Dependencies” -> add the virtual server name and the new cluster disk as dependencies for this resource.
4). Bring it online.

15. Move the quorum
1). Right-click the cluster name under “Failover Cluster Manager” -> “More actions” -> “Configure Cluster Quorum Settings”
2). Choose the default quorum configuration, “Node and Disk Majority”
3). Check the new cluster disk for the storage witness
4). Confirm and finish the reconfiguration
5). There is no need to change the drive letter since it is the quorum, so the new quorum disk stays as “X”

16. Bring all the old cluster disks offline
That is: O, P, Q and I

17. bring all clustered application(s) online

18. Verify that the services and applications are online and functional, and that failover works
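
A quick command-line way to exercise failover is to move each clustered group to the other node and back; the group name below is an example and must match the actual one in the cluster:

cluster group "SQL Server (MSSQLSERVER)" /moveto:vmsql2008cls02
cluster group "SQL Server (MSSQLSERVER)" /moveto:vmsql2008cls01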

19. Delete the old disks in Failover Cluster Manager, and delete their partitions as well.

Reference:
1. How to Move a Windows 2008 Cluster to New SAN Storage
http://www.systemcentercentral.com/BlogDetails/tabid/143/IndexID/54853/Default.aspx


September 7, 2010

Oracle 10g ASM/RAW storage migration

Filed under: [RAC] — zhefeng @ 9:47 am

Objective:
We want to migrate the whole shared storage from the old SAN to the new SAN without re-installing the whole Oracle RAC.

Scenario:
1. Current structure
[Nodes]
## eth1-Public
10.0.0.101 vmrac01 vmrac01.test.com
10.0.0.102 vmrac02 vmrac02.test.com
## eth0-Private
192.168.199.1 vmracprv01 vmracprv01.test.com
192.168.199.2 vmracprv02 vmracprv02.test.com
## VIP
10.0.0.103 vmracvip01 vmracvip01.test.com
10.0.0.104 vmracvip02 vmracvip02.test.com

[Storage]
Both Oracle homes (ORACLE_HOME and CRS_HOME) are local:
ORACLE_HOME=/database/oracle/10grac/db
CRS_HOME=/database/oracle/10grac/crs

Shared LUN layout (3 partitions: 2 x 256M for OCR & Voting, 1 x 20G for ASM)
Disk /dev/sdb: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdb1 1 32 257008+ 83 Linux
/dev/sdb2 33 64 257040 83 Linux
/dev/sdb3 65 2610 20450745 83 Linux

OCR and Voting are on raw devices: /dev/sdb1 and /dev/sdb2

ASM disks
bash-3.1$ export ORACLE_SID=+ASM1
bash-3.1$ asmcmd
ASMCMD> lsdg
State Type Rebal Unbal Sector Block AU Total_MB Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Name
MOUNTED EXTERN N N 512 4096 1048576 19971 17925 0 17925 0 DG1/

2. New storage (sdc 10G)
1). new LUN added
[root@vmrac01 bin]# fdisk -l

Disk /dev/sda: 26.8 GB, 26843545600 bytes
255 heads, 63 sectors/track, 3263 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sda1 * 1 13 104391 83 Linux
/dev/sda2 14 535 4192965 82 Linux swap / Solaris
/dev/sda3 536 3263 21912660 83 Linux

Disk /dev/sdb: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdb1 1 32 257008+ 83 Linux
/dev/sdb2 33 64 257040 83 Linux
/dev/sdb3 65 2610 20450745 83 Linux

Disk /dev/sdc: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

2). Partition the new LUN into 3 partitions
Disk /dev/sdc: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdc1 1 32 257008+ 83 Linux
/dev/sdc2 33 64 257040 83 Linux
/dev/sdc3 65 1305 9968332+ 83 Linux

3). Clone the data from the previous raw disks
**Shut down the DB and CRS first to make sure there are no changes to the raw disks!
#dd if=/dev/raw/raw1 of=/dev/sdc1
514017+0 records in
514017+0 records out
263176704 bytes (263 MB) copied, 252.812 seconds, 1.0 MB/s

#dd if=/dev/raw/raw2 of=/dev/sdc2
514080+0 records in
514080+0 records out
263208960 bytes (263 MB) copied, 267.868 seconds, 983 kB/s

4). “Cheat” Oracle by re-binding the raw devices to the new partitions on both nodes
**old binding
Step1: add entries to /etc/udev/rules.d/60-raw.rules
ACTION=="add", KERNEL=="sdb1", RUN+="/bin/raw /dev/raw/raw1 %N"
ACTION=="add", KERNEL=="sdb2", RUN+="/bin/raw /dev/raw/raw2 %N"

Step2: For the mapping to take effect immediately, run the commands below
#raw /dev/raw/raw1 /dev/sdb1
#raw /dev/raw/raw2 /dev/sdb2

Step3: Run the following commands and add them to the /etc/rc.local file.
#chown oracle:dba /dev/raw/raw1
#chown oracle:dba /dev/raw/raw2
#chmod 660 /dev/raw/raw1
#chmod 660 /dev/raw/raw2
#chown oracle:dba /dev/sdb1
#chown oracle:dba /dev/sdb2
#chmod 660 /dev/sdb1
#chmod 660 /dev/sdb2

**new binding on both nodes
Step1: edit /etc/udev/rules.d/60-raw.rules
ACTION=="add", KERNEL=="sdc1", RUN+="/bin/raw /dev/raw/raw1 %N"
ACTION=="add", KERNEL=="sdc2", RUN+="/bin/raw /dev/raw/raw2 %N"

Step2: apply the mapping immediately
#raw /dev/raw/raw1 /dev/sdc1
#raw /dev/raw/raw2 /dev/sdc2

Step3: set permissions and add the commands to /etc/rc.local
#chown oracle:dba /dev/raw/raw1
#chown oracle:dba /dev/raw/raw2
#chmod 660 /dev/raw/raw1
#chmod 660 /dev/raw/raw2
#chown oracle:dba /dev/sdc1
#chown oracle:dba /dev/sdc2
#chmod 660 /dev/sdc1
#chmod 660 /dev/sdc2

5). Start up CRS and the Oracle database and check the database; everything works fine after switching the raw disks!

3. ASM disk group migration
1). Mark the new disk sdc3 on one node
# /etc/init.d/oracleasm createdisk VOL2 /dev/sdc3
Marking disk “/dev/sdc3” as an ASM disk: [ OK ]

2). scan disk on the other node
[root@vanpgvmrac02 bin]# /etc/init.d/oracleasm scandisks
Scanning system for ASM disks: [ OK ]

3). Now verify the new disk was marked on both nodes
[root@vmrac01 disks]# /etc/init.d/oracleasm listdisks
VOL1
VOL2

[root@vmrac02 bin]# /etc/init.d/oracleasm listdisks
VOL1
VOL2

4). Add the new disk to the disk group (from the ASM instance)
$ export ORACLE_SID=+ASM1
$ sqlplus / as sysdba
SQL> alter diskgroup DG1 add disk 'ORCL:VOL2';
-- wait for rebalancing
SQL> select * from v$asm_operation;

5). Remove the old disk from the disk group
SQL> alter diskgroup DG1 drop disk VOL1;
-- wait until rebalancing finishes
SQL> select * from v$asm_operation;
GROUP_NUMBER OPERATION STATE POWER ACTUAL SOFAR EST_WORK EST_RATE EST_MINUTES
------------ --------- ----- ----- ------ ----- -------- -------- -----------
           1 REBAL     RUN       1      1     2     1374       30          45

6). Verify the database and ASM; everything is OK!

7). Clean up the old disk configurations
[root@vmrac01 bin]# /etc/init.d/oracleasm deletedisk VOL1
Removing ASM disk “VOL1”: [ OK ]
[root@vmrac01 bin]# /etc/init.d/oracleasm listdisks
VOL2

[root@vmrac02 ~]# /etc/init.d/oracleasm scandisks
Scanning system for ASM disks: [ OK ]
[root@vmrac02 ~]# /etc/init.d/oracleasm listdisks
VOL2

8). Wipe off the partitions on sdb.

Reference:
1. Exact Steps To Migrate ASM Diskgroups To Another SAN Without Downtime. [ID 837308.1]
2. Previous doc “VMRAC installation” task 130.2008.09.12
3. OCR / Vote disk Maintenance Operations: (ADD/REMOVE/REPLACE/MOVE), including moving from RAW Devices to Block Devices. [ID 428681.1]
4. ASM using ASMLib and Raw Devices
http://www.oracle-base.com/articles/10g/ASMUsingASMLibAndRawDevices.php
