DBA Sensation

January 31, 2013

SQL 2008 active-active cluster on Windows 2008 R2: shared storage migration

Filed under: 3. MS SQL Server — zhefeng @ 11:16 am

This is my proof-of-concept doc from before we migrated our production SQL 2008 cluster to new storage. It has two parts: the first simply builds the cluster on two ESX VM images using VMFS shared disks; the second covers the storage migration itself.

====Part 1: build sql 2008 cluster

IP address mapping (addresses omitted):
vmsql2008cls01 / vmsql2008cls02 (cluster nodes)
vmsql2008cls (cluster name)
vmsql2k8clsvip1 / vmsql2k8clsvip2 (SQL virtual servers)
vmsql2008clsdtc (clustered MSDTC)

Initialize the LUNs and label them:
G: OLTP (OLTP data volume)
H: OLAP (OLAP data volume)
I: Quorum (quorum disk)
R: msdtc (MSDTC volume)

1. Enable the "Application Server" role in the OS "Server Manager".
Check ".NET Framework 3.5.1", "Distributed Transactions – Incoming Remote Transactions" and "Distributed Transactions – Outgoing Remote Transactions".

2. Add the "Failover Clustering" feature.

3. Set up the cluster in Failover Cluster Manager; cluster IP/name: vmsql2008cls.

4. Set up the clustered MSDTC service.
Open Failover Cluster Manager; under "Services and Applications", add the "DTC" service using the name "vmsql2008clsdtc" (IP omitted here).

5. Install the first instance. Run from a command prompt: Setup.exe /ACTION=PrepareFailoverCluster (note: you have to use this setup option and run it on BOTH nodes).

1). Pick the instance features and shared features (except Reporting Services).

2). Default instance id: MSSQLSERVER

3). Cluster Security Policy – “Use service SIDs”

4). Server Configuration – use the same account for all SQL Server services ("pgdev\sevice.pg.prod") and keep all the default startup types (especially "manual" for the database engine).

6. Run setup to complete the cluster on the active node ONLY (the one that owns the storage).
Run under command prompt: setup.exe /ACTION=CompleteFailoverCluster
1). SQL Server Network Name for first instance: vmsql2k8clsvip1
1). SQL Server network name for the first instance: vmsql2k8clsvip1
2). Pick the "OLTP" disk.
3). IP address for the virtual server: (omitted)
4). Done with the installation. Verify it in Failover Cluster Manager.

7. Install the 2nd instance.
1). Run Setup.exe /ACTION=PrepareFailoverCluster on BOTH nodes.
2). Run setup.exe /ACTION=CompleteFailoverCluster on the active node ONLY.
3). Virtual IP and network name: vmsql2k8clsvip2
4). Pick the "OLAP" disk.
5). Done with the installation.
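For reference, the same two-phase install can also be driven unattended from the command line. The switch names below are from the SQL Server 2008 command-line setup documentation; the account, password, IP address, and disk values are placeholders for this environment:

```
rem On BOTH nodes: prepare the instance binaries
Setup.exe /Q /ACTION=PrepareFailoverCluster /FEATURES=SQLEngine ^
  /INSTANCENAME=MSSQLSERVER /INSTANCEID=MSSQLSERVER ^
  /SQLSVCACCOUNT="pgdev\sevice.pg.prod" /SQLSVCPASSWORD="***"

rem On the ACTIVE node only: complete the cluster
Setup.exe /Q /ACTION=CompleteFailoverCluster /INSTANCENAME=MSSQLSERVER ^
  /FAILOVERCLUSTERNETWORKNAME=vmsql2k8clsvip1 ^
  /FAILOVERCLUSTERDISKS="OLTP" ^
  /FAILOVERCLUSTERIPADDRESSES="IPv4;x.x.x.x;Cluster Network 1;255.255.255.0" ^
  /SQLSYSADMINACCOUNTS="pgdev\sevice.pg.prod"
```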

====Part 2: migrating from old LUNs to new LUNs

8. Add new disks on Node1 (V, W, X, Y) and plan their mapping drives:
V -> G
W -> H
X -> I
Y -> R

9. In Failover Cluster Manager, use "Add disk" to add all the new disks to the storage catalog.

10. Move each new disk to its corresponding service or application, except the quorum disk (right-click the disk, "More actions" -> "Move this resource to another service or application").

11. Stop your clustered application(s) (the virtual SQL instances in this case) and copy the data from the old drives to the new drives (including the MSDTC drive).
1). In Failover Cluster Management, take all SQL instances and the MSDTC service offline. This stops the SQL services and releases any open handles on the SQL data files, so the data can be copied to the new drives in the order below:
G -> V
H -> W
R -> Y
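A minimal sketch of the copy itself, assuming robocopy is available on the node (/E copies subdirectories including empty ones, /COPYALL preserves ACLs, owners and timestamps):

```
robocopy G:\ V:\ /E /COPYALL
robocopy H:\ W:\ /E /COPYALL
robocopy R:\ Y:\ /E /COPYALL
```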

2). When the data copy is complete, change the drive letter of each existing data drive to a temporary letter, then set the drive letter of the corresponding new drive to the original letter. (If the original data drive was G:, the new drive must become G:, or SQL will not be able to find its files and will not start.)
So set the old drives to temporary letters first:
G -> O
H -> P
R -> Q
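The re-lettering in steps 11 and 12 can also be scripted with DISKPART instead of the GUI. A sketch for the OLTP pair; the volume numbers are placeholders, so check yours with "list volume" first:

```
rem Park the old OLTP drive on its temp letter, then promote the new one
select volume 1
remove letter=G
assign letter=O
select volume 5
remove letter=V
assign letter=G
```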

12. Then Map new drives to old drive letters:
V -> G
W -> H
Y -> R

13. Change the dependency for each application and service. For example, for the first 'SQL Server' instance, right-click -> "Properties" -> Dependencies -> change it to "Cluster Disk 5" (the new disk).

14. Delete the MSDTC resource and recreate it.
1). Delete the resource "MSDTC" in the DTC service catalog.
2). Add the resource "MSDTC" back to the catalog:
right-click the DTC service icon on the left -> "Add a resource" -> "More resources" -> "2 - Add Distributed Transaction Coordinator".
3). Add the dependencies for the new DTC resource:
right-click "Microsoft Distributed Transaction Coordinator" -> "Properties" -> "Dependencies" -> add the virtual server name AND the new cluster disk as dependencies for this resource.
4). Bring it online.

15. Move the quorum.
1). Right-click the cluster name under "Failover Cluster Manager" -> "More actions" -> "Configure cluster quorum settings".
2). Choose the default quorum configuration, "Node and Disk Majority".
3). Check the new cluster disk as the "storage witness".
4). Confirm and finish the reconfiguration.
5). There is no need to change the drive letter since it is the quorum, so the new quorum disk stays as "X".
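On Windows 2008 R2 the same quorum change can be scripted with the FailoverClusters PowerShell module. A sketch; the disk resource name is a placeholder, so check yours with Get-ClusterResource first:

```
Import-Module FailoverClusters
Set-ClusterQuorum -NodeAndDiskMajority "Cluster Disk 6"
```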

16. Bring all the old cluster disks offline,
that is: O, P, Q and I.

17. Bring all clustered application(s) online.

18. Verify that the services and applications are online and functional, and that failover works.

19. Delete the old disks in Failover Cluster Manager, and delete their partitions as well.

Reference:
1. How to Move a Windows 2008 Cluster to New SAN Storage

September 7, 2010

Oracle 10g ASM/RAW storage migration

Filed under: [RAC] — zhefeng @ 9:47 am

We want to migrate the whole shared storage from the old SAN to the new SAN without re-installing the whole Oracle RAC stack.

1. Current structure
## eth1-Public vmrac01 vmrac01.test.com vmrac02 vmrac02.test.com
## eth0-Private vmracprv01 vmracprv01.test.com vmracprv02 vmracprv02.test.com
## VIP vmracvip01 vmracvip01.test.com vmracvip02 vmracvip02.test.com

Both ORACLE_HOME directories are local.

Shared LUN layout (3 partitions: 2 x 256 MB for OCR & voting, 1 x 20 GB for ASM):
Disk /dev/sdb: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdb1 1 32 257008+ 83 Linux
/dev/sdb2 33 64 257040 83 Linux
/dev/sdb3 65 2610 20450745 83 Linux

OCR and Voting are on RAW device: /dev/sdb1 /dev/sdb2

ASM disks
bash-3.1$ export ORACLE_SID=+ASM1
bash-3.1$ asmcmd
ASMCMD> lsdg
State Type Rebal Unbal Sector Block AU Total_MB Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Name
MOUNTED EXTERN N N 512 4096 1048576 19971 17925 0 17925 0 DG1/

2. New storage (sdc, 10 GB)
1). new LUN added
[root@vmrac01 bin]# fdisk -l

Disk /dev/sda: 26.8 GB, 26843545600 bytes
255 heads, 63 sectors/track, 3263 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sda1 * 1 13 104391 83 Linux
/dev/sda2 14 535 4192965 82 Linux swap / Solaris
/dev/sda3 536 3263 21912660 83 Linux

Disk /dev/sdb: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdb1 1 32 257008+ 83 Linux
/dev/sdb2 33 64 257040 83 Linux
/dev/sdb3 65 2610 20450745 83 Linux

Disk /dev/sdc: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

2). Partition the new LUN into 3 partitions:
Disk /dev/sdc: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdc1 1 32 257008+ 83 Linux
/dev/sdc2 33 64 257040 83 Linux
/dev/sdc3 65 1305 9968332+ 83 Linux

3). Clone the data from the old raw disks.
** Shut down the database and CRS first to make sure nothing changes on the raw disks!
#dd if=/dev/raw/raw1 of=/dev/sdc1
514017+0 records in
514017+0 records out
263176704 bytes (263 MB) copied, 252.812 seconds, 1.0 MB/s

#dd if=/dev/raw/raw2 of=/dev/sdc2
514080+0 records in
514080+0 records out
263208960 bytes (263 MB) copied, 267.868 seconds, 983 kB/s
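The dd-then-verify pattern above can be exercised safely on plain files before touching real devices. A self-contained sketch (the temp files stand in for /dev/raw/raw1 and /dev/sdc1, which are not touched here), checking the clone with cksum before any re-binding:

```shell
#!/bin/sh
# Simulate the raw-device clone on ordinary files: make a 1 MiB
# "source device", dd it to a "target device", verify checksums match.
set -e
src=$(mktemp) ; dst=$(mktemp)
dd if=/dev/urandom of="$src" bs=1024 count=1024 2>/dev/null
dd if="$src" of="$dst" 2>/dev/null          # the clone step
sum_src=$(cksum < "$src") ; sum_dst=$(cksum < "$dst")
if [ "$sum_src" = "$sum_dst" ]; then status=verified; else status=mismatch; fi
echo "clone $status"
rm -f "$src" "$dst"
```

In the real migration, the same comparison (cksum or md5sum run against the old raw device and the new partition) confirms the copy before the udev rules are switched over.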

4). "Cheat" Oracle by re-binding the raw devices to the new disk on both nodes.
**old binding
Step1: add entries to /etc/udev/rules.d/60-raw.rules
ACTION=="add", KERNEL=="sdb1", RUN+="/bin/raw /dev/raw/raw1 %N"
ACTION=="add", KERNEL=="sdb2", RUN+="/bin/raw /dev/raw/raw2 %N"

Step2: for the mapping to take immediate effect, run the commands below
#raw /dev/raw/raw1 /dev/sdb1
#raw /dev/raw/raw2 /dev/sdb2

Step3: run the following commands and add them to the /etc/rc.local file
#chown oracle:dba /dev/raw/raw1
#chown oracle:dba /dev/raw/raw2
#chmod 660 /dev/raw/raw1
#chmod 660 /dev/raw/raw2
#chown oracle:dba /dev/sdb1
#chown oracle:dba /dev/sdb2
#chmod 660 /dev/sdb1
#chmod 660 /dev/sdb2

**new binding on both nodes
Step1: edit /etc/udev/rules.d/60-raw.rules
ACTION=="add", KERNEL=="sdc1", RUN+="/bin/raw /dev/raw/raw1 %N"
ACTION=="add", KERNEL=="sdc2", RUN+="/bin/raw /dev/raw/raw2 %N"

Step2: apply the mapping immediately
#raw /dev/raw/raw1 /dev/sdc1
#raw /dev/raw/raw2 /dev/sdc2

Step3: set permissions and edit /etc/rc.local
#chown oracle:dba /dev/raw/raw1
#chown oracle:dba /dev/raw/raw2
#chmod 660 /dev/raw/raw1
#chmod 660 /dev/raw/raw2
#chown oracle:dba /dev/sdc1
#chown oracle:dba /dev/sdc2
#chmod 660 /dev/sdc1
#chmod 660 /dev/sdc2
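After re-binding, the active mappings can be confirmed on each node with raw -qa. The output below is illustrative for this box (major 8, minors 33/34 correspond to sdc1/sdc2):

```
# raw -qa
/dev/raw/raw1:  bound to major 8, minor 33
/dev/raw/raw2:  bound to major 8, minor 34
```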

5). Start up CRS and the Oracle database and check the database; everything works fine after switching the raw disks!

3. ASM disk group migration
1). Mark the new disk sdc3 on one node
# /etc/init.d/oracleasm createdisk VOL2 /dev/sdc3
Marking disk "/dev/sdc3" as an ASM disk: [ OK ]

2). Scan for the disk on the other node
[root@vanpgvmrac02 bin]# /etc/init.d/oracleasm scandisks
Scanning system for ASM disks: [ OK ]

3). Now verify the new disk is visible on both nodes
[root@vmrac01 disks]# /etc/init.d/oracleasm listdisks

[root@vmrac02 bin]# /etc/init.d/oracleasm listdisks

4). Add the new disk to the disk group (in the ASM instance)
$export ORACLE_SID=+ASM1
$sqlplus / as sysdba
sql>alter diskgroup DG1 add disk VOL2;
-- wait for rebalancing to finish
sql>select * from v$asm_operation;

5). Remove the old disk from the disk group
sql>alter diskgroup DG1 drop disk VOL1;
-- wait until rebalancing has finished
sql>select * from v$asm_operation;
GROUP_NUMBER OPERA STAT POWER ACTUAL SOFAR EST_WORK EST_RATE EST_MINUTES
------------ ----- ---- ----- ------ ----- -------- -------- -----------
           1 REBAL RUN      1      1     2     1374       30          45
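Note that the add and drop can also be combined into a single statement so ASM runs only one rebalance instead of two. A sketch using the names above; the 'ORCL:VOL2' path assumes the ASMLib discovery string, and the power level is illustrative:

```sql
ALTER DISKGROUP DG1 ADD DISK 'ORCL:VOL2' DROP DISK VOL1 REBALANCE POWER 8;
-- monitor until the REBAL row disappears:
SELECT group_number, operation, state, est_minutes FROM v$asm_operation;
```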

6). Verify the database and ASM; everything is OK!

7). Clean up the old disk configuration
[root@vmrac01 bin]# /etc/init.d/oracleasm deletedisk VOL1
Removing ASM disk "VOL1": [ OK ]
[root@vmrac01 bin]# /etc/init.d/oracleasm listdisks

[root@vmrac02 ~]# /etc/init.d/oracleasm scandisks
Scanning system for ASM disks: [ OK ]
[root@vmrac02 ~]# /etc/init.d/oracleasm listdisks

8). Wipe off the partitions on sdb.

References:
1. Exact Steps To Migrate ASM Diskgroups To Another SAN Without Downtime [ID 837308.1]
2. Previous doc "VMRAC installation", task 130, 2008-09-12
3. OCR / Vote Disk Maintenance Operations (ADD/REMOVE/REPLACE/MOVE), including moving from raw devices to block devices [ID 428681.1]
4. ASM using ASMLib and Raw Devices

March 17, 2010

How to migrate a SQL cluster between domains

Filed under: 3. MS SQL Server — zhefeng @ 4:54 pm

Migrating a SQL Server 2005/2000 cluster from one domain to another is not supported by Microsoft. However, here is a way to do it anyway.
I used this method to migrate our production cluster successfully.

1. Current settings (active-passive mode)
Cluster name: clustest
Cluster management IP:
Virtual server Name: clustestvip1.test.com
Service account: test\service.product
Cluster group: test\sqlcluster
Node name: dbnode1.test.com
Physical IP:

Node name: dbnode2.test.com
Physical IP:

2. Objective
(1). To change the domain from test.com to test1.com
(2). To change the service account from test\service.acc to test1\service.act
(3). Make sure the SQL cluster is still functional after the change

3. Testing steps
(1). Take all cluster resources offline (except the quorum, which cannot be taken offline).
(2). Stop the cluster service on both nodes and change its startup type to "manual".
(3). Change the domain of each machine to the new domain and reboot.
(4). After the reboot, on each machine, change the cluster and SQL service accounts to accounts in the new domain (test\service.acc -> test1\service.act).
Note: according to MS KB 915846, it should be enough to "On one of the cluster nodes, use SQL Server Configuration Manager to change the service account to the new account."
That did not work for me; it raised an error, so the change has to be made on both nodes in services.msc.
(5). Add the cluster and SQL service accounts to the local Administrators group.
(6). Run gpedit.msc to grant the following rights to the new service account:
Path: “Local Computer Policy” -> “Windows Settings” -> “Security Settings” -> “Local Policies” ->”User Rights Assignment”
*Act as part of the operating system
*Adjust memory quotas for a process
*Back up files and directories.
*Debug programs
*Increase scheduling priority
*Increase quotas (not in windows 2003).
*Load and unload device drivers.
*Lock pages in memory
*Log on as a batch job
*Log on as a service
*Manage auditing and security log
*Replace a process level token
*Restore files and directories
(7). Update the domain group SIDs for the SQL services (skip this step for SQL 2000).
There are domain groups tied to the old domain name that cannot be changed in AD (you will not find them there; they only appear in the registry as SIDs under this branch: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.X\Setup).
So what we need to do is find the old group SID in the registry and replace it with the new group SID. Here are the steps:
a. Find the old cluster group SID by logging in as the old service account, then running "whoami /groups" at a command prompt.
b. In this case, the service account is service.acc and the cluster group is test\sqlcluster.
c. Find the SID for test\sqlcluster:
d. In the same way, find the new group SID for test1\sqlcluster:
e. Replace the old SID with the new SID under the branch (do a small search there): HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\
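A quicker way to read both SIDs without logging in as each account, assuming the wmic utility is present (group and account names as above):

```
wmic group where name="sqlcluster" get name,sid
wmic useraccount where name="service.act" get name,sid
```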
(8). Reboot both nodes.
(9). Start the cluster service on both machines (set it back to "automatic" startup again).
(10). Bring the cluster resources online.

References:
1. Changing the Domain for a SQL 2005 Cluster

2. How to move a Windows Server cluster from one domain to another

3. Best practices that you can use to set up domain groups and solutions to problems that may occur when you set up a domain group when you install a SQL Server 2005 failover cluster
