DBA Sensation

August 15, 2012

Use RMAN to restore/recover the database to remote host

Filed under: [backup and recovery] — zhefeng @ 10:10 am

0. background
The purpose of this test is to restore/recover a database to a remote host from an RMAN backup.
I am using one VMware Linux box with a single ORCL Oracle instance on ASM storage (to make things more complicated :))
On the source db I have a user "jehan" and a table "test" with 3 rows as below:
SQL> select * from jehan.test;


1. backup source db
[oracle@myrh5 trace]$ rman target /
Recovery Manager: Release – Production on Tue Aug 14 17:56:25 2012
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
connected to target database: ORCL (DBID=1270514474)

RMAN> backup database;

Starting backup at 14-AUG-12
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=155 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00007 name=+DATA/orcl/ds01.dbf
input datafile file number=00002 name=+DATA/orcl/sysaux01.dbf
input datafile file number=00001 name=+DATA/orcl/system01.dbf
input datafile file number=00006 name=+DATA/orcl/cms01.dbf
input datafile file number=00003 name=+DATA/orcl/undotbs01.dbf
input datafile file number=00005 name=+DATA/orcl/example01.dbf
input datafile file number=00004 name=+DATA/orcl/users01.dbf
channel ORA_DISK_1: starting piece 1 at 14-AUG-12
channel ORA_DISK_1: finished piece 1 at 14-AUG-12
piece handle=+DATA/orcl/backupset/2012_08_14/nnndf0_tag20120814t175642_0.387.791315803 tag=TAG20120814T175642 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:01:45
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current control file in backup set
including current SPFILE in backup set
channel ORA_DISK_1: starting piece 1 at 14-AUG-12
channel ORA_DISK_1: finished piece 1 at 14-AUG-12
piece handle=+DATA/orcl/backupset/2012_08_14/ncsnf0_tag20120814t175642_0.386.791315911 tag=TAG20120814T175642 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 14-AUG-12

2. delete the source db "ORCL" in DBCA

3. start the target database in nomount mode using the default init.ora (note: the target db does not need to exist beforehand)
[oracle@myrh5 trace]$ export ORACLE_SID=ORCL
[oracle@myrh5 trace]$ rman target /

Recovery Manager: Release – Production on Wed Aug 15 09:51:08 2012

Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.

connected to target database (not started)

RMAN> set dbid 1270514474;
RMAN> startup nomount

startup failed: ORA-01078: failure in processing system parameters
LRM-00109: could not open parameter file '/home/u01/app/oracle/product/11.2.0/dbhome_1/dbs/initORCL.ora'

starting Oracle instance without parameter file for retrieval of spfile
Oracle instance started

Total System Global Area 158662656 bytes

Fixed Size 2211448 bytes
Variable Size 92275080 bytes
Database Buffers 58720256 bytes
Redo Buffers 5455872 bytes

4. restore the spfile from backup to pfile
RMAN> restore spfile to pfile '/home/u01/app/oracle/product/11.2.0/dbhome_1/dbs/initORCL.ora' from '+DATA/orcl/backupset/2012_08_14/ncsnf0_tag20120814t175642_0.386.791315911';
Starting restore at 15-AUG-12
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=98 device type=DISK

channel ORA_DISK_1: restoring spfile from AUTOBACKUP +DATA/orcl/backupset/2012_08_14/ncsnf0_tag20120814t175642_0.386.791315911
channel ORA_DISK_1: SPFILE restore from AUTOBACKUP complete
Finished restore at 15-AUG-12

5. make the path for auditing according to the pfile parameter (*.audit_file_dest='/home/u01/app/oracle/admin/orcl/adump'); you have to do this, otherwise RMAN can't start the database in nomount
mkdir -p /home/u01/app/oracle/admin/orcl/adump

Note: if you want to put the control files in a different path, modify the pfile now

6. now start the database in nomount with the pfile (which provides the correct control file locations)
RMAN> startup nomount pfile='?/dbs/initORCL.ora';

7. now restore the controlfile
RMAN> restore controlfile from '+DATA/orcl/backupset/2012_08_14/ncsnf0_tag20120814t175642_0.386.791315911';
Starting restore at 15-AUG-12
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=11 device type=DISK

channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:03
output file name=+DATA/orcl/control01.ctl
output file name=+DATA/orcl/control02.ctl
Finished restore at 15-AUG-12

8. now that we have the control files, we can mount the database
RMAN> set dbid 1270514474;

executing command: SET DBID

RMAN> alter database mount;

database mounted
released channel: ORA_DISK_1

9. Let the restore begin!
RMAN> restore database;

Starting restore at 15-AUG-12
Starting implicit crosscheck backup at 15-AUG-12
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=11 device type=DISK
Crosschecked 4 objects
Finished implicit crosscheck backup at 15-AUG-12

Starting implicit crosscheck copy at 15-AUG-12
using channel ORA_DISK_1
Finished implicit crosscheck copy at 15-AUG-12

searching for all files in the recovery area
cataloging files…
cataloging done

List of Cataloged Files
File Name: +data/ORCL/archivelog/2012_08_14/thread_1_seq_315.385.791317055
File Name: +data/ORCL/archivelog/2012_08_14/thread_1_seq_316.384.791317069
File Name: +data/ORCL/archivelog/2012_08_14/thread_1_seq_317.383.791317085
File Name: +data/ORCL/archivelog/2012_08_14/thread_1_seq_318.382.791317099
File Name: +data/ORCL/archivelog/2012_08_14/thread_1_seq_319.381.791317113
File Name: +data/ORCL/archivelog/2012_08_14/thread_1_seq_320.380.791317127
File Name: +data/ORCL/archivelog/2012_08_14/thread_1_seq_321.379.791317139
File Name: +data/ORCL/archivelog/2012_08_14/thread_1_seq_322.378.791317153
File Name: +data/ORCL/archivelog/2012_08_14/thread_1_seq_323.377.791317171
File Name: +data/ORCL/archivelog/2012_08_14/thread_1_seq_324.376.791334027
File Name: +data/ORCL/BACKUPSET/2012_08_14/ncsnf0_TAG20120814T175642_0.386.791315911

using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00001 to +DATA/orcl/system01.dbf
channel ORA_DISK_1: restoring datafile 00002 to +DATA/orcl/sysaux01.dbf
channel ORA_DISK_1: restoring datafile 00003 to +DATA/orcl/undotbs01.dbf
channel ORA_DISK_1: restoring datafile 00004 to +DATA/orcl/users01.dbf
channel ORA_DISK_1: restoring datafile 00005 to +DATA/orcl/example01.dbf
channel ORA_DISK_1: restoring datafile 00006 to +DATA/orcl/cms01.dbf
channel ORA_DISK_1: restoring datafile 00007 to +DATA/orcl/ds01.dbf
channel ORA_DISK_1: reading from backup piece +DATA/orcl/backupset/2012_08_14/nnndf0_tag20120814t175642_0.387.791315803
channel ORA_DISK_1: piece handle=+DATA/orcl/backupset/2012_08_14/nnndf0_tag20120814t175642_0.387.791315803 tag=TAG20120814T175642
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:03:25
Finished restore at 15-AUG-12

Note: to restore the datafiles to a different location, you have to map the paths with SET NEWNAME like this (not needed for temp tablespace datafiles, which are recreated). After the restore is done, also run "SWITCH DATAFILE ALL;" so the RMAN catalog in the control file points at the restored copies:
set newname for datafile 1 to '/u01/oradata/system01.dbf';
set newname for datafile 2 to '/u01/oradata/sysaux01.dbf';
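Putting the note together, a relocating restore can be sketched as one RMAN run block (the paths here are illustrative, not from the original backup):

```
run {
  set newname for datafile 1 to '/u01/oradata/system01.dbf';
  set newname for datafile 2 to '/u01/oradata/sysaux01.dbf';
  # ... one SET NEWNAME per datafile ...
  restore database;
  switch datafile all;   # update the control file to the new locations
  recover database;
}
```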

10. recover the database
RMAN> recover database;

Starting recover at 15-AUG-12
using channel ORA_DISK_1

starting media recovery

archived log for thread 1 with sequence 315 is already on disk as file +DATA/orcl/archivelog/2012_08_14/thread_1_seq_315.385.791317055
archived log for thread 1 with sequence 316 is already on disk as file +DATA/orcl/archivelog/2012_08_14/thread_1_seq_316.384.791317069
archived log for thread 1 with sequence 317 is already on disk as file +DATA/orcl/archivelog/2012_08_14/thread_1_seq_317.383.791317085
archived log for thread 1 with sequence 318 is already on disk as file +DATA/orcl/archivelog/2012_08_14/thread_1_seq_318.382.791317099
archived log for thread 1 with sequence 319 is already on disk as file +DATA/orcl/archivelog/2012_08_14/thread_1_seq_319.381.791317113
archived log for thread 1 with sequence 320 is already on disk as file +DATA/orcl/archivelog/2012_08_14/thread_1_seq_320.380.791317127
archived log for thread 1 with sequence 321 is already on disk as file +DATA/orcl/archivelog/2012_08_14/thread_1_seq_321.379.791317139
archived log for thread 1 with sequence 322 is already on disk as file +DATA/orcl/archivelog/2012_08_14/thread_1_seq_322.378.791317153
archived log for thread 1 with sequence 323 is already on disk as file +DATA/orcl/archivelog/2012_08_14/thread_1_seq_323.377.791317171
archived log for thread 1 with sequence 324 is already on disk as file +DATA/orcl/archivelog/2012_08_14/thread_1_seq_324.376.791334027
archived log file name=+DATA/orcl/archivelog/2012_08_14/thread_1_seq_315.385.791317055 thread=1 sequence=315
archived log file name=+DATA/orcl/archivelog/2012_08_14/thread_1_seq_316.384.791317069 thread=1 sequence=316
archived log file name=+DATA/orcl/archivelog/2012_08_14/thread_1_seq_317.383.791317085 thread=1 sequence=317
archived log file name=+DATA/orcl/archivelog/2012_08_14/thread_1_seq_318.382.791317099 thread=1 sequence=318
archived log file name=+DATA/orcl/archivelog/2012_08_14/thread_1_seq_319.381.791317113 thread=1 sequence=319
archived log file name=+DATA/orcl/archivelog/2012_08_14/thread_1_seq_320.380.791317127 thread=1 sequence=320
archived log file name=+DATA/orcl/archivelog/2012_08_14/thread_1_seq_321.379.791317139 thread=1 sequence=321
archived log file name=+DATA/orcl/archivelog/2012_08_14/thread_1_seq_322.378.791317153 thread=1 sequence=322
archived log file name=+DATA/orcl/archivelog/2012_08_14/thread_1_seq_323.377.791317171 thread=1 sequence=323
archived log file name=+DATA/orcl/archivelog/2012_08_14/thread_1_seq_324.376.791334027 thread=1 sequence=324
unable to find archived log
archived log thread=1 sequence=325
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 08/15/2012 10:40:17
RMAN-06054: media recovery requesting unknown archived log for thread 1 with sequence 325 and starting SCN of 7484710

Note: the last error is fine: the backup contains no redo past sequence 324, so recovery asks for an archived log that never existed. After mounting the database, we can use SET UNTIL SCN or SET UNTIL TIME to specify the point to recover to and avoid this error.
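A sketch of the same recovery with an explicit stopping point (the sequence number is taken from the log listing above; adjust for your own backup):

```
run {
  set until sequence 325 thread 1;   # recover through sequence 324
  restore database;
  recover database;
}
```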

11. open database
RMAN> alter database open resetlogs;

database opened

Note: from 11gR2, after open database resetlogs, system will automatically create online redo log file and temp datafile.

12. Verify the data
[oracle@myrh5 trace]$ sqlplus / as sysdba

SQL*Plus: Release Production on Wed Aug 15 10:47:53 2012

Copyright (c) 1982, 2009, Oracle. All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release – 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options

SQL> select * from jehan.test;


April 12, 2012

Set Oracle SGA > 256GB

Filed under: [Installation] — zhefeng @ 2:06 pm

I had an installation request for Oracle 11gR2 on a server with 2TB of memory. The installation failed in DBCA with complaints about not being able to reach shared memory.

Checking Metalink turned up no solution. A colleague told me he had hit the same issue before; Oracle told him to set the SGA to less than 256 GB as a "workaround".

I followed the "workaround" and continued my installation. Later I did some research and found this:



The swap and kernel parameters were all adjusted as recommended by Oracle. Investigating further, the issue turns out to be caused by the prelink command: it calculates shared-library load addresses and updates the shared libraries with them. The simplest thing to do is to undo what prelink did, and disable it:
prelink -ua
sed -i 's/PRELINKING=yes/PRELINKING=no/' /etc/sysconfig/prelink
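The sed edit can be exercised safely on a scratch copy before touching the real /etc/sysconfig/prelink (the path below is a temp file, not the real config):

```shell
# Demonstrate the PRELINKING toggle on a throwaway copy of the config
tmp=$(mktemp)
printf 'PRELINKING=yes\n' > "$tmp"
sed -i 's/PRELINKING=yes/PRELINKING=no/' "$tmp"
cat "$tmp"          # now reads PRELINKING=no
rm -f "$tmp"
```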


From: https://support.oracle.com/CSP/ui/flash.html#tab=KBHome%28page=KBHome&id=%28%29%29,%28page=KBNavigator&id=%28bmDocTitle=Why%20not%20able%20to%20allocate%20a%20more%20SGA%20than%20193G%20on%20Linux%2064?&from=BOOKMARK&bmDocType=HOWTO&bmDocID=1241284.1&viewingMode=1143&bmDocDsrc=KB%29%29

Doc ID: 1241284.1

I haven't tried it yet. Anyone having the same problem can give it a try and let me know.

March 2, 2011

Recreating spfile on ASM storage from pfile

Filed under: [backup and recovery] — zhefeng @ 2:46 pm

Sometimes when you have screwed up the parameters, you need to use a pfile as a stepping stone to undo the changes in the spfile. What happens if your spfile sits on ASM storage? Here is a workaround.

1. try to screw up the db parameters
SQL> show parameter memory

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
hi_shared_memory_address integer 0
memory_max_target big integer 1520M
memory_target big integer 1520M
shared_memory_address integer 0
SQL> alter system set memory_max_target=0 scope=spfile;
System altered.

2. now bounce the instance, db will complain about the new settings
SQL> shutdown
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORA-01078: failure in processing system parameters
ORA-00837: Specified value of MEMORY_TARGET greater than MEMORY_MAX_TARGET

3. in my case the spfile sits on ASM
ASMCMD> ls -l spfile*
Type Redund Striped Time Sys Name
N spfileorcl.ora => +DATA/ORCL/PARAMETERFILE/spfile.267.744731331

4. what we need to do: create a pfile from the spfile, modify the bad parameter back to a valid value, then start the db from the pfile
1). With the db down, create a pfile from the spfile:
SQL> create pfile from spfile='+DATA/orcl/spfileorcl.ora';
2). modify the value in the pfile 'initorcl.ora'
$ vi initorcl.ora
3). startup the db with the pfile
SQL> startup mount --now it will use the pfile

5. create the new spfile to ASM storage from “good” pfile
SQL> create spfile=’+DATA/ORCL/spfileorcl.ora’ from pfile;
File created.

6. notice that the file name in ASM storage has changed, which means we just got a new spfile:
ASMCMD> ls -l spfile*
Type Redund Striped Time Sys Name
N spfileorcl.ora => +DATA/ORCL/PARAMETERFILE/spfile.267.744733351

7. now change the pfile back to being the "bootstrap" that points at the correct spfile
$ cat initorcl.ora
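The output of the cat is not shown in the original post; by convention such a bootstrap pfile holds a single SPFILE entry like the following (an assumption based on the spfile path used above):

```
SPFILE='+DATA/orcl/spfileorcl.ora'
```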

8. restart the database, it will pickup the correct spfile again
$ sqlplus / as sysdba
SQL> startup
ORACLE instance started.

Total System Global Area 1586708480 bytes
Fixed Size 2213736 bytes
Variable Size 973080728 bytes
Database Buffers 603979776 bytes
Redo Buffers 7434240 bytes
Database mounted.
Database opened.

SQL> show parameter spfile

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile string +DATA/orcl/spfileorcl.ora

SQL> show parameter memory

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
hi_shared_memory_address integer 0
memory_max_target big integer 1520M
memory_target big integer 1520M
shared_memory_address integer 0

September 29, 2010

root.sh failed on 2nd node when installing Grid Infrastructure

Filed under: [RAC] — zhefeng @ 12:39 pm

when I was running root.sh as the last step of the Grid Infrastructure installation on the second node, it failed with the following errors (it had succeeded on the 1st node):
DiskGroup DATA1 creation failed with the following message:
ORA-15018: diskgroup cannot be created
ORA-15072: command requires at least 1 regular failure groups, discovered only 0

Oracle gives the reason: when you are using multipathed storage for ASM, you have to pre-configure the oracleasm file as below:

On all nodes,

1. Modify the /etc/sysconfig/oracleasm with:
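The exact settings were omitted from the original post; for multipath devices the commonly documented change is to make ASMLib scan the multipath devices and skip the underlying single paths. Treat these values as an assumption and verify them against the MOS note for your multipath driver:

```
# /etc/sysconfig/oracleasm -- assumed values for device-mapper multipath
ORACLEASM_SCANORDER="dm"
ORACLEASM_SCANEXCLUDE="sd"
```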


2. restart ASMLib (on all nodes except the 1st):
# /etc/init.d/oracleasm restart

3. deconfigure the root.sh settings on all nodes except the 1st:
$GRID_HOME/crs/install/rootcrs.pl -verbose -deconfig -force

4. Run root.sh again on the 2nd node (or other nodes)

Oracle Metalink Doc:

September 27, 2010

how to deinstall the failed 11gR2 grid infrastructure

Filed under: [RAC] — zhefeng @ 10:39 am

Two parts are involved: first deconfigure, then deinstall

Deconfigure and Reconfigure of Grid Infrastructure Cluster:

Identify the cause of the root.sh failure by reviewing the logs in $GRID_HOME/cfgtoollogs/crsconfig and $GRID_HOME/log. Once the cause is identified, deconfigure and reconfigure with the steps below; please keep in mind that you need to wait until each step finishes successfully before moving to the next one:

For Steps 1 and 2, you can skip node(s) on which you didn't execute root.sh yet.

Step 1: As root, run "$GRID_HOME/crs/install/rootcrs.pl -verbose -deconfig -force" on all nodes, except the last one.

Step 2: As root, run "$GRID_HOME/crs/install/rootcrs.pl -verbose -deconfig -force -lastnode" on the last node. This command will zero out the OCR and VD disk also.

Step 3: As root, run $GRID_HOME/root.sh on first node

Step 4: As root, run $GRID_HOME/root.sh on all other node(s), except last one.
Step 5: As root, run $GRID_HOME/root.sh on last node.

Deinstall of Grid Infrastructure Cluster:

Case 1: “root.sh” never ran on this cluster, then as grid user, execute $GRID_HOME/deinstall/deinstall

Case 2: "root.sh" already ran; then follow the steps below, and keep in mind that you need to wait until each step finishes successfully before moving to the next one:

Step 1: As root, run "$GRID_HOME/crs/install/rootcrs.pl -verbose -deconfig -force" on all nodes, except the last one.

Step 2: As root, run "$GRID_HOME/crs/install/rootcrs.pl -verbose -deconfig -force -lastnode" on the last node. This command will zero out the OCR and VD disk also.

Step 3: As grid user, run $GRID_HOME/deinstall/deinstall

September 7, 2010

Oracle 10g ASM/RAW storage migration

Filed under: [RAC] — zhefeng @ 9:47 am

We want to migrate the whole shared storage from the old SAN to a new SAN without re-installing the whole Oracle RAC.

1.Current structure
## eth1-Public vmrac01 vmrac01.test.com vmrac02 vmrac02.test.com
## eth0-Private vmracprv01 vmracprv01.test.com vmracprv02 vmracprv02.test.com
## VIP vmracvip01 vmracvip01.test.com vmracvip02 vmracvip02.test.com

Both ORACLE_HOME are local:

Shared LUN display (3 partitions, 2*256M for OCR&VOTING, 1*20G for ASM)
Disk /dev/sdb: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdb1 1 32 257008+ 83 Linux
/dev/sdb2 33 64 257040 83 Linux
/dev/sdb3 65 2610 20450745 83 Linux

OCR and Voting are on RAW device: /dev/sdb1 /dev/sdb2

ASM disks
bash-3.1$ export ORACLE_SID=+ASM1
bash-3.1$ asmcmd
ASMCMD> lsdg
State Type Rebal Unbal Sector Block AU Total_MB Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Name
MOUNTED EXTERN N N 512 4096 1048576 19971 17925 0 17925 0 DG1/

2. New storage (sdc 10G)
1). new LUN added
[root@vmrac01 bin]# fdisk -l

Disk /dev/sda: 26.8 GB, 26843545600 bytes
255 heads, 63 sectors/track, 3263 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sda1 * 1 13 104391 83 Linux
/dev/sda2 14 535 4192965 82 Linux swap / Solaris
/dev/sda3 536 3263 21912660 83 Linux

Disk /dev/sdb: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdb1 1 32 257008+ 83 Linux
/dev/sdb2 33 64 257040 83 Linux
/dev/sdb3 65 2610 20450745 83 Linux

Disk /dev/sdc: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

2). Partition the new LUN to 3 partitions
Disk /dev/sdc: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdc1 1 32 257008+ 83 Linux
/dev/sdc2 33 64 257040 83 Linux
/dev/sdc3 65 1305 9968332+ 83 Linux

3). clone data from previous raw disks
**shutdown the db and crs first to make sure there are no changes to the raw disks!
#dd if=/dev/raw/raw1 of=/dev/sdc1
514017+0 records in
514017+0 records out
263176704 bytes (263 MB) copied, 252.812 seconds, 1.0 MB/s

#dd if=/dev/raw/raw2 of=/dev/sdc2
514080+0 records in
514080+0 records out
263208960 bytes (263 MB) copied, 267.868 seconds, 983 kB/s
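The dd clone pattern above can be exercised on plain files to confirm it produces a byte-for-byte copy (temp files stand in for the raw devices here):

```shell
# Stand-in for: dd if=/dev/raw/rawN of=/dev/sdcN
src=$(mktemp); dst=$(mktemp)
dd if=/dev/urandom of="$src" bs=1024 count=256 2>/dev/null  # fake 256K "raw device"
dd if="$src" of="$dst" bs=1024 2>/dev/null                  # block-for-block clone
cmp -s "$src" "$dst" && echo "clone verified"
rm -f "$src" "$dst"
```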

4). "cheat" Oracle by re-binding the raw devices to the new disk on both nodes
**old binding
Step1: add entries to /etc/udev/rules.d/60-raw.rules
ACTION=="add", KERNEL=="sdb1", RUN+="/bin/raw /dev/raw/raw1 %N"
ACTION=="add", KERNEL=="sdb2", RUN+="/bin/raw /dev/raw/raw2 %N"

Step2: For the mapping to have immediate effect, run below command
#raw /dev/raw/raw1 /dev/sdb1
#raw /dev/raw/raw2 /dev/sdb2

Step3: Run the following commands and add them to the /etc/rc.local file.
#chown oracle:dba /dev/raw/raw1
#chown oracle:dba /dev/raw/raw2
#chmod 660 /dev/raw/raw1
#chmod 660 /dev/raw/raw2
#chown oracle:dba /dev/sdb1
#chown oracle:dba /dev/sdb2
#chmod 660 /dev/sdb1
#chmod 660 /dev/sdb2

**new binding on both node
Step1: editing /etc/udev/rules.d/60-raw.rules
ACTION=="add", KERNEL=="sdc1", RUN+="/bin/raw /dev/raw/raw1 %N"
ACTION=="add", KERNEL=="sdc2", RUN+="/bin/raw /dev/raw/raw2 %N"

Step2: mapping immediately
#raw /dev/raw/raw1 /dev/sdc1
#raw /dev/raw/raw2 /dev/sdc2

Step3:permission and edit /etc/rc.local
#chown oracle:dba /dev/raw/raw1
#chown oracle:dba /dev/raw/raw2
#chmod 660 /dev/raw/raw1
#chmod 660 /dev/raw/raw2
#chown oracle:dba /dev/sdc1
#chown oracle:dba /dev/sdc2
#chmod 660 /dev/sdc1
#chmod 660 /dev/sdc2

5). startup crs and the oracle db, then check the database: everything works fine after switching the raw disks!

3. ASM disk group migration
1). Mark the new disk sdc3 on one node
# /etc/init.d/oracleasm createdisk VOL2 /dev/sdc3
Marking disk “/dev/sdc3” as an ASM disk: [ OK ]

2). scan disk on the other node
[root@vanpgvmrac02 bin]# /etc/init.d/oracleasm scandisks
Scanning system for ASM disks: [ OK ]

3). now verify the new disk was marked on both node
[root@vmrac01 disks]# /etc/init.d/oracleasm listdisks

[root@vmrac02 bin]# /etc/init.d/oracleasm listdisks

4). add new disk to DISKGROUP (under asm instance)
$export ORACLE_SID=+ASM1
$sqlplus / as sysdba
sql>alter diskgroup DG1 add disk VOL2;
--wait for rebalancing
sql>select * from v$asm_operation;

5). remove old disk from DISKGROUP
sql>alter diskgroup DG1 drop disk VOL1;
--wait until rebalancing finished
sql>select * from v$asm_operation;
GROUP_NUMBER OPERA STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES
------------ ----- ---- ---------- ---------- ---------- ---------- ---------- -----------
           1 REBAL RUN           1          1          2       1374         30          45

6). verify the database and asm, everything is ok!

7). clean-up the old disk configurations
[root@vmrac01 bin]# /etc/init.d/oracleasm deletedisk VOL1
Removing ASM disk “VOL1”: [ OK ]
[root@vmrac01 bin]# /etc/init.d/oracleasm listdisks

[root@vmrac02 ~]# /etc/init.d/oracleasm scandisks
Scanning system for ASM disks: [ OK ]
[root@vmrac02 ~]# /etc/init.d/oracleasm listdisks

8). wipe-off the partitions for sdb.

1. Exact Steps To Migrate ASM Diskgroups To Another SAN Without Downtime. [ID 837308.1]
2. Previous doc “VMRAC installation” task 130.2008.09.12
3. OCR / Vote disk Maintenance Operations: (ADD/REMOVE/REPLACE/MOVE), including moving from RAW Devices to Block Devices. [ID 428681.1]
4. ASM using ASMLib and Raw Devices

June 9, 2010

Sth. about checkpoint

Filed under: 1. Oracle, [System Performance tuning] — zhefeng @ 2:31 pm

Reading an article about checkpoints on Metalink (Checkpoint Tuning and Troubleshooting Guide [ID 147468.1]).

Here are some good points for checkpoint:

Oracle writes the dirty buffers to disk only under certain conditions:
– A shadow process must scan more than one-quarter of the db_block_buffers.
– Every three seconds.
– When a checkpoint is produced.

A checkpoint is realized on five types of events:
– At each switch of the redo log files.
– When the delay for LOG_CHECKPOINT_TIMEOUT is reached.
– When the size corresponding to LOG_CHECKPOINT_INTERVAL (a count of OS blocks) is written on the current redo log file.
– Directly by the ALTER SYSTEM SWITCH LOGFILE command.
– Directly with the ALTER SYSTEM CHECKPOINT command.

During a checkpoint the following occurs:
– The database writer (DBWR) writes all modified database
blocks in the buffer cache back to datafiles,
– Checkpoint process (ckpt) updates the headers of all
the datafiles to indicate when the last checkpoint
occurred (SCN)

May 25, 2010

Can’t compile a stored procedure when it’s locked

Filed under: 1. Oracle, [PL/SQL dev&tuning] — zhefeng @ 10:25 am

Trying to recompile a procedure causes the application to hang
(i.e.: SQL*Plus hangs after submitting the statement). Eventually ORA-4021 errors
occur after the timeout (usually 5 minutes). Here is the solution from Metalink:
Note: ID 107756.1

Error: ORA 4021
Text: time-out occurred while waiting to lock object
Cause: While trying to lock a library object, a time-out occurred.
Action: Retry the operation later.

Solution Description

Verify that the package is not locked by another user by selecting from
V$ACCESS view. To do this, run:

SELECT * FROM v$access WHERE object = '<PACKAGE_NAME>';

where <PACKAGE_NAME> is the package name (usually in all uppercase). If there is a row
returned, then the package is already locked and cannot be dropped until the
lock is released. The query above also returns the SID that holds the lock;
you can then use this to find out which session has obtained the lock.
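Given the SID from V$ACCESS, a lookup along these lines (illustrative) shows who holds the lock:

```
-- <sid> is the value returned by the V$ACCESS query above
SELECT s.sid, s.serial#, s.username, s.program
FROM   v$session s
WHERE  s.sid = <sid>;
```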

In some cases, that session might have been killed and will not show up. If
this happens, the lock will not be released immediately. Waiting for PMON to
clean up the lock might take some time. The fastest way to clean up the lock
is to recycle the database instance.

If an ORA-4021 error is not returned and the command continues to hang after
issuing the CREATE OR REPLACE or DROP statement, you will need to do further
analysis to see where the hang is occurring. A starting point is to have a
look in v$session_wait; see the referenced NOTE.61552.1 for how to analyze hang
situations in general.

Solution Explanation

Consider the following example:

Session 1:

create or replace procedure lockit(secs in number) as
shuttime date;
begin
shuttime := sysdate + secs/(24*60*60);
while sysdate <= shuttime loop
null;
end loop;
end;
/
show err
execute lockit(600);

— wait 10 minutes

Session 2:
create or replace procedure lockit as

Result: hang and eventually (the timeout is 5 minutes):

create or replace procedure lockit as
ERROR at line 1:
ORA-04021: timeout occurred while waiting to lock object LOCKIT

Session 3:

connect / as sysdba
col owner for a10
col object for a15
select * from v$access where object = 'LOCKIT';

———- ———- ————— ————————

select sid, event from v$session_wait;


———- —————————————————————-
9 null event

12 library cache pin

In the above result, the blocking sid 9 waits for nothing while session 12, the
hanging session, is waiting for event library cache pin.

March 12, 2010

Why Isn’t Oracle Using My Index?!

Filed under: [System Performance tuning] — zhefeng @ 4:02 pm

By Jonathan Lewis

The question in the title of this piece is probably the single most frequently occurring question that appears in the Metalink forums and Usenet newsgroups. This article uses a test case that you can rebuild on your own systems to demonstrate the most fundamental issues with how cost-based optimisation works. And at the end of the article, you should be much better equipped to give an answer the next time you hear that dreaded question.

Because of the wide variety of options that are available when installing Oracle, it isn't usually safe to predict exactly what will happen when someone runs a script that you have dictated to them. But I'm going to risk it, in the hope that your database is a fairly vanilla installation, with the default values for the most commonly tweaked parameters. The example has been built and tested on an 8.1.7 database with the db_block_size set to the commonly used value of 8K and the db_file_multiblock_read_count set to the equally commonly used value 8. The results may be a little different under Oracle 9.2.

Run the script from Figure 1, which creates a couple of tables, then indexes and analyses them.

create table t1 as
select
trunc((rownum-1)/15) n1,
trunc((rownum-1)/15) n2,
rpad('x', 215) v1
from all_objects
where rownum <= 3000;

create table t2 as
select
mod(rownum,200) n1,
mod(rownum,200) n2,
rpad('x',215) v1
from all_objects
where rownum <= 3000;

create index t1_i1 on t1(n1);
create index t2_i1 on t2(n1);

analyze table t1 compute statistics;
analyze table t2 compute statistics;

Figure 1: The test data sets.

Once you have got this data in place, you might want to convince yourself that the two sets of data are identical — in particular, that the N1 columns in both data sets have values ranging from 0 to 199, with 15 occurrences of each value. You might try the following check:

select n1, count(*)
from t1
group by n1;

and the matching query against T2 to prove the point.

If you then execute the queries:

select * from t1 where n1 = 45;
select * from t2 where n1 = 45;

You will find that each query returns 15 rows. However if you

set autotrace traceonly explain

you will discover that the two queries have different execution paths.

The query against table T1 uses the index, but the query against table T2 does a full tablescan.

So you have two sets of identical data, with dramatically different access paths for the same query.
What Happened to the Index?

Note: if you've ever come across any of those "magic number" guidelines regarding the use of indexes, e.g., "Oracle will use an index for less than 23 percent, 10 percent, 2 percent (pick number at random) of the data," then you may at this stage begin to doubt their validity. In this example, Oracle has used a tablescan for 15 rows out of 3,000, i.e., for just one half of one percent of the data!

To investigate problems like this, there is one very simple ploy that I always try as the first step: Put in some hints to make Oracle do what I think it ought to be doing, and see if that gives me any clues.

In this case, a simple hint:

/*+ index(t2, t2_i1) */

is sufficient to switch Oracle from the full tablescan to the indexed access path. The three paths with costs (abbreviated to C=nnn) are shown in Figure 2:

select * from t1 where n1 = 45;


select * from t2 where n1 = 45;


select /*+ index(t2 t2_i1) */ *
from t2
where n1 = 45;


Figure 2: The different queries and their costs.

So why hasn't Oracle used the index by default for the T2 query? Easy — as the execution plan shows, the cost of doing the tablescan is cheaper than the cost of using the index.
Why is the Tablescan Cheaper?

This, of course, is simply begging the question. Why is the cost of the tablescan cheaper than the cost of using the index?

By looking into this question, you uncover the key mechanisms (and critically erroneous assumptions) of the Cost Based Optimiser.

Let's start by examining the indexes by running the query:

select table_name, blevel, avg_data_blocks_per_key, avg_leaf_blocks_per_key, clustering_factor
from user_indexes;

The results are given in the table below:
T1 T2
Blevel 1 1
Data block / key 1 15
Leaf block / key 1 1
Clustering factor 96 3000

Note particularly the value for "data blocks per key." This is the number of different blocks in the table that Oracle thinks it will have to visit if you execute a query that contains an equality test on a complete key value for this index.

So where do the costs for our queries come from? As far as Oracle is concerned, if we fire in the key value 45, we get the data from table T1 by hitting one index leaf block and one table block — two blocks, so a cost of two.

If we try the same with table T2, we have to hit one index leaf block and 15 table blocks — a total of 16 blocks, so a cost of 16.

Clearly, according to this viewpoint, the index on table T1 is much more desirable than the index on table T2. This leaves two questions outstanding, though:

Where does the tablescan cost come from, and why are the figures for the avg_data_blocks_per_key so different between the two tables?

The answer to the second question is simple. Look back at the definition of table T1 — it uses the trunc() function to generate the N1 values, dividing "rownum - 1" by 15 and truncating.

trunc(675/15) = 45
trunc(676/15) = 45
...
trunc(689/15) = 45

All the rows with the value 45 do actually appear one after the other in a tight little clump (probably all fitting in one data block) in the table.

Table T2 uses the mod() function to generate the N1 values, using modulus 200 on the rownum:

mod(45,200) = 45
mod(245,200) = 45
...
mod(2845,200) = 45

The rows with the value 45 appear every two hundredth position in the table (probably resulting in no more than one row in every relevant block).

By doing the analyze, Oracle was able to get a perfect description of the data scatter in our table. So the optimiser was able to work out exactly how many blocks Oracle would have to visit to answer our query — and, in simple cases, the number of block visits is the cost of the query.
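The effect of the two generating functions on data scatter can be simulated outside the database. This is only a rough sketch: it assumes the 3,000-row, 96-block test-case figures quoted in the article and packs rows into blocks in insertion order — it is not what Oracle computes internally:

```python
# Rough simulation of data scatter: 3,000 rows packed ~31 rows per block
# (3,000 rows / 96 blocks), using the article's test-case figures.
ROWS, BLOCKS = 3000, 96
rows_per_block = ROWS // BLOCKS  # ~31

def blocks_holding(n1_values):
    """Distinct block numbers containing rows whose N1 value is 45."""
    return {row // rows_per_block for row, n1 in enumerate(n1_values) if n1 == 45}

trunc_n1 = [row // 15 for row in range(ROWS)]        # trunc((rownum - 1)/15)
mod_n1   = [(row + 1) % 200 for row in range(ROWS)]  # mod(rownum, 200)

print(len(blocks_holding(trunc_n1)))  # 2 blocks: the rows sit in one tight clump
print(len(blocks_holding(mod_n1)))    # 15 blocks: one row in every relevant block
```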
But Why the Tablescan?

So we see that an indexed access into T2 is more expensive than the same path into T1, but why has Oracle switched to the tablescan?

This brings us to the two simple-minded, and rather inappropriate, assumptions that Oracle makes.

The first is that every block acquisition equates to a physical disk read, and the second is that a multiblock read is just as quick as a single block read.

So what impact do these assumptions have on our experiment?

If you query the user_tables view with the following SQL:

select table_name, blocks
from user_tables;

you will find that our two tables each cover 96 blocks.

At the start of the article, I pointed out that the test case was running a version 8 system with the value 8 for the db_file_multiblock_read_count.

Roughly speaking, Oracle has decided that it can read the entire 96 block table in 96/8 = 12 disk read requests.

Since it takes 16 block (= disk read) requests to access the table by index, it is clearly quicker (from Oracle's sadly deluded perspective) to scan the table — after all, 12 is less than 16.
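As a quick sanity check of that comparison, using the figures above (a 96-block table, multiblock reads of 8 blocks, and 16 single-block reads for the indexed path):

```python
# The optimizer's naive comparison, as described in the article.
blocks, mbrc = 96, 8
tablescan_reads = blocks / mbrc   # 96 / 8 = 12 multiblock read requests
index_reads = 16                  # 1 leaf block + 15 table blocks

print(tablescan_reads < index_reads)  # True: the tablescan looks cheaper
```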

Voila! If the data you are targeting is suitably scattered across the table, you get tablescans even for a very small percentage of the data — a problem that can be exaggerated in the case of very big blocks and very small rows.

In fact, you will have noticed that my calculated number of scan reads was 12, whilst the cost reported in the execution plan was 15. It is a slight simplification to say that the cost of a tablescan (or an index fast full scan, for that matter) is:

'number of blocks' / 'adjusted db_file_multiblock_read_count'

Oracle uses an "adjusted" multi-block read value for the calculation (although it then tries to use the actual requested size when the scan starts to run).

For reference, the following table compares a few of the actual and adjusted values:
Actual    Adjusted
     4       4.175
     8       6.589
    16      10.398
    32      16.409
    64      25.895
   128      40.865

As you can see, Oracle makes some attempt to protect you from the error of supplying an unfeasibly large value for this parameter.
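Using the adjusted value for 8 from the table above, the reported tablescan cost of 15 can be reproduced — assuming the division is rounded up:

```python
import math

# Tablescan cost using the "adjusted" multiblock read value:
# cost = ceil(blocks / adjusted_mbrc), per the formula above.
adjusted = {4: 4.175, 8: 6.589, 16: 10.398, 32: 16.409, 64: 25.895, 128: 40.865}

blocks = 96
cost = math.ceil(blocks / adjusted[8])  # ceil(96 / 6.589) = ceil(14.57...) = 15
print(cost)  # 15 -- the cost reported in the execution plan
```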

There is a minor change in version 9, by the way, where the tablescan cost is further adjusted by adding one to the result of the division — which means tablescans in V9 are generally just a little more expensive than in V8, so indexes are just a little more likely to be used.

We have seen that there are two assumptions built into the optimizer that are not very sensible.

* A single block read costs just as much as a multi-block read — (not really likely, particularly when running on file systems without direct I/O)
* A block access will be a physical disk read — (so what is the buffer cache for?)

Since the early days of Oracle 8.1, there have been a couple of parameters that allow us to correct these assumptions in a reasonably truthful way.

See Tim Gorman's article for a proper description of these parameters, but briefly:

Optimizer_index_cost_adj takes a value between 1 and 10000 with a default of 100. Effectively, this parameter describes how cheap a single block read is compared to a multiblock read. For example the value 30 (which is often a suitable first guess for an OLTP system) would tell Oracle that a single block read costs 30% of a multiblock read. Oracle would therefore incline towards using indexed access paths for low values of this parameter.

Optimizer_index_caching takes a value between 0 and 100 with a default of 0. This tells Oracle to assume that that percentage of index blocks will be found in the buffer cache. In this case, setting values close to 100 encourages the use of indexes over tablescans.

The really nice thing about both these parameters is that they can be set to "truthful" values.

Set the optimizer_index_caching to something in the region of the "buffer cache hit ratio." (You have to make your own choice about whether this should be the figure derived from the default pool, keep pool, or both).

The optimizer_index_cost_adj is a little more complicated. Check the typical wait times in v$system_event for the events "db file scattered read" (multiblock reads) and "db file sequential read" (single block reads). Divide the latter by the former and multiply by one hundred.
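That recipe amounts to a one-line calculation. The wait times below are invented illustrative values, not measurements from any real system:

```python
# Deriving optimizer_index_cost_adj from v$system_event average waits,
# as described above. Illustrative numbers only.
avg_scattered_read_ms = 20.0   # "db file scattered read" (multiblock)
avg_sequential_read_ms = 6.0   # "db file sequential read" (single block)

optimizer_index_cost_adj = round(avg_sequential_read_ms / avg_scattered_read_ms * 100)
print(optimizer_index_cost_adj)  # 30: a single block read costs ~30% of a multiblock read
```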

Don't forget that the two parameters may need to be adjusted at different times of the day and week to reflect the end-user workload. You can't just derive one pair of figures, and use them for ever.

Happily, in Oracle 9, things have improved. You can now collect system statistics, which initially include just these four:

+ Average single block read time
+ Average multi block read time
+ Average actual multiblock read
+ Notional usable CPU speed.

Suffice it to say that this feature is worth an article in its own right — but do note that the first three allow Oracle to discover the truth about the cost of multi block reads. And in fact, the CPU speed allows Oracle to work out the CPU cost of unsuitable access mechanisms like reading every single row in a block to find a specific data value and behave accordingly.

When you migrate to version 9, one of the first things you should investigate is the correct use of system statistics. This one feature alone may reduce the amount of time you spend trying to "tune" awkward SQL.

In passing, despite the wonderful effect of system statistics, both of the optimizer adjusting parameters still apply — although the exact formula for their use seems to have changed between version 8 and version 9.
Variations on a Theme

Of course, I have picked one very special case — equality on a single-column non-unique index, where there are no nulls in the table — and treated it very simply. (I haven't even mentioned the relevance of the index blevel and clustering_factor yet.) There are numerous different strategies that Oracle uses to work out more general cases.

Consider some of the cases I have conveniently overlooked:

+ Multi-column indexes
+ Part-used multi-column indexes
+ Range scans
+ Unique indexes
+ Non-unique indexes representing unique constraints
+ Index skip scans
+ Index only queries
+ Bitmap indexes
+ Effects of nulls

The list goes on and on. There is no one simple formula that tells you how Oracle works out a cost — there is only a general guideline that gives you the flavour of the approach and a list of different formulae that apply in different cases.

However, the purpose of this article was to make you aware of the general approach and the two assumptions built into the optimiser's strategy. And I hope that this may be enough to take you a long way down the path of understanding the (apparently) strange things that the optimiser has been known to do.

March 11, 2010

How to Troubleshoot Bad Execution Plans

Filed under: [System Performance tuning] — Tags: , — zhefeng @ 11:36 am

A very good SQL tuning article from Greg Rahn.

Original Link:

One of the most common performance issues DBAs encounter is bad execution plans. Many try to resolve bad execution plans by setting optimizer related parameters or even hidden underscore parameters. Some even try to decipher a long and complex 10053 trace in hopes of finding an answer. While changing parameters or analyzing a 10053 trace might be useful for debugging at some point, I feel there is a much simpler way to start troubleshooting bad execution plans.

Verify The Query Matches The Business Question

This seems like an obvious thing to do, but I’ve seen numerous cases where the SQL query does not match the business question being asked. Do a quick sanity check verifying things like: join columns, group by, subqueries, etc. The last thing you want to do is consume time trying to debug a bad plan for an improperly written SQL query. Frequently I’ve found that this is the case for many of those “I’ve never got it to run to completion” queries.

What Influences The Execution Plan

I think it’s important to understand what variables influence the Optimizer in order to focus the debugging effort. There are quite a number of variables, but frequently the cause of the problem ones are: (1) non-default optimizer parameters and (2) non-representative object/system statistics. Based on my observations I would say that the most abused Optimizer parameters are:


Many see setting these as a solution to get the Optimizer to choose an index plan over a table scan plan, but this is problematic in several ways:

1. This is a global change to a local problem
2. Although it appears to solve one problem, it is unknown how many bad execution plans resulted from this change
3. The root cause of why the index plan was not chosen is unknown, just that tweaking parameters gave the desired result
4. Using non-default parameters makes it almost impossible to correctly and effectively troubleshoot the root cause

Object and system statistics can have a large influence on execution plans, but few actually take the time to sanity check them during triage. These statistics exist in views like:



As a first step of triage, I would suggest executing the query with a GATHER_PLAN_STATISTICS hint followed by a call to DBMS_XPLAN.DISPLAY_CURSOR. The GATHER_PLAN_STATISTICS hint allows for the collection of extra metrics during the execution of the query. Specifically, it shows us the Optimizer's estimated number of rows (E-Rows) and the actual number of rows (A-Rows) for each row source. If the estimates are vastly different from the actual, one probably needs to investigate why. For example: In the plan below, look at line 8. The Optimizer estimates 5,899 rows and the row source actually returns 5,479,000 rows. If the estimate is off by three orders of magnitude (1000x), chances are the plan will be sub-optimal. Do note that with Nested Loop Joins you need to multiply the Starts column by the E-Rows column to get the A-Rows values (see line 10).
select /*+ gather_plan_statistics */ ... from ... ;
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

------------------------------------------------------------------------------------------
| Id | Operation                               | Name        | Starts | E-Rows | A-Rows |
------------------------------------------------------------------------------------------
|  1 |  SORT GROUP BY                          |             |      1 |      1 |      1 |
|* 2 |   FILTER                                |             |      1 |        |  1728K |
|  3 |    NESTED LOOPS                         |             |      1 |      1 |  1728K |
|* 4 |     HASH JOIN                           |             |      1 |      1 |  1728K |
|  5 |      PARTITION LIST SINGLE              |             |      1 |   6844 |   3029 |
|* 6 |       INDEX RANGE SCAN                  | PROV_IX13   |      1 |   6844 |   3029 |
|  7 |      PARTITION LIST SINGLE              |             |      1 |   5899 |  5479K |
|* 8 |       TABLE ACCESS BY LOCAL INDEX ROWID | SERVICE     |      1 |   5899 |  5479K |
|* 9 |        INDEX SKIP SCAN                  | SERVICE_IX8 |      1 |   4934 |  5479K |
| 10 |     PARTITION LIST SINGLE               |             |  1728K |      1 |  1728K |
|*11 |      INDEX RANGE SCAN                   | CLAIM_IX7   |  1728K |      1 |  1728K |
------------------------------------------------------------------------------------------
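The "off by three orders of magnitude" observation for line 8 of the plan can be checked with a quick calculation:

```python
import math

# Order-of-magnitude gap between estimated and actual rows for plan
# line 8: E-Rows = 5,899 versus A-Rows = 5,479,000.
e_rows, a_rows = 5899, 5479000
magnitude_gap = math.log10(a_rows / e_rows)
print(round(magnitude_gap))  # ~3 orders of magnitude -- likely a sub-optimal plan
```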


Now that I’ve demonstrated how to compare the cardinality estimates to the actual number of rows, what are the debugging options? If one asserts that the Optimizer will choose the optimal plan if it can accurately estimate the number of rows, one can test using the not so well (un)documented CARDINALITY hint. The CARDINALITY hint tells the Optimizer how many rows are coming out of a row source. The hint is generally used like such:
select /*+ cardinality(a 100) */ * from dual a;

--------------------------------------------------------------------------
| Id | Operation         | Name | Rows | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|  0 | SELECT STATEMENT  |      |  100 |   200 |     2   (0)| 00:00:01 |
|  1 |  TABLE ACCESS FULL| DUAL |  100 |   200 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

In this case I told the Optimizer that DUAL would return 100 rows (when in reality it returns 1 row) as seen in the Rows column from the autotrace output. The CARDINALITY hint is one tool one can use to give the Optimizer accurate information. I usually find this the best way to triage a bad plan as it is not a global change; it only affects a single execution of a statement in my session. If luck has it that using a CARDINALITY hint yields an optimal plan, one can move on to debugging where the cardinality is being miscalculated. Generally the bad cardinality is the result of non-representative table/column stats, but it also may be due to data correlation or other factors. This is where it pays off to know and understand the size and shape of the data. If the Optimizer still chooses a bad plan even with the correct cardinality estimates, it's time to place a call to Oracle Support as more in-depth debugging is likely required.

Where Cardinality Can Go Wrong

There are several common scenarios that can lead to inaccurate cardinality estimates. Some of those on the list are:

1. Data skew: Is the NDV inaccurate due to data skew and a poor dbms_stats sample?
2. Data correlation: Are two or more predicates related to each other?
3. Out-of-range values: Is the predicate within the range of known values?
4. Use of functions in predicates: Is the 5% cardinality guess for functions accurate?
5. Stats gathering strategies: Is your stats gathering strategy yielding representative stats?

Some possible solutions to these issues are:

1. Data skew: Choose a sample size that yields accurate NDV. Use DBMS_STATS.AUTO_SAMPLE_SIZE in 11g.
2. Data correlation: Use Extended Stats in 11g. On releases before 11g, use a CARDINALITY hint if possible.
3. Out-of-range values: Gather or manually set the statistics.
4. Use of functions in predicates: Use a CARDINALITY hint where possible.
5. Stats gathering strategies: Use AUTO_SAMPLE_SIZE. Adjust only where necessary. Be mindful of tables with skewed data.
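Point 1 (data skew and sampling) is easy to demonstrate outside the database. The following is purely a simulation of the sampling effect on NDV, not of how dbms_stats computes it internally:

```python
import random

# Simulation: with heavily skewed data, a small sample can badly
# underestimate the number of distinct values (NDV).
random.seed(42)

# 200,000 rows: 95% are the value 0, the rest spread over 100,000 values.
data = [0 if random.random() < 0.95 else random.randrange(1, 100_001)
        for _ in range(200_000)]

true_ndv = len(set(data))
sample = random.sample(data, 2_000)   # a 1% sample
sampled_ndv = len(set(sample))

print(true_ndv, sampled_ndv)  # the 1% sample sees far fewer distinct values
```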

How To Best Work With Oracle Support

If you are unable to get to the root cause on your own, it is likely that you will be in contact with Oracle Support. To best assist the support analyst I would recommend you gather the following in addition to the query text:

2. SQLTXPLAN output. See Metalink Note 215187.1
3. 10053 trace output. See Metalink Note 225598.1
4. DDL for all objects used (and dependencies) in the query. This is best obtained as an expdp (Data Pump) export using CONTENT=METADATA_ONLY. This will also include the object statistics.
5. Output from: select pname, pval1 from sys.aux_stats$ where sname='SYSSTATS_MAIN';
6. A copy of your init.ora

Having this data ready before you even make the call (or create the SR on-line) should give you a jump on getting a quick(er) resolution.


While this blog post is not meant to be a comprehensive troubleshooting guide for bad execution plans, I do hope that it does help point you in the right direction the next time you encounter one. Many of the Optimizer issues I’ve seen are due to incorrect cardinality estimates, quite often due to inaccurate NDV or the result of data correlation. I believe that if you use a systematic approach you will find that debugging bad execution plans may be as easy as just getting the cardinality estimate correct.
