Category Archives: RMAN

Multiplex controlfiles with OMF

Even on the latest Oracle Engineered Systems like ODA X6-2S/M there’s only one single controlfile created by default. Despite the storage redundancy there’s always a risk of someone accidentally deleting a file or some logical corruption of the file.
Therefore, we always multiplex the controlfiles and online redo logs. But, what is the easiest way of adding a second controlfile to a database using OMF on a file system?

I think it’s the following procedure. Although, I’m gladly proven wrong with a even more elegant solution.

The command output has been stripped to the relevant information.

$ . oraenv
ORACLE_SID = [DUMMY] ? DEV1

$ rman target /


-- see, there is only one online log destination
RMAN> select name, value from v$parameter where name like 'db_create_online_log_dest_%';

NAME                                   VALUE
-------------------------------------- --------------------------------------
db_create_online_log_dest_1            /u03/app/oracle/redo/
db_create_online_log_dest_2
db_create_online_log_dest_3
db_create_online_log_dest_4
db_create_online_log_dest_5


-- set a second online log destination
RMAN> alter system set db_create_online_log_dest_2 = '/u02/app/oracle/oradata/DEV1' scope=both;


-- get the current controlfile path for later reference in commands
RMAN> select value from v$parameter where name = 'control_files';

VALUE
--------------------------------------------------------------------------------
/u03/app/oracle/redo/DEV1/controlfile/o1_mf_cwxjo46k_.ctl


-- this will make RMAN "restore" command create new controlfiles in all online log destinations
RMAN> alter system reset control_files;


-- restart instance so parameter changes will take effect
RMAN> shutdown immediate


-- only start in NOMOUNT as to be able to restore controlfile
RMAN> startup nomount


-- this will create two new OMF controlfiles based on existing one
--   remember, we have reset control_files parameter
RMAN> restore controlfile from '/u03/app/oracle/redo/DEV1/controlfile/o1_mf_cwxjo46k_.ctl';
...
channel ORA_DISK_1: copied control file copy
output file name=/u03/app/oracle/redo/DEV1/controlfile/o1_mf_cxf4q747_.ctl
output file name=/u02/app/oracle/oradata/DEV1/DEV1/controlfile/o1_mf_cxf4q74v_.ctl
Finished restore at 12-SEP-16


-- use the two newly created controlfiles at next startup
RMAN> alter system set control_files = '/u03/app/oracle/redo/DEV1/controlfile/o1_mf_cxf4q747_.ctl','/u02/app/oracle/oradata/DEV1/DEV1/controlfile/o1_mf_cxf4q74v_.ctl' scope=spfile;


-- the old controlfile is not needed anymore
RMAN> host "rm /u03/app/oracle/redo/DEV1/controlfile/o1_mf_cwxjo46k_.ctl";

RMAN> startup force

Since I’m using Oracle 12c RMAN is capable of running SQL just like this…no need to change between SQL*Plus and RMAN anymore.

(Please don’t debate the file system structure, it’s from an engineered system. I think it’s horrible!)

RMAN in FIRST_ROWS hell

A short while back I was doing a database upgrade/migration from 11.2.0.1 to 12.1.0.2 to a new server. To keep the downtime of the 850GB big database short I used Transportable Tablespaces together with incrementally updated backups where the datafile copies are placed on the target machine’s storage using a shared mount. So, during the downtime (as soon as the Tablespaces are set READ ONLY) all there’s left is doing the last inc 1 backup and the meta data export / import. Everything went fine on all test databases which were freshly cloned from production.
But then, around comes time for production. I started taking the inc 0 datafile copy backups a few days before. Eight long hours later I was ready to take inc 1 backups from time to time and apply them on the datafile copies. This is where all the good plans went south. Every time the “BACKUP INCREMENTAL LEVEL 1 FOR RECOVER OF COPY WITH TAG…” command ran it took about 23 seconds before RMAN actually started taking the backup. During the production downtime this is probably fine if there’s only 5 datafiles. Our database had more than 50 datafiles. And, you have to account for 50 x 23 seconds for the “RECOVER COPY OF DATAFILE…” command as well as the same problem applies there, too. Clearly, this issue needed resolving before the production downtime.

My investigation showed that above RMAN commands trigger the calling of DBMS_RCVMAN.getRcvRec which in turn calls DBMS_RCVMAN.getDataFileCopy. In there are some complex queries involving V$BACKUPSET, V$DATAFILE_COPY, etc. and one of these queries did have a very bad execution plan. First I thought there might be a problem with the object statistics on the underlying X$ tables, namely X$KCCBF, X$KCCBP and X$KCCBC. OK, the stats were somewhat stale so I gathered them for the involved X$ tables. Unfortunately, this didn’t solve the problem of the bad execution plan. Then, I remembered that during the initial analysis of the database I noticed that OPTIMIZER_MODE was set to FIRST_ROWS on the instance level (for whatever reason the SW vendor claimed this was best). Of course, this setting also affected RMAN. As the database was still fully productive I couldn’t just change the parameter to ALL_ROWS. Setting up a login trigger for RMAN seemed too intrusive. The solution was simple: run an ALTER SESSION at the start of the RMAN session and all is fine…

sql "alter session set optimizer_mode = ALL_ROWS";

Btw., on the new 12.1.0.2 database the application runs just perfectly with ALL_ROWS 😉