Monday, November 9, 2015

Oracle RMAN Catalog Resync: Out of nowhere...ORA-00001: unique constraint (RMAN.DF_U2) violated (FIXED)

We use the RMAN catalog pretty heavily.  It's been working great without too much issue.  However, out of nowhere this weekend, we started getting the following error when one of our databases tried re-syncing with the catalog:

ORA-00001: unique constraint (RMAN.DF_U2) violated

First, as usual, some details:

RDBMS Version: 11.2.0.4.7
RDBMS OS Version: AIX 6.1

RMAN Schema Version: 12.1.0.2
RMAN Catalog OS: OEL6, RH Compatible Kernel
RMAN Catalog RDBMS Version: 11.2.0.4.7


The way I have my backups running is they first back up to disk, then, back up to SBT_TAPE.  These steps are done in NOCATALOG mode.  After the backups are done, the database makes a new connection to the RMAN catalog and does a resync.  I choose to do my backups this way so if the catalog is ever not available for some reason, it won't cause the backup script to error out upon connection to the catalog.  If a resync fails, no big deal, it'll catch up next time.  I just get an email warning me of the error.

So, to fix this, the first place I went was to MOS.  I found a bug that matched this error but it was supposed to be fixed in 11g.

Oracle Support Document 434398.1 (Implicit Resync Of Recovery Catalog Failed -ORA-00001: unique constraint DF_U2 violated) can be found at: https://support.oracle.com/epmos/faces/DocumentDisplay?id=434398.1


I, of course, was past this version.  I hate working with MOS and I hated the fact that their "workaround" was to just delete the constraint.  Don't get me wrong, I don't mind updating things in the database realm when Oracle asks me to do so.  However, this came with no further explanation...just drop the constraint.  What happens if I upgrade the catalog in future release?  Where's the patch to do the DROP?  Why have me do it?  I just thought it was sloppy. Especially since I was already beyond the version in which Oracle said it was fixed. It's like "Oh, that piano doesn't fit through the doorway?  Just saw off the legs and throw them away".

I thought I would try something quick myself first.  I decided I would try un-registering and re-registering the database from the catalog.

So, before you try this, a quick word of warning...if you unregister the database from the catalog, and, your database parameter control_file_record_keep_time is shorter than your backup retention, you run the risk of losing backup metadata.  This is not the end of the world but can be a huge inconvenience if you happen to need that information.  You would have to manually register your backup pieces into the catalog before you could use them.  In all reality, I think for most people, this is a highly unlikely situation...but, I just thought I would mention it.  I always keep my control_file_record_keep_time a couple days longer than my RMAN retention settings just in case.

So, the process is pretty simple.

Connect the target and the catalog:

rman target / catalog rcat_owner@rcvcat

Of course, substitute your own credentials above...

RMAN> unregister database;

--it'll ask if you are sure...type YES

After the target successfully unregisters, I like to log out of the catalog and wait a few minutes before re-registering.  This might just be me being crazy, but, I swear in the past when unregistering and re-registering targets I sometimes got an error if I did it in the same session immediately back to back...maybe that was a bug in years past...who knows.

In any case, log out and back into your target/catalog and type:

RMAN> register database;

Might take a minute...but, when you are done, you should be good to go...at least I was...YMMV.  Good luck!