Tuesday, December 20, 2011

RMAN-06026: some targets not found - RMAN-06024: no backup or copy of the control file found

Here is an interesting situation I ran into, and I have not quite figured out the why yet. It has to do with duplicating a database with RMAN on version 11gR2 (11.2.0.2).

Background: Every six months at one of my clients, we do a tape restore recovery test. Normally, we back up the database to the FRA, then remove the backups from disk every couple of weeks. Nightly, the backupsets are copied to tape. To perform the test, I choose a backup period from before we last removed the backups from disk. Therefore, I need to have the sys admin restore from tape to the FRA. Once that is done, I run "CATALOG START WITH" to register the backup pieces with RMAN again.

The way we have the backups configured is a weekly level 0 and daily level 1s. To do the recovery test, I tell the SA to restore the level 0 directory (date) and the level 1 directory (date) as well as the corresponding control file autobackup directories.

Then I choose a time that falls right in between the two of them.

For example, this time, my level 0 completed at 11:23 PM on 11/20, and my level 1 completed at 6:45 AM on 11/21. So my recovery time was set to 3:45 AM on 11/21. I then use this setup to duplicate the database to a new host with a new database name. Of course, since this is point-in-time, I have to copy the backups to the new machine.
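
As a rough illustration of the time choice described above, here is a minimal Python sketch (not RMAN) that checks a candidate recovery time falls between the two backup completion times and formats it the way the SET UNTIL TIME line in this post expects. The helper name until_time_clause is hypothetical, and the strict "in between" check simply mirrors how I pick the time for this test; it is an assumption for illustration, not an RMAN requirement.

```python
from datetime import datetime


def until_time_clause(level0_done, level1_done, target):
    """Build an RMAN SET UNTIL TIME line for a target between two backups.

    Hypothetical helper: the name and the strict "in between" check are
    assumptions for illustration, not something RMAN itself enforces.
    """
    if not (level0_done < target < level1_done):
        raise ValueError("target time is not between the level 0 and level 1")
    stamp = target.strftime("%Y-%m-%d:%H:%M:%S")
    return f"set until time = \"to_date('{stamp}','YYYY-MM-DD:HH24:MI:SS')\";"


# Times from this post: level 0 done 11:23 PM on 11/20,
# level 1 done 6:45 AM on 11/21, recovery time 3:45 AM on 11/21.
clause = until_time_clause(
    datetime(2011, 11, 20, 23, 23),
    datetime(2011, 11, 21, 6, 45),
    datetime(2011, 11, 21, 3, 45),
)
print(clause)
```

A time outside the two completion times raises ValueError instead of producing a line.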

On the new machine, I configure the TNS, a dedicated listener, and an init.ora...

I start up the AUXILIARY in nomount mode...

From the TARGET machine, I run:

rman target / auxiliary sys@AUXDUP nocatalog

Here is my script:

run {
set until time = "to_date('2011-11-21:03:45:00','YYYY-MM-DD:HH24:MI:SS')";
duplicate target database to AUXDUP
DB_FILE_NAME_CONVERT=('/ora01/oradata/PROD/','/ora01/oradata/AUXDUP/')
LOGFILE
group 1 ('/ora01/oradata/AUXDUP/redo01a.log') size 256M reuse,
group 2 ('/ora01/oradata/AUXDUP/redo02a.log') size 256M reuse,
group 3 ('/ora01/oradata/AUXDUP/redo03a.log') size 256M reuse
nofilenamecheck;
}

Today, when I ran this, I received the following error:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 12/20/2011 11:25:47
RMAN-05501: aborting duplication of target database
RMAN-03015: error occurred in stored script Memory Script
RMAN-06026: some targets not found - aborting restore
RMAN-06024: no backup or copy of the control file found to restore


Now, there are many reasons that you will find on Metalink and the internet for this error. None of those things applied to me. Most of them centered around duplicating for standby.

What makes it particularly confusing is that many of the resources you find will say this error is not really about the controlfile but about the times you choose and backup availability. I knew my time was good and backups were all in place. My problem WAS with the controlfile.

Essentially, this error is telling you it cannot find a copy of the controlfile to restore. I know for a fact I restored and cataloged my controlfile backups as well as copied them to the new host.

However, when I tried to list my controlfile backups, I got:

RMAN> list backup of controlfile completed before '22-NOV-2011';

specification does not match any backup in the repository

What? I know I cataloged them.

Let's catalog them again, just to be sure:

RMAN> catalog start with '/ora01/flash_recovery_area/PROD';

searching for all files that match the pattern /ora01/flash_recovery_area/PROD

List of Files Unknown to the Database
=====================================
File Name: /ora01/flash_recovery_area/PROD/autobackup/2011_11_20/o1_mf_s_767748081_7dmng414_.bkp
File Name: /ora01/flash_recovery_area/PROD/autobackup/2011_11_20/o1_mf_s_767748205_7dmnkzny_.bkp
File Name: /ora01/flash_recovery_area/PROD/autobackup/2011_11_21/o1_mf_s_767774019_7dnfroro_.bkp
File Name: /ora01/flash_recovery_area/PROD/autobackup/2011_11_21/o1_mf_s_767774118_7dnfvrhl_.bkp

Do you really want to catalog the above files (enter YES or NO)? YES
cataloging files...
cataloging done

List of Cataloged Files
=======================
File Name: /ora01/flash_recovery_area/PROD/autobackup/2011_11_20/o1_mf_s_767748081_7dmng414_.bkp
File Name: /ora01/flash_recovery_area/PROD/autobackup/2011_11_20/o1_mf_s_767748205_7dmnkzny_.bkp
File Name: /ora01/flash_recovery_area/PROD/autobackup/2011_11_21/o1_mf_s_767774019_7dnfroro_.bkp
File Name: /ora01/flash_recovery_area/PROD/autobackup/2011_11_21/o1_mf_s_767774118_7dnfvrhl_.bkp

So, you can see, I cataloged them again with no specific errors.

Try listing them:

RMAN> list backup of controlfile completed before '22-NOV-2011';

specification does not match any backup in the repository

So, for some reason, RMAN says it cataloged the files, but they still do not show up as controlfile backups.

I checked my control_file_record_keep_time (not sure that this would make a difference) and it was set to 60 days, so my month-old backups were well within it.

After a short search, I could find no evidence that this was a reported bug or that others had run into it before.

How about if I try listing the specific pieces?

RMAN> list backuppiece '/ora01/flash_recovery_area/PROD/autobackup/2011_11_21/o1_mf_s_767774019_7dnfroro_.bkp';

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of list command at 12/20/2011 11:26:33
RMAN-20260: backup piece not found in the repository
RMAN-06092: error while looking up backup piece

Interesting...clearly it is not recognized by my backup metadata.

How about if I try cataloging the piece itself?

RMAN> catalog backuppiece '/ora01/flash_recovery_area/PROD/autobackup/2011_11_21/o1_mf_s_767774019_7dnfroro_.bkp';

cataloged backup piece
backup piece handle=/ora01/flash_recovery_area/PROD/autobackup/2011_11_21/o1_mf_s_767774019_7dnfroro_.bkp RECID=42367 STAMP=770383607

RMAN> list backuppiece '/ora01/flash_recovery_area/PROD/autobackup/2011_11_21/o1_mf_s_767774019_7dnfroro_.bkp';


List of Backup Pieces
BP Key BS Key Pc# Cp# Status Device Type Piece Name
------- ------- --- --- ----------- ----------- ----------
42367 17202 1 1 AVAILABLE DISK /ora01/flash_recovery_area/PROD/autobackup/2011_11_21/o1_mf_s_767774019_7dnfroro_.bkp


Now we are getting somewhere...

Let me catalog the rest of the pieces...

Relist (backup this time, not backuppiece):

RMAN> list backup of controlfile completed before '22-NOV-2011';

using target database control file instead of recovery catalog

List of Backup Sets
===================


BS Key Type LV Size Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ --------------------
17202 Full 32.23M DISK 00:00:00 21-NOV-2011 06:33:39
BP Key: 42367 Status: AVAILABLE Compressed: NO Tag: TAG20111121T063339
Piece Name: /ora01/flash_recovery_area/PROD/autobackup/2011_11_21/o1_mf_s_767774019_7dnfroro_.bkp
Control File Included: Ckp SCN: 30756382041 Ckp time: 21-NOV-2011 06:33:39

BS Key Type LV Size Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ --------------------
17203 Full 32.23M DISK 00:00:00 21-NOV-2011 06:35:18
BP Key: 42368 Status: AVAILABLE Compressed: NO Tag: TAG20111121T063518
Piece Name: /ora01/flash_recovery_area/PROD/autobackup/2011_11_21/o1_mf_s_767774118_7dnfvrhl_.bkp
Control File Included: Ckp SCN: 30756382106 Ckp time: 21-NOV-2011 06:35:18

BS Key Type LV Size Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ --------------------
17204 Full 32.23M DISK 00:00:00 20-NOV-2011 23:21:21
BP Key: 42369 Status: AVAILABLE Compressed: NO Tag: TAG20111120T232121
Piece Name: /ora01/flash_recovery_area/PROD/autobackup/2011_11_20/o1_mf_s_767748081_7dmng414_.bkp
Control File Included: Ckp SCN: 30754959584 Ckp time: 20-NOV-2011 23:21:21

BS Key Type LV Size Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ --------------------
17205 Full 32.23M DISK 00:00:00 20-NOV-2011 23:23:25
BP Key: 42370 Status: AVAILABLE Compressed: NO Tag: TAG20111120T232325
Piece Name: /ora01/flash_recovery_area/PROD/autobackup/2011_11_20/o1_mf_s_767748205_7dmnkzny_.bkp
Control File Included: Ckp SCN: 30754964327 Ckp time: 20-NOV-2011 23:23:25

Voila!

So, in conclusion, I am not sure if this is a bug or what, but I most certainly did not have to go through this before we upgraded our 10.2.0.4 database to 11.2.0.2 a couple of months ago.
Apparently, I have to specifically catalog each controlfile autobackup piece with "CATALOG BACKUPPIECE" for the duplicate to work.

BTW, this did not apply to the level 0s and level 1s. They cataloged just fine.

Upon doing this, my duplicate ran perfectly.
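
If there are more than a handful of autobackup pieces, typing one "catalog backuppiece" command per file gets tedious. The following is a minimal Python sketch, not what I actually ran, that walks an autobackup directory and prints one command per .bkp file so they can be pasted into the RMAN session. The function name catalog_commands is hypothetical, and the sketch assumes the autobackup pieces all end in .bkp, as they do in the listings above.

```python
from pathlib import Path


def catalog_commands(autobackup_root):
    """Return one RMAN CATALOG BACKUPPIECE command per .bkp file.

    Hypothetical helper: it only builds command text and never talks
    to RMAN. Files are found recursively and returned in sorted order.
    """
    pieces = sorted(Path(autobackup_root).rglob("*.bkp"))
    return [f"catalog backuppiece '{piece}';" for piece in pieces]


# Example use (the path is the autobackup directory from this post):
#   for cmd in catalog_commands("/ora01/flash_recovery_area/PROD/autobackup"):
#       print(cmd)
```

The script only generates text, so the commands can be reviewed before anything is run against the target.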

4 comments:

Unknown said...

Kudos to Chris!

Your blog post lit my path to the resolution of a similar duplication issue (a RAC database where control file snaps were left locally on all cluster instances): thank you.

In 10g I remember having to explicitly crosscheck and delete obsolete backup pieces one at a time to get rid of RMAN errors claiming it couldn't find them anymore (while they were clearly catalogued).

Regards,
Erwin Geeraerts

S. Zydek said...

This was hugely helpful. Thank you for posting!!

Unknown said...

We had a similar experience with two databases and two different resolutions. For one, I was able to duplicate once I backed up the archivelogs again and copied the backed-up controlfile to the server being duplicated to. For the other instance, we could find the backup files, but the controlfile backup had been purged. No one seems to know how or why. We ended up using the targetless method, with no connection to the target or a recovery catalog, and pointed the duplication RMAN session to where the backup files were located. Strange. We are still trying to figure out what happened.