How to find and fix block corruption using RMAN?

One of the scenario we ran into when one of the data file reported there was block corruption as RMAN reported the following message in its logs. Note: This scenario was done on 9.2.0.1 running Linux.

RMAN-03009: failure of backup command on ch1 channel at 07/02/2009 04:27:06
ORA-19566: exceeded limit of 0 corrupt blocks for file /u01/oradata/TESTDB/TEST_data_01.dbf

And the alert.log (alert_TESTDB.log) also had the following message by RMAN
Reread of blocknum=443343, file=/u01/oradata/TESTDB/TEST_data_01.dbf. found same corrupt data
***
Corrupt block relative dba: 0x0346c3cf (file 13, block 443343)
Bad check value found during backing up datafile
Data in bad block –
type: 6 format: 2 rdba: 0x0346c3cf
last change scn: 0x0000.32a8f165 seq: 0x1 flg: 0x04
consistency value in tail: 0xf1650601
check value in block header: 0xeb4f, computed block checksum: 0xec16
spare1: 0x0, spare2: 0x0, spare3: 0x0

So to double check one can ran dbverify (dbv) or RMAN to validate the datafile which confirmed data corruption and the datafile# and the block#.
$ dbv blocksize=8192 file=/u01/oradata/TESTDB/TEST_data_01.dbf > /tmp/TEST_data_01_file.log 2>&1

DBVERIFY – Verification starting : FILE = /u01/oradata/TESTDB/TEST_data_01.dbf
Page 443343 is marked corrupt
***
Corrupt block relative dba: 0x0346c3cf (file 13, block 443343)
Bad check value found during dbv:
Data in bad block –
type: 6 format: 2 rdba: 0x0346c3cf
last change scn: 0x0000.32a8f165 seq: 0x1 flg: 0x04
consistency value in tail: 0xf1650601
check value in block header: 0xeb4f, computed block checksum: 0xec16
spare1: 0x0, spare2: 0x0, spare3: 0x0
***

DBVERIFY – Verification complete

Total Pages Examined : 486400
Total Pages Processed (Data) : 473439
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing (Index): 0
Total Pages Processed (Other): 10
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 12950
Total Pages Marked Corrupt : 1
Total Pages Influx : 0

And using RMAN one can validate the same thing by checking the view v$database_block_corruption after running the script below.

RMAN> connect target /

RMAN> backup validate check logical database;

It reported the same block as show below:

SQL> SELECT * FROM v$database_block_corruption;

FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO
———- ———- ———- —————— ———
13 443343 1 0 FRACTURED

After it was confirmed there was only one datafile with one block corruption using block recover we were able to recover the datafile without shutting down the database. The channel allocated was the same that was specified when backing up the data file.

connect target /
connect catalog rmancatalog/rmancatalog@catalog
run {
allocate channel…;
blockrecover datafile 13 block 443343;
release channel …;
}

Log file:
Starting blockrecover at 02-JUL-09

channel ch1: restoring block(s)
channel ch1: specifying block(s) to restore from backup set
restoring blocks of datafile 00013
channel ch1: restored block(s) from backup piece 1
piece handle=g0kirlhe_1_1 tag=TAG20090630T040048 params=NULL
channel ch1: block restore complete

starting media recovery

archive log thread 1 sequence 30644 is already on disk as file /u01/oradata/TESTDB/arch/1_30644.dbf
archive log thread 1 sequence 30645 is already on disk as file /u01/oradata/TESTDB/arch/1_30645.dbf
channel ch1: starting archive log restore to default destination
channel ch1: restoring archive log
archive log thread=1 sequence=30643
channel ch1: restored backup piece 1
piece handle=hlkj27vr_1_1 tag=TAG20090701T160827 params=NULL
channel ch1: restore complete
media recovery complete
Finished blockrecover at 02-JUL-09
released channel: ch1

Recovery Manager complete.

And after the block was restored we validated the datafile through RMAN and v$database_block_corruption reported no records found.

RMAN> connect target /
RMAN> backup validate check logical datafile 13;

Starting backup at 02-JUL-09
using target database controlfile instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=22 devtype=DISK
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00013 name=/u01/oradata/TESTDB/TEST_data_01.dbf
channel ORA_DISK_1: backup set complete, elapsed time: 00:07:35
Finished backup at 02-JUL-09

sys@TESTDB> select * FROM v$database_block_corruption;

no rows selected

2 thoughts on “How to find and fix block corruption using RMAN?

  1. Amin,

    I have got a question looking on the “blockrecover datafile 13 block 443343;” command log file.
    It looks like Oracle reads backups to retrieve last know “good” block state from the backups and apply ALL archivelogs since backup were taken.
    Is it right?
    If it is does it means that the time and resources to recover one block is proportion of restoring ALL archovelogs and applying relative vectors to one block? It seams like a bit of resource intensive operation to me and in real world scenarios this might be long running challenging process.

    Yury

    • Hi Yury,

      Yes, RMAN restored the block from the last backup and then RMAN applied the archive logs. I think it would be faster to restore one block instead of restoring the whole datafile as once that block is found it wouldn’t need to restore rest of the datafile. Oracle suggests using block media recovery is not useful in cases where the extent of data loss or corruption is not known; in this case, use datafile recovery instead.

      Here is the link from Oracle on Performing Block Media Recovery with RMAN
      http://download.oracle.com/docs/cd/B19306_01/backup.102/b14191/rcmrecov.htm#i1006577

      Take care
      amin

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s