| Author |
Topic  |
|
|
shishirkhandekar
Starting Member
17 Posts |
Posted - 01/30/2011 : 18:06:04
|
Just today we were testing a failover of our databases using the DoubleTake software. Once the primary server was switched off and the secondary brought up, one of our databases did not come online. The state_desc for the database was "Recovery Pending". When I tried to restore from a database backup the Doubletake had replicated, the header record showed something interesting.
The backup name column (Name column in the RESTORE gui) was *** INCOMPLETE *** and all the other columns were empty.
My feeling is that the backup file did not, for some reason, replicate correctly.
Is there something more to this than what I am thinking?
Also we received a Severity 24 alert that said:
SQL Server detected a logical consistency-based I/O error: incorrect pageid (expected 1:32; actual 0:0). It occurred during a read of page (1:32) in database ID 9 at offset 0x00000000040000 in file <filename.mdf>. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
How do you recover from such scenarios or how do you fix them?
Thanks for any insight.
Shishir Khandekar |
Edited by - shishirkhandekar on 01/30/2011 18:15:55
|
|
|
tkizer
Almighty SQL Goddess
USA
35007 Posts |
|
|
shishirkhandekar
Starting Member
17 Posts |
Posted - 01/30/2011 : 20:11:17
|
Tara,
Yes, DBCC CHECKDB ran without any errors on the source database. On the target, I could not do anything as it was in recovery pending and the only thing that was possible was to put it in EMERGENCY mode and then running DBCC. But since this was a failover test and we had some indications of incomplete replications, we did not try to recover.
Shishir |
Edited by - shishirkhandekar on 01/30/2011 20:11:59 |
 |
|
|
GilaMonster
Flowing Fount of Yak Knowledge
South Africa
4507 Posts |
Posted - 01/31/2011 : 00:33:07
|
How you recover from scenarios like you describe is typically restore from a good backup.
If the replica is damaged and the source isn't then either something went wrong in replication or there's something wrong with the IO subsystem on the secondary.
-- Gail Shaw SQL Server MVP |
 |
|
|
shishirkhandekar
Starting Member
17 Posts |
Posted - 01/31/2011 : 09:16:37
|
Thank you, Gail. We are still trying to figure out what went wrong and the suspects, as per the vendor support, are disk fragmentation and/or bad I/O driver causing memory leaks and therefore insufficient resources.
So in short what I understand is that irrespective of how we reached this state, in case of "incorrect page ID" error, the only way out is:
1) EMERGENCY MODE DBCC with REPAIR_ALLOW_DATA_LOSS 2) Recover from a good backup.
If anyone has any more insight, it would be much appreciated.
Thank you Shishir Khandekar |
 |
|
|
GilaMonster
Flowing Fount of Yak Knowledge
South Africa
4507 Posts |
Posted - 01/31/2011 : 09:53:45
|
Your options should be the other way around. Restore from backup is by far the recommended. Emergency mode repair is a last resort.
Disk fragmentation wouldn't cause this. Bad IO driver could, though probably not through memory leak, through incorrectly writing data to the disk or changing it later. SAN controller, SAN cache, HBA drivers, any IO filter drives, actual disk hardware, etc.
-- Gail Shaw SQL Server MVP |
 |
|
| |
Topic  |
|