Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 Data Corruption Issues
 Index Fail Error 644

Author  Topic 

Westley
Posting Yak Master

229 Posts

Posted - 2006-01-19 : 00:16:00
Guys,
We just got an Error 644 on our DB, the error was:

Could not find the index entry for RID '169bc61301c3950000209e1a0002000000030000' in index page (1:1683508), index ID 0, database 'TR'..

Error: 644, Severity: 21, State: 3

We run dbcc CheckDB on the TR DB with no error:

CHECKDB found 0 allocation errors and 0 consistency errors in database 'TR'.

I read JohnSharp post, and tried the script from Paul:

DBCC CHECKTABLE (sysindexes) WITH NO_INFOMSGS, ALL_ERRORMSGS

it return no erros as well, anyone got any more info on whats this error is?

At the time of this happen, its seems that a process is running, which run some stored proc which will create some temp tables, will that message be relating to some index that created on the tempdb (from the temp tables) rather then the index on the TR DB? Since we can't even tell which table its affected, is there any way to find out?

Thanks.

paulrandal
Yak with Vast SQL Skills

899 Posts

Posted - 2006-01-19 : 00:26:13
Man, you're not having the best time if you're back in this forum again.

The error means that a query found a non-clustered index record and was going back to the heap to get some more data but the physical row-locator stored in the NC index record did not exist in the heap.

The severity of the error indicates that you weren't using a NOLOCK scan or read-uncommitted serialization mode. Is that correct? 644s in these circumstances are usually converted to sev 12 errors.

Can you do a DBCC PAGE (TR, 1, 1683508, 3)?

Thanks

Paul Randal
Lead Program Manager, Microsoft SQL Server Storage Engine
(Legalese: This posting is provided "AS IS" with no warranties, and confers no rights.)
Go to Top of Page

Westley
Posting Yak Master

229 Posts

Posted - 2006-01-19 : 00:44:38
Thanks Paul, Just did the DBCC PAGE (TR, 1, 1683508, 3) and nothing return, guess the page doesn't exists anymore, since the dbcc checkdb returns nothing as well, but I would like to know if that can be some kind of a temp index or something that cause this? or will that be a phyiscal table index? and will there be a way to find out which table its referring to when u look at that rid?
We can not reproduce the error again as we try to run that same stored proc, everything seems to be fine, any ideas?
Go to Top of Page

Westley
Posting Yak Master

229 Posts

Posted - 2006-01-19 : 00:49:32
By the way, I'm actually on the other side of the world, so its not that late here compare to yours :) I actually though I won't get a reply from you until tomorrow. Thanks for getting back to me, and hope that won't take up too much of your sleep time :)

Go to Top of Page

paulrandal
Yak with Vast SQL Skills

899 Posts

Posted - 2006-01-19 : 02:01:53
Not that late here - only 11pm and I subscribe to this forum so I get emailed when anyone posts.

There's no way to tell from the RID. The only way to tell is if the page still exists.

Ah - do a DBCC TRACEON (3604) before doing the DBCC PAGE. You should get some output. Even if the page is deallocated, DBCC PAGE will still display contents. Only if the page is off the end of the file will it not show anything, but it will print an error message.

Thanks

Paul Randal
Lead Program Manager, Microsoft SQL Server Storage Engine
(Legalese: This posting is provided "AS IS" with no warranties, and confers no rights.)
Go to Top of Page

Westley
Posting Yak Master

229 Posts

Posted - 2006-01-19 : 02:45:32
ah, yeah, forgot about 3604, here it is:


PAGE: (1:1683508)
-----------------

BUFFER:
-------

BUF @0x01A16580
---------------
bpage = 0x68C1E000 bhash = 0x00000000 bpageno = (1:1683508)
bdbid = 6 breferences = 0 bstat = 0x9
bspin = 0 bnext = 0x00000000

PAGE HEADER:
------------

Page @0x68C1E000
----------------
m_pageId = (1:1683508) m_headerVersion = 1 m_type = 1
m_typeFlagBits = 0x0 m_level = 0 m_flagBits = 0x24
m_objId = 1872777779 m_indexId = 0 m_prevPage = (1:1683507)
m_nextPage = (1:1683509) pminlen = 101 m_slotCnt = 75
m_freeCnt = 71 m_freeData = 7971 m_reservedCnt = 0
m_lsn = (30473:6896:2) m_xactReserved = 0 m_xdesId = (0:99144427)
m_ghostRecCnt = 0 m_tornBits = 0

Allocation Status
-----------------
GAM (1:1533696) = ALLOCATED
SGAM (1:1533697) = NOT ALLOCATED
PFS (1:1682304) = 0x40 ALLOCATED 0_PCT_FULL
DIFF (1:1533702) = NOT CHANGED
ML (1:1533703) = NOT MIN_LOGGED

anything meaningful?
I don't really get how to read this page info, something you can teach us?
Thanks
Go to Top of Page

Westley
Posting Yak Master

229 Posts

Posted - 2006-01-19 : 02:59:27
Hey Paul,
Just looking at that page info, there is a m_objId field, that object ID is actually one of the table in the stored proc, is that mean there is some "possible" corruption in that table's index? But then is that possible that the "error" page has already been dropped and got recreate with other information with the same page number? so I'm looking at something totally not related to the issue I had before?

Thanks,


oh, only 11pm? but the time of the post shows 2am :)
Go to Top of Page

JohnSharp
Starting Member

4 Posts

Posted - 2006-01-19 : 06:26:59
If it's any help, the way I found which table was broken was to read the sproc and try doing a select * on the tables, the broken one cause SQL to have a big problem and terminate my connection. Then copy the data out, kill the table, copy it back in, job done. Oh, restart the server at some point to helps. It likes a little lie down in the dark.
Go to Top of Page

paulrandal
Yak with Vast SQL Skills

899 Posts

Posted - 2006-01-19 : 14:11:36
To John's post: Restarting the server is probably a heavy-handed way of flushing the buffer pool contents if there's an in-memory-but-not-on-disk corrupted page.

Westley - this table has a clustered index, right? Its a logical RID (based on cluster keys) rather than a physical RID (my previous explanation was a little lacking - late at night, several glasses of wine )

Do you have the situation described in http://support.microsoft.com/kb/822747 ?

If not, you'll need to generate a call stack when it occurs again - by starting the server with -y644 (print a stackdump to the errorlog when error 644 is raised)

Thanks


Paul Randal
Lead Program Manager, Microsoft SQL Server Storage Engine
(Legalese: This posting is provided "AS IS" with no warranties, and confers no rights.)
Go to Top of Page

Westley
Posting Yak Master

229 Posts

Posted - 2006-01-19 : 20:35:59
Thanks guys,
To John: restarting is not really an option here, as dbcc can't find anything, so we are pretty safe, just would like to find out why and what caused it.

Paul, yes, that table does have a clustered index, so the objid shown in the dbcc page is the table in question? If so, I will try to search more about that table to see if there are anything interesting there.
As for the kb, I don't see any column using Latin1_General_BIN, so don't think it'll be the case, with the stackdump, if we do get it, is that something you can help me to look at? or we should pass that to MS PSS? Also, I remember we can set that flag on the fly without a s restart isn't it? something like a traceflag (644) or something?

Thanks
Go to Top of Page

paulrandal
Yak with Vast SQL Skills

899 Posts

Posted - 2006-01-20 : 14:12:06
In SQL2k, the objid on the page is always the object ID of the table. This is not the case in SQL 2005.

Yes, if you get a stackdump I can look at it.

There's a undocumented way to turn on dumping for a particular error. I'll PM you the command to use.

Thanks

Paul Randal
Lead Program Manager, Microsoft SQL Server Storage Engine
(Legalese: This posting is provided "AS IS" with no warranties, and confers no rights.)
Go to Top of Page
   

- Advertisement -