Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 Data Corruption Issues
 Query "stops" - Corruption?

Author  Topic 

Kristen
Test

22859 Posts

Posted - 2011-12-08 : 06:10:28
Schedule Sproc "stops" half way though. (Been running for months without issue)

Running it manually suggests it is failing on a particular DELETE. I changed the DELETE to a SELECT COUNT(*) and ran that one statement manually and got:

The command(s) completed successfully.

surely I should get a row output from SELECT COUNT(*)?

Query is

RAISERROR (N'Delete Start', 10, 1) WITH NOWAIT

DELETE D
-- SELECT COUNT(*)
FROM dbo.MyTable AS D
JOIN ##TEMP_MyTable AS S
ON S.MyPK1 = D.MyPK1 COLLATE SQL_Latin1_General_CP1_CI_AS
AND S.MyPK2 = D.MyPK2
WHERE
(
(S.MyCol3 <> D.MyCol3 COLLATE SQL_Latin1_General_CP1_CI_AS
OR (S.MyCol3 IS NULL AND D.MyCol3 IS NOT NULL)
OR (S.MyCol3 IS NOT NULL AND D.MyCol3 IS NULL))
OR (S.MyCol4 <> D.MyCol4 COLLATE SQL_Latin1_General_CP1_CI_AS
OR (S.MyCol4 IS NULL AND D.MyCol4 IS NOT NULL)
OR (S.MyCol4 IS NOT NULL AND D.MyCol4 IS NULL))
... etc. ...
)
SELECT @intRowDel = @@ROWCOUNT

RAISERROR (N'Delete Completed', 10, 1) WITH NOWAIT

Running the SProc I see the "Delete Start", but nothing after that - not even the "Delete Completed". Transaction the Sproc is running in is terminated, no value returned, the batch is aborted (so I can't display the return value even if there is one)

This in the (application) Event Log which tallies with the time that the query stopped working properly:

"A page that should have been constant has changed (expected checksum: 148697dd, actual checksum: 143eb63d, database 8, file 'E:\MSSQL\DATA\MyDatabase.mdf', page (1:1866135)). This usually indicates a memory failure or other hardware or OS corruption."

Ran this:

DBCC CHECKDB WITH NO_INFOMSGS, EXTENDED_LOGICAL_CHECKS, DATA_PURITY

on the database, TEMPDB, MASTER and MSDB. No errors found.

SELECT @@version =

Microsoft SQL Server 2008 R2 (RTM) - 10.50.1600.1 (X64)
Apr 2 2010 15:48:46
Copyright (c) Microsoft Corporation
Standard Edition (64-bit) on Windows NT 5.2 <X64> (Build 3790: Service Pack 2) (Hypervisor)

Database is set to SIMPLE Recovery Model

Any ideas?

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2011-12-08 : 06:55:27
That's corruption, but not data file corruption. It's a memory error. You either have faulty memory chips or there's a memory scribbler. This is horridly difficult to debug, I suggest CSS, but they'll probably need things like stack dumps, debugger results and all such hairy things.

CheckDB won't see this, because checkDB is looking for errors in the file, what you have is a piece of memory that's changing when it shouldn't be.

--
Gail Shaw
SQL Server MVP
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2011-12-08 : 08:12:27
Thanks Gail. Its on some sort of virtual machine, I'll get the Client to move the whole lot somewhere else.
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2011-12-08 : 09:26:22
Moved data to new disks and virtual machine "somwhere else" and my query now runs again.

Have run CHECKDB on system, tempdb and my database and all clean, so presume that the memory fault didn't actually cause any corruption of the database itself - is that a fair assumption?
Go to Top of Page

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2011-12-08 : 12:04:34
It's got nothing to do with disks. This is a memory error or a rogue process (typically kernel) that's scribbling over memory.

As for the DB being clean, maybe. This one's nasty in that it can mangle data without any way of detecting it. You're probably ok, but unless you examine (by hand) every piece of data to ensure it looks correct, there's no way to be completely and totally sure.

--
Gail Shaw
SQL Server MVP
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2011-12-09 : 03:16:03
"It's got nothing to do with disks"

Yeah I had told them that at the time ... but that's the first thing they did anyway, which took them several hours <sigh> they then asked me to retry the query and ... guess what! ... no change; less than 5 minutes later they had moved the virtual machine, so could have saved several hours of downtime.

Its a staging database. I'll set the Start Date back to "zero" which will initiate a complete re-load just to be on the safe side.

Thanks for your help, spot on as always
Go to Top of Page

SqlDex
Starting Member

1 Post

Posted - 2017-12-05 : 08:31:39
I realize this is an old thread, so I hope someone's still lurking....

When I read "memory error", I imagine an event that's happening to pages at random. I too am getting the same error (msdb in my case), and getting the error when something hits the same two pages.

And as was the case for others, DBCC CHECKDB came back clean.

Am I wrong to think a memory error would be more random?
Go to Top of Page
   

- Advertisement -