
 Troubleshooting a Failing Server


lazerath
Constraint Violating Yak Guru

343 Posts

Posted - 2005-11-08 : 10:43:28
My production SQL Server 2000 EE (sp4) had a black screen of death early Monday morning whilst I was away. Coincidentally, it happened the prior Monday at a different time. Until the first incident, the server was up for months on end.

Since yesterday morning's outage, the server has had unexpected failures 7-10 times during which SQL Server 2000 has simply stopped running (the OS stays up and SQL comes back up fairly quickly). This is a big problem and it needs to be corrected immediately.

The error logs are absolutely no help and the only thing in the event log is "The MSSQLSERVER service terminated unexpectedly". Initially, we were running SP3a + the cumulative patch (8.00.818) but thought installing sp4 might fix our hiccups after seeing Knowledge Base Article ID 840856, "FIX: The MSSQLServer service exits unexpectedly in SQL Server 2000 Service Pack 3" (http://support.microsoft.com/default.aspx?scid=kb;en-us;840856).

I'm more or less convinced this is a hardware issue due to the sudden appearance of the hard errors, but the server support team suggests contacting Microsoft. This is an IBM server and I'm sure our support contract is 24/7 onsite, but I'm not sure whether they are able to diagnose the problem or whether we are expected to do that.

I'm looking for any advice all you SQL Server gurus can provide. Thanks in advance.

Michael Valentine Jones
Yak DBA Kernel (pronounced Colonel)

7020 Posts

Posted - 2005-11-08 : 11:25:55
I would treat it as a hardware problem and tell the server team to treat it that way. It sounds like they are giving you the run around.





CODO ERGO SUM

mcrowley
Aged Yak Warrior

771 Posts

Posted - 2005-11-08 : 13:03:13
What events do you have in the System event log leading up to these crashes?

lazerath
Constraint Violating Yak Guru

343 Posts

Posted - 2005-11-08 : 13:06:58
Alright, so we are going forward with our disaster recovery plan. We have a staging server that was spec'ed out to handle production load. Here is the proposed plan:

SQL Server 2000 Production to Stage Failover
1.) Ensure there is a recent full Backup of all Production SQL Server Databases
a. If not, perform backups of Production SQL Server
2.) Copy recent backups and transaction logs to Standby SQL Server
3.) Restore Databases and transaction logs on Standby SQL Server
BEGIN OUTAGE
4.) Disconnect Production SQL Server
5.) Perform Transaction Log Backups on all Production Databases
6.) Stop SQL Server Service
7.) Reconnect Production SQL Server
8.) Copy Transaction Logs to Standby SQL Server
9.) Restore Transaction Logs to Standby SQL Server
10.) Server Support: Enable Routing of DNS alias ‘SQLPROD1’ to ‘SQLSTG1’ and add Static WINS entry for NetBIOS Name resolution
11.) Flush DNS (ipconfig /flushdns) and Purge and reload the NetBIOS remote cache name table (nbtstat -R) on Servers and Workstations needing to directly access database server.
END OUTAGE
12.) Test critical database applications and SQL Agent tasks.
13.) If testing fails for one particular server, Reboot Server and retest
14.) If continued failure or widespread problems, Disable DNS and WINS Aliases, Flush DNS and Purge NetBIOS Cache, and restart Production SQL Server Service
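
For reference, the backup and restore steps (1, 3, 5 and 9) might look roughly like the T-SQL below. This is only a sketch: the database name SalesDB and the E:\Backup paths are made up for illustration, and it assumes the databases use the full recovery model and that the standby has the same drive layout (otherwise the restore would also need WITH MOVE).

```sql
-- Hypothetical database name and file paths; substitute your own.

-- Step 1, on production: take a full backup before the outage.
BACKUP DATABASE SalesDB
    TO DISK = 'E:\Backup\SalesDB_full.bak'
    WITH INIT

-- Step 3, on the standby: restore the full backup but leave the database
-- in a restoring state so that later log backups can still be applied.
RESTORE DATABASE SalesDB
    FROM DISK = 'E:\Backup\SalesDB_full.bak'
    WITH NORECOVERY

-- Step 5, on production (users disconnected): final transaction log backup.
BACKUP LOG SalesDB
    TO DISK = 'E:\Backup\SalesDB_final.trn'
    WITH INIT

-- Step 9, on the standby: apply the final log and bring the database online.
RESTORE LOG SalesDB
    FROM DISK = 'E:\Backup\SalesDB_final.trn'
    WITH RECOVERY
```

Any log backups taken between the full backup and the final one would be restored in order WITH NORECOVERY, and only the last restore uses WITH RECOVERY. Once a database has been recovered, no further logs can be applied to it.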

lazerath
Constraint Violating Yak Guru

343 Posts

Posted - 2005-11-08 : 13:14:16
There is absolutely nothing in the System or Application Event logs leading up to the crash:

Event Type: Error
Event Source: Service Control Manager
Event Category: None
Event ID: 7034
Date: 11/8/2005
Time: 9:02:08 AM
User: N/A
Computer: SQLPROD1
Description:
The MSSQLSERVER service terminated unexpectedly. It has done this 2 time(s).

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

lazerath
Constraint Violating Yak Guru

343 Posts

Posted - 2005-11-08 : 14:29:19
Does anyone have any experience with routing traffic from one SQL Server to another? Are there any issues caused due to different server names?

jen
Master Smack Fu Yak Hacker

4110 Posts

Posted - 2005-11-09 : 02:03:29
quote:

Does anyone have any experience with routing traffic from one SQL Server to another?



we use ini files to make the switch...
if serverA could not be reached then serverB is used
we make sure serverA=serverB, serverB is passive (no clustering, only backup and restore)

--------------------
keeping it simple...

lazerath
Constraint Violating Yak Guru

343 Posts

Posted - 2005-11-09 : 11:20:40
Unbelievable! Turns out the problem was a user trying to submit an unusually large description on an ASP webpage. For whatever reason, each time the user attempted the submission it resulted in an error to the client and the server would then go down with the error I listed above. This problem is occurring despite the fact that we are now running the most current patch levels (sp4 on SQL, sp1 + cumulative updates for win 2003), bios and firmware on everything in the system.

Sounds like I need to open a ticket with Microsoft.

mcrowley
Aged Yak Warrior

771 Posts

Posted - 2005-11-09 : 11:48:42
That is bizarre. Is this a case of the description being saved in a text field, but the data itself is larger than @@textsize? If the description is a varchar field, you should just get a data truncation error.

lazerath
Constraint Violating Yak Guru

343 Posts

Posted - 2005-11-09 : 14:41:16
The data is being saved to a text field, however it is not larger than @@textsize. The surprising thing to me is that the length was around 5500 characters (which isn't all that much). Unfortunately, I haven't had much of a chance to troubleshoot so I'm not sure what exactly is causing the exception, only that the submission is indeed the culprit.
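
For anyone wanting to make the same comparison, a check along these lines should work. It is only a sketch: the table dbo.Tickets and the columns TicketID and Description are hypothetical names standing in for the real schema. Note that TEXTSIZE limits how much text data a SELECT returns to the client; it is not a limit on what can be stored.

```sql
-- Hypothetical table and column names; substitute the real ones.

-- Current TEXTSIZE setting for this session, in bytes.
SELECT @@TEXTSIZE AS current_textsize

-- Largest stored descriptions, measured in bytes.
-- DATALENGTH works on text columns and returns an int, so it can be sorted on.
SELECT TOP 10
       TicketID,
       DATALENGTH(Description) AS description_bytes
FROM dbo.Tickets
ORDER BY DATALENGTH(Description) DESC
```

The setting can differ per connection, so it is worth running the first query over the same kind of connection the ASP page uses rather than only from Query Analyzer.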

timw86
Starting Member

1 Post

Posted - 2005-11-17 : 13:41:07
lazerath,

I have experienced a similar problem. We are running SQL Server 2000 w/ SP4 on Windows 2000 SP4. We are up to date on all patches on os and sql server. We also have the latest drivers installed. We have two instances where sql server has stopped unexpectedly, but the server did not crash. There were no dump files generated from sql server. Please continue to post your progress on working through and resolving this issue.

Thanks, timw86

   
