 Peculiar Error Messages in SQL Error Logs


thomadma
Starting Member

8 Posts

Posted - 2003-09-04 : 11:37:26
Hi,

I need a little help deciphering exactly what the error logs on one of my production servers are telling me. Around 9:30 AM EST today, the error log had these entries (see below):

1 LogWriter: Operating system error 2(The system cannot find the file specified.) encountered.
2 Write error during log flush. Shutting down server
3 Error: 9001, Severity: 21, State: 4
4 The log for database 'QuestSoftware' is not available. (QuestSoftware is a database that our monitoring tool writes metrics to.)
5 Database 'QuestSoftware' cannot be opened. It has been marked SUSPECT by recovery. See the SQL Server errorlog for more information.
6 fcb::close-flush: Operating system error 2(The system cannot find the file specified.) encountered.
7 Starting up database 'QuestSoftware'.

The above sequence was repeated five times this morning in the span of an hour, the only difference being which databases had unavailable logs. I have a monitoring tool that I use for real-time SQL Server monitoring, and nothing alerted me to this issue. I only found out when I ran a query and received the response that the log for the database I was querying was not available.
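
A minimal sketch, in Python, of how the repeated sequence could be summarized from a saved copy of the error log: it scans the text for the "log for database ... is not available" message and counts occurrences per database. The message wording comes from the entries quoted above; the function name `count_unavailable_logs`, the sample text, and the database name `OtherDb` are hypothetical and only there for illustration.

```python
import re
from collections import Counter

# Pattern built from the message text quoted in the log entries above.
LOG_UNAVAILABLE = re.compile(r"The log for database '([^']+)' is not available")


def count_unavailable_logs(errorlog_text):
    """Count 'log is not available' entries per database name."""
    return Counter(LOG_UNAVAILABLE.findall(errorlog_text))


# Hypothetical sample text; 'OtherDb' is a made-up database name.
sample_errorlog = """\
Error: 9001, Severity: 21, State: 4
The log for database 'QuestSoftware' is not available.
Starting up database 'QuestSoftware'.
Error: 9001, Severity: 21, State: 4
The log for database 'OtherDb' is not available.
Error: 9001, Severity: 21, State: 4
The log for database 'QuestSoftware' is not available.
"""

for name, hits in count_unavailable_logs(sample_errorlog).most_common():
    print(name, hits)
```

Grouping by database name makes it easier to see whether the failures cluster on databases whose log files share a drive, which is one hint toward a storage problem rather than a single bad file.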

Anyway, I rebooted the server (the second node of a two-node active/active cluster) and all seems to be okay. During the startup phase, though, I got the following message in the logs for two databases:

Recovery is checkpointing database (DB Name)

I guess that's it. If anyone has any information or has experienced something similar, please let me know. I believe it may be hardware related, but again, nothing is logged in Event Viewer or Cluster Administrator.

Any help would be appreciated.

Maria

setbasedisthetruepath
Used SQL Salesman

992 Posts

Posted - 2003-09-04 : 12:05:05
Well, the log documents the issue fairly clearly: the log file was no longer accessible via the OS; that meant a loss of data integrity and the server stopped itself as a result.

It's most likely a hardware issue. I suppose you could have a bad device driver but that's doubtful. Have you run the diagnostic utilities available for your RAID arrays and controllers?

Jonathan

X002548
Not Just a Number

15586 Posts

Posted - 2003-09-04 : 12:18:06
Or more simply did someone move/delete the file?



Brett

8-)

SELECT @@POST=NewId()

That's correct! It's an AlphaNumeric!

setbasedisthetruepath
Used SQL Salesman

992 Posts

Posted - 2003-09-04 : 12:42:58
No, SQL Server would have the file open.

Jonathan

thomadma
Starting Member

8 Posts

Posted - 2003-09-04 : 12:43:36
No, the logs were not deleted, nor was the drive containing the logs inaccessible. I haven't run the diagnostic utilities yet. I'll ask my network admin to take a look.

Thanks,

Maria

thomadma
Starting Member

8 Posts

Posted - 2003-09-04 : 15:25:22
9:27 AM EST {Lost Delayed-Write Data} The system was attempting to transfer file data from buffers to \Device\HarddiskVolume5. The write operation failed, and only some of the data may have been written to the file.

The above was a system-generated error message from the ftdisk source. It was repeated frequently until 9:47 AM EST. Then there were the other messages below:

Application popup: Windows - Delayed Write Failed : Windows was unable to save all the data for the file \Device\HarddiskVolume5\$Mft. The data has been lost. This error may be caused by a failure of your computer hardware or network connection. Please try to save this file elsewhere.

Application popup: Windows - Delayed Write Failed : Windows was unable to save all the data for the file \Device\HarddiskVolume5\$BitMap. The data has been lost. This error may be caused by a failure of your computer hardware or network connection. Please try to save this file elsewhere.

Cluster Agent: The cluster resource Disk K has become degraded.
[SNMP TRAP: 15005 in CPQCLUS.MIB]


That's about it. The last message was repeated for every disk that the node uses.
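
As a rough sketch, assuming the event messages have been exported to plain text, the affected volumes could be tallied like this in Python. The device path format is taken from the messages above; the function name `count_failed_volumes` and the sample text are hypothetical.

```python
import re
from collections import Counter

# Matches device paths such as \Device\HarddiskVolume5 from the messages above.
VOLUME_PATH = re.compile(r"\\Device\\(HarddiskVolume\d+)")


def count_failed_volumes(event_text):
    """Count how often each volume is named in delayed-write messages."""
    counts = Counter()
    for line in event_text.splitlines():
        # Only consider lines that report a lost or failed delayed write.
        if "Delayed Write Failed" in line or "Lost Delayed-Write Data" in line:
            counts.update(VOLUME_PATH.findall(line))
    return counts


# Hypothetical export, abbreviated from the messages quoted above.
sample_events = r"""
{Lost Delayed-Write Data} The system was attempting to transfer file data from buffers to \Device\HarddiskVolume5.
Windows - Delayed Write Failed : Windows was unable to save all the data for the file \Device\HarddiskVolume5\$Mft.
Windows - Delayed Write Failed : Windows was unable to save all the data for the file \Device\HarddiskVolume5\$BitMap.
Cluster Agent: The cluster resource Disk K has become degraded.
"""

for volume, hits in count_failed_volumes(sample_events).items():
    print(volume, hits)
```

If every failure names the same volume, that points at one disk or array; failures spread across all volumes would suggest something shared, such as a controller or the path to the storage.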

I can't see anything else. Am I missing another place to look for error messages? Besides this and the utilities that are to be used, is there anything further that I can do to determine if there is a hardware problem?

Maria

setbasedisthetruepath
Used SQL Salesman

992 Posts

Posted - 2003-09-05 : 10:24:15
Maria-
Let me repost my earlier question:

Have you run the diagnostic utilities available for your RAID arrays and controllers?

It's clear you have a hardware problem ... the issue is, on which array(s) and/or which controller(s) ...

Jonathan

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2003-09-05 : 12:26:22
Who is your vendor for your cluster solution? I would contact them immediately. Can you move the databases to another server until this problem is corrected?

Tara