Author |
Topic |
AskSQLTeam
Ask SQLTeam Question
0 Posts |
Posted - 2005-03-08 : 07:55:28
|
Jason writes "Hi all,

I use the server agent to schedule backups and routine maintenance (as I assume most do). It performs flawlessly for the most part and has always provided me with feedback when something goes wrong.

Recently, the agent seems to have skipped key backup jobs and/or stopped performing its job altogether without any indication as to what happened. The jobs simply reschedule themselves and do not run. There are no error messages, no alert notifications, and no entries in either the NT event log or the SQL Server error log.

The first time I noticed it, several backup jobs were missed over a period of about 3 or 4 days. I restarted the SQL Server Agent service and everything kicked off again. All the jobs began to run normally and continued to run normally during the week.

This morning (Monday), the first thing I did was to check the status of the SQL Agent jobs. Lo and behold, it had happened again. It seems that the jobs continued to execute and then skipped a key backup job. The agent continued to run through the weekend and then skipped another key backup job. After skipping this second time, the service simply ignored its schedule altogether even though it was clearly still running. I restarted the agent service and the jobs began to execute normally again.

To reiterate, I don't have any error messages to share. The NT event log does not show that the agent service stopped or restarted at any time during the weekend when the problem occurred. Unfortunately, I restarted the agent service before checking the agent log. If it happens again I will remember to check that log.

Help?

SQL Server 2000 Standard Edition Version 8.00.760 (SP3)
running on...
NT 4.0 SP6a High Encryption
Dual 800MHz Intel w/ 2GB RAM

Thanks,
Jason" |
|
Kristen
Test
22859 Posts |
Posted - 2005-03-08 : 08:41:24
|
Are the jobs set up with Maintenance Wizard, or "bespoke" in SQL Agent?

If "Maintenance Wizard" then errors [such as they are] are available in the Maintenance Wizard section in Enterprise Manager - and my guess would be that you'll find that a weekend job was set up to check/fix database integrity, and it couldn't get Single User exclusive access and then just "gave up" attempting to run the rest of the backup routine in that job.

Kristen |
|
|
jason
Posting Yak Master
164 Posts |
Posted - 2005-03-09 : 14:23:47
|
Hi Kristen,

Thanks for the reply.

There are no error messages to refer to. That's part of the problem. It was not set up using the wizard. The custom jobs I have set up (all of them) simply stop running. When I restart the service they start running again (without errors).

Without an error condition of some sort, I am at a loss as to where I should start troubleshooting. |
|
|
Kristen
Test
22859 Posts |
Posted - 2005-03-10 : 00:18:14
|
Bummer. I would suggest that you put some sort of tracing in the SQL in the job - e.g. record the START and END time to a table. Perhaps that will show you that they start, but do not finish.

Other than that, the only thing I can think of is Permissions - but I don't quite see how they would stop running and then resume when you restart.

If your job contains TSQL (or CMD I think) steps you can "audit" them to a text file (in the Advanced tab when you EDIT the step). Dunno what that outputs though, never tried it!

You could Stop/Start the SQL Server Agent service periodically (once a day perhaps?) using a BATch job - that would at least get it to resume before any critical process had failed for too long.

Kristen |
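A minimal sketch of the START/END tracing idea, using Python's built-in `sqlite3` module as a stand-in for SQL Server so that it runs anywhere. The table name `job_trace`, its columns, and the helper function names are all hypothetical; nothing here comes from the jobs discussed in this thread.

```python
import sqlite3
from datetime import datetime


def open_trace(conn):
    # Hypothetical trace table: one row per job run.
    # end_time stays NULL until the job reports that it finished.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS job_trace ("
        " run_id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " job_name TEXT NOT NULL,"
        " start_time TEXT NOT NULL,"
        " end_time TEXT)"
    )
    conn.commit()


def _stamp(now):
    # Format a timestamp as 'YYYY-MM-DD HH:MM:SS'.
    return (now or datetime.now()).isoformat(sep=" ", timespec="seconds")


def mark_start(conn, job_name, now=None):
    # Call as the first step of a job; returns the id of the new trace row.
    cur = conn.execute(
        "INSERT INTO job_trace (job_name, start_time) VALUES (?, ?)",
        (job_name, _stamp(now)),
    )
    conn.commit()
    return cur.lastrowid


def mark_end(conn, run_id, now=None):
    # Call as the last step of a job.
    conn.execute(
        "UPDATE job_trace SET end_time = ? WHERE run_id = ?",
        (_stamp(now), run_id),
    )
    conn.commit()


def unfinished_runs(conn):
    # Runs that started but never recorded an end time.
    return conn.execute(
        "SELECT job_name, start_time FROM job_trace"
        " WHERE end_time IS NULL ORDER BY run_id"
    ).fetchall()
```

Read this way, a row with no end time means the job started and died part-way through, while no row at all for a scheduled run means the job never started. In a real Agent job the same pattern would presumably be an INSERT in the first step and an UPDATE in the last one.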
|
|
jason
Posting Yak Master
164 Posts |
Posted - 2005-03-10 : 10:34:13
|
That's another issue I'm having. I can't seem to pin it down to a single job. It really seems to just skip key jobs and then finally stop executing jobs altogether. Why would the agent continue to execute jobs after stalling on one and then finally stop executing jobs altogether?

It does seem to happen at around the same time (over the weekend), so I will try to set some traps around those jobs. Hopefully something will turn up. |
|
|
Kristen
Test
22859 Posts |
Posted - 2005-03-10 : 13:17:10
|
I can't think of anything other than "setting traps". I am expecting that the jobs are starting, but then failing - e.g. from some permissions issue. But you may be right that they are not starting at all.

The Server Time isn't going screwy, is it? Some automated update from a duff Time Server??

Kristen |
|
|
jason
Posting Yak Master
164 Posts |
Posted - 2005-03-10 : 13:28:12
|
Nope. Everything appears to be functioning normally. The jobs have executed all week without issues. |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2005-03-10 : 18:02:17
|
Is the option selected to restart SQL Server Agent if it stops unexpectedly? This option is available if you right click on the agent in Enterprise Manager.

Tara |
|
|
jason
Posting Yak Master
164 Posts |
Posted - 2005-03-10 : 18:18:29
|
quote: Originally posted by tduggan
Is the option selected to restart SQL Server Agent if it stops unexpectedly? This option is available if you right click on the agent in Enterprise Manager.
Tara

No, it wasn't. I checked it. Could the agent have stopped while the agent service is still running? The agent service has never stopped. |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2005-03-10 : 18:23:26
|
Well, the agent is the same thing as the agent service. If it never stopped, then I would think something is messed up inside the job tables in the msdb database, as that is where the job scheduler gets its information.

I see you have 760 installed for SQL Server 2000. You should be up to 818 if you have installed the security patch, which does include some hotfixes as well as security stuff.

Tara |
|
|
jason
Posting Yak Master
164 Posts |
Posted - 2005-03-10 : 18:27:42
|
Do you think this is a 760 issue? |
|
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2005-03-10 : 18:30:11
|
I've never seen it before with any version of SQL Server, but it is always best to be up to date with hotfixes when you are encountering problems.

Tara |
|
|
jason
Posting Yak Master
164 Posts |
Posted - 2005-03-10 : 18:33:59
|
Point taken. I'll get it patched post haste. |
|
|
jason
Posting Yak Master
164 Posts |
Posted - 2005-03-14 : 10:55:47
|
Same thing happened again over the weekend.

There are errors logged in the SQL Server Agent log. Strangely, there are no entries at all for the day (3/11) that the problem occurred?!? The following error appears several times:

2005-03-12 10:33:59 - ! [LOG] Unable to read local eventlog (reason: Not enough storage is available to process this command)

Later, the following error appears in the log instead of the above error:

2005-03-12 22:09:41 - ! [371] Unable to format event 0xC000013E (reason: Not enough storage is available to process this command)

Clearly there seems to be some sort of storage issue.

I checked the 'key' job that failed to execute on the 11th and found that it did start but failed with a timeout error when pre-processing an ActiveX script. This same script executes on other jobs and does nothing more than terminate spids with locks on the database to be backed up. I've never had an issue with it. The SQL Server Agent did not record a failure for this job or log any info that a problem occurred. I only know this since the ActiveX script creates a log file of its own. The script is posted below.

=================================================================
Dim dbobj, dbconn, qry, rs, dbid, fso, cl_log

'create a log file
Set fso = CreateObject("Scripting.FileSystemObject")
Set cl_log = fso.CreateTextFile("C:\winnt\system32\batch\check_locks_8.log",true)

'set the dbid to process here
dbid = 8

dbconn = "uid=****;pwd=*****;driver={SQL Server};server=SQL_SERVER;database=MASTER;dsn="
qry = "Use Master EXEC sp_lock"

On Error Resume Next

Set dbobj = CreateObject("ADODB.Connection")
CheckError "1"
dbobj.Open dbconn
CheckError "2"
Set rs = CreateObject("ADODB.Recordset")
CheckError "3"
rs.Open qry, dbobj
CheckError "4"

If Not rs.EOF then
    While NOT rs.EOF
        If rs("dbid") = dbid then
            spid = rs("spid")
            qry = "KILL " & spid
            dbobj.Execute qry
            CheckError "5"
            cl_log.writeline("Process ID " & spid & " was terminated from database ID " & dbid & "." & vbcrlf)
        End If
        rs.MoveNext
    Wend
Else
    err.Raise 9000,"Check Locks on 19","No data returned by stored procedure."
    CheckError "6"
End If

rs.Close
Set rs = Nothing
dbobj.Close
Set dbobj = Nothing

cl_log.writeline("Script executed successfully.")
cl_log.close
Set cl_log = Nothing
Set fso = Nothing

Public Sub CheckError(intLine)
    If err.number <> 0 then
        cl_log.writeline("Error Number: " & err.number & vbcrlf & "Error Description: " & err.Description & _
            vbcrlf & "Line No.: " & intline)
        cl_log.close
        Wscript.quit
    End If
End Sub
====================================================

I've modified the SQL Server Agent registry values so that the log and working directory are on a volume with more available free space.

The agent did continue to run jobs after failing to execute the one from the 11th. And then, for some reason, it just stopped executing everything. Other than the errors in the agent log, no other errors were logged in either the job history or the OS event log.

The part that really bugs me is that after restarting the agent service, I can reschedule jobs that were missed on the weekend and they execute just fine. Any ideas? |
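Checking the Agent log by eye for error lines and for days with no entries at all can be automated. Below is a rough Python sketch that assumes every log line follows the shape of the two lines quoted above (timestamp, a one-character flag where "!" marks an error, a bracketed code, then the message). That layout is inferred only from those two samples, and the function names are hypothetical.

```python
import re
from datetime import date, datetime, timedelta

# Assumed line layout, inferred from the two sample lines quoted in this post:
#   2005-03-12 10:33:59 - ! [LOG] message text
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) - "
    r"(?P<flag>\S) \[(?P<code>[^\]]+)\] (?P<msg>.*)$"
)


def parse_agent_log(lines):
    # Return (timestamp, flag, code, message) tuples; skip lines that do not match.
    entries = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
            entries.append((ts, m.group("flag"), m.group("code"), m.group("msg")))
    return entries


def error_entries(entries):
    # Keep only the entries flagged "!" (errors, as in the samples above).
    return [e for e in entries if e[1] == "!"]


def silent_days(entries, first_day, last_day):
    # List every calendar day in [first_day, last_day] with no log entry at all.
    seen = {e[0].date() for e in entries}
    missing = []
    day = first_day
    while day <= last_day:
        if day not in seen:
            missing.append(day)
        day += timedelta(days=1)
    return missing
```

Fed the two lines above with a date range of 2005-03-11 to 2005-03-12, `silent_days` reports the 11th as a day with no entries, which matches the gap described in this post. Reading the actual log file (its location and encoding) is left out on purpose, since that depends on the installation.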
|
|
ysweet
Starting Member
4 Posts |
Posted - 2005-03-17 : 11:09:29
|
We have the same problem. Did you solve it? |
|
|
jason
Posting Yak Master
164 Posts |
Posted - 2005-03-17 : 11:11:50
|
No, but it only happens on the weekend when I'm not here (of course). I will see if the last changes I made make a difference this weekend. |
|
|
anilkdanta
Starting Member
25 Posts |
Posted - 2011-02-11 : 11:47:52
|
Hello All,

I got stuck with the same error on one of our 2005 servers. This happened when a job created from a Maintenance Plan for Integrity Checks ran and failed with this error:

[LOG] Unable to read local eventlog (reason: Not enough storage is available to process this command)

Basically it is pointing to a mess in the Windows Event Log. I cleared the application log; of course, you may want to take a backup of the existing events first in case you need to go fishing for something in future. Then I kicked off the Integrity Checks job, which completed smoothly.

The job history for failures writes a long error message highlighting DTS-related stuff, which I guess is internals of SSIS packages, but it will not give you a clue about what is wrong. Huhh... I am glad the Agent log and Event Viewer are talking about this.

Have fun! |
|
|
|