
 SQL Server Agent


AskSQLTeam
Ask SQLTeam Question

0 Posts

Posted - 2005-03-08 : 07:55:28
Jason writes "Hi all,

I use the server agent to schedule backups and routine maintenance (as I assume most do). It performs flawlessly for the most part and has always provided me with feedback when something goes wrong.

Recently, the agent seems to have skipped key backup jobs and/or stopped performing its job altogether without any indication as to what happened. The jobs simply reschedule themselves and do not run. There are no error messages, no alert notifications, and no entries in either the NT event log or the SQL Server error log.

The first time I noticed it, several backup jobs were missed over a period of about 3 or 4 days. I restarted the SQL server agent service and everything kicked off again. All the jobs began to run normally and continued to run normally during the week.

This morning (Monday), the first thing I did was to check the status of the SQL agent jobs. Lo and behold, it had happened again. It seems that the jobs continued to execute and then skipped a key backup job. The agent continued to run through the weekend and then skipped another key backup job. After skipping this second time, the service simply ignored its schedule altogether even though it was clearly still running. I restarted the agent service and the jobs began to execute normally again.

To reiterate, I don't have any error messages to share. The NT event log does not show that the agent service stopped or restarted at any time during the weekend when the problem occurred. Unfortunately, I restarted the agent service before checking the agent log. If it happens again I will remember to check that log.

Help?

SQL Server 2000 Standard Edition
Version 8.00.760(SP3)

running on...

NT 4.0 Sp6a High Encryption
Dual 800MHz Intel w/ 2GB RAM

Thanks,

Jason"

Kristen
Test

22859 Posts

Posted - 2005-03-08 : 08:41:24
Are the jobs set up with Maintenance Wizard, or "bespoke" in SQL Agent?

If "Maintenance Wizard", then errors [such as they are] are available in the Maintenance Wizard section in Enterprise Manager. My guess would be that you'll find that a weekend job was set up to check/fix database integrity, couldn't get Single User exclusive access, and then just "gave up" attempting to run the rest of the backup routine in that job.

Kristen

jason
Posting Yak Master

164 Posts

Posted - 2005-03-09 : 14:23:47
Hi Kristen,

Thanks for the reply.

There are no error messages to refer to. That's part of the problem. It was not set up using the wizard. The custom jobs I have set up (all of them) simply stop running. When I restart the service they start running again (without errors).

Without an error condition of some sort, I am at a loss as to where I should start troubleshooting.

Kristen
Test

22859 Posts

Posted - 2005-03-10 : 00:18:14
Bummer. I would suggest that you put some sort of tracing in the SQL in the job - e.g. record the START and END time to a table. Perhaps that will show you that they start, but not finish.
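
For what it's worth, the start/end tracing could look something like the sketch below. The table name dbo.JobAudit, its columns, and the job name 'Nightly backup' are all made up for illustration; they are not anything from Jason's setup.

```sql
-- Hypothetical audit table; the name and columns are illustrative only.
CREATE TABLE dbo.JobAudit (
    audit_id   int IDENTITY(1,1) PRIMARY KEY,
    job_name   sysname  NOT NULL,
    started_at datetime NOT NULL DEFAULT (GETDATE()),
    ended_at   datetime NULL
)
GO

-- First step of the job: record that the job actually started.
INSERT INTO dbo.JobAudit (job_name) VALUES ('Nightly backup')

-- ... the job's real work runs here ...

-- Last step of the job: stamp the end time on the most recent open row.
UPDATE dbo.JobAudit
SET ended_at = GETDATE()
WHERE audit_id = (SELECT MAX(audit_id)
                  FROM dbo.JobAudit
                  WHERE job_name = 'Nightly backup'
                    AND ended_at IS NULL)
```

A row with a start time but no end time would suggest the job began and never finished; no row at all would suggest the agent never launched it.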

Other than that the only thing I can think of is Permissions - but I don't quite see how they would stop running and then resume when you restart.

If your job contains TSQL (or CMD I think) steps you can "audit" them to a text file (in the Advanced tab when you EDIT the step). Dunno what that outputs though, never tried it!

You could Stop/Start SQL Server Agent Service periodically (once a day perhaps?) using a BATch job - that would at least get it to resume before any critical process had failed for too long.

Kristen

jason
Posting Yak Master

164 Posts

Posted - 2005-03-10 : 10:34:13
That's another issue I'm having. I can't seem to pin it down to a single job. It really seems to just skip key jobs and then finally stop executing jobs altogether. Why would the agent continue to execute jobs after stalling on one and then finally stop executing jobs altogether?

It does seem to happen at around the same time (over the weekend), so I will try to set some traps around those jobs. Hopefully something will turn up.

Kristen
Test

22859 Posts

Posted - 2005-03-10 : 13:17:10
I can't think of anything other than "setting traps". I am expecting that the jobs are starting, but then failing - e.g. from some permissions issue. But you may be right that they are not starting at all.

The Server Time isn't going screwy is it? Some automated update from a duff Time Server??

Kristen

jason
Posting Yak Master

164 Posts

Posted - 2005-03-10 : 13:28:12
Nope. Everything appears to be functioning normally. The jobs have executed all week without issues.

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2005-03-10 : 18:02:17
Is the option selected to restart SQL Server Agent if it stops unexpectedly? This option is available if you right click on the agent in Enterprise Manager.

Tara

jason
Posting Yak Master

164 Posts

Posted - 2005-03-10 : 18:18:29
quote:
Originally posted by tduggan

Is the option selected to restart SQL Server Agent if it stops unexpectedly? This option is available if you right click on the agent in Enterprise Manager.

Tara



No, it wasn't. I checked it. Could the agent have stopped while the agent service is still running? The agent service has never stopped.

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2005-03-10 : 18:23:26
Well, the agent is the same thing as the agent service. If it never stopped, then I would think something is messed up inside the job tables in the msdb database, as that is where the job scheduler gets its information.
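
As a rough sketch of how one might look at those tables, the queries below list each job with its schedule and then the most recent history rows. This assumes the SQL Server 2000 msdb layout, where the schedule name, enabled flag, and next run date/time live in sysjobschedules; later versions arrange this differently.

```sql
-- Sketch only, assuming the SQL Server 2000 msdb layout.
-- What the scheduler thinks each job's next run is:
SELECT j.name    AS job_name,
       j.enabled AS job_enabled,
       s.name    AS schedule_name,
       s.enabled AS schedule_enabled,
       s.next_run_date,   -- integer, YYYYMMDD
       s.next_run_time    -- integer, HHMMSS
FROM msdb.dbo.sysjobs j
JOIN msdb.dbo.sysjobschedules s ON s.job_id = j.job_id
ORDER BY j.name

-- The most recent history rows (run_status: 0 = failed, 1 = succeeded,
-- 2 = retry, 3 = canceled, 4 = in progress):
SELECT TOP 50 j.name AS job_name,
       h.step_id,
       h.run_status,
       h.run_date,
       h.run_time,
       h.message
FROM msdb.dbo.sysjobhistory h
JOIN msdb.dbo.sysjobs j ON j.job_id = h.job_id
ORDER BY h.instance_id DESC
```

If the next run dates sit in the past while the history shows nothing for those times, that would at least confirm the jobs were never launched rather than launched and failed.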

I see you have 760 installed for SQL Server 2000. You should be up to 818 if you have installed the security patch which does include some hotfixes as well as security stuff.
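
A quick way to confirm the build before and after patching, as a minimal sketch (both forms are available on SQL Server 2000):

```sql
-- Report the build number (e.g. 8.00.760) and the service pack level.
SELECT SERVERPROPERTY('ProductVersion') AS product_version,
       SERVERPROPERTY('ProductLevel')   AS product_level

-- The full version banner, including the operating system details.
SELECT @@VERSION
```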

Tara

jason
Posting Yak Master

164 Posts

Posted - 2005-03-10 : 18:27:42
Do you think this is a 760 issue?

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2005-03-10 : 18:30:11
I've never seen it before with any version of SQL Server, but it is always best to be up to date with hotfixes when you are encountering problems.

Tara

jason
Posting Yak Master

164 Posts

Posted - 2005-03-10 : 18:33:59
Point taken. I'll get it patched post haste.

jason
Posting Yak Master

164 Posts

Posted - 2005-03-14 : 10:55:47
Same thing happened again over the weekend.

There are errors logged in the SQL Server Agent log. Strangely, there are no entries at all for the day (3/11) that the problem occurred?!? The following error appears several times:

2005-03-12 10:33:59 - ! [LOG] Unable to read local eventlog (reason: Not enough storage is available to process this command)

Later, the following error appears in the log instead of the above error:

2005-03-12 22:09:41 - ! [371] Unable to format event 0xC000013E (reason: Not enough storage is available to process this command)

Clearly there seems to be some sort of storage issue.

I checked the 'key' job that failed to execute on the 11th and found that it did start but failed with a timeout error when pre-processing an ActiveX script. This same script executes on other jobs and does nothing more than terminate spids with locks on the database to be backed up. I've never had an issue with it. The SQL Server Agent did not record a failure for this job or log any info that a problem occurred. I only know this because the ActiveX script creates a log file of its own. The script is posted below.

=================================================================

Dim dbobj, dbconn, qry, rs, dbid, fso, cl_log

'create a log file
Set fso = CreateObject("Scripting.FileSystemObject")
Set cl_log = fso.CreateTextFile("C:\winnt\system32\batch\check_locks_8.log",true)

'set the dbid to process here
dbid = 8

dbconn = "uid=****;pwd=*****;driver={SQL Server};server=SQL_SERVER;database=MASTER;dsn="

qry = "Use Master EXEC sp_lock"

On Error Resume Next
Set dbobj = CreateObject("ADODB.Connection")
CheckError "1"
dbobj.Open dbconn
CheckError "2"

Set rs = CreateObject("ADODB.Recordset")
CheckError "3"
rs.Open qry, dbobj
CheckError "4"

If Not rs.EOF then
    While NOT rs.EOF
        If rs("dbid") = dbid then
            spid = rs("spid")
            qry = "KILL " & spid
            dbobj.Execute qry
            CheckError "5"
            cl_log.writeline("Process ID " & spid & " was terminated from database ID " & dbid & "." & vbcrlf)
        End If
        rs.MoveNext
    Wend
Else
    err.Raise 9000,"Check Locks on 19","No data returned by stored procedure."
    CheckError "6"
End If

rs.Close
Set rs = Nothing
dbobj.Close
Set dbobj = Nothing

cl_log.writeline("Script executed successfully.")
cl_log.close
Set cl_log = Nothing
Set fso = Nothing

Public Sub CheckError(intLine)
    If err.number <> 0 then
        cl_log.writeline("Error Number: " & err.number & vbcrlf & "Error Description: " & err.Description & _
            vbcrlf & "Line No.: " & intline)
        cl_log.close
        Wscript.quit
    End If
End Sub

====================================================
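
One thing that may be worth checking in the script above: sp_lock returns one row per lock, not one row per process, so a spid holding several locks in the database appears several times and the loop can issue KILL for it more than once. A minimal sketch of listing each spid only once is below. The #locks column list is an assumption based on the eight columns sp_lock returns on SQL Server 2000, with deliberately wide types, and dbid 8 is taken from the script.

```sql
-- Sketch only: list each spid holding locks in one database exactly once.
-- The column list assumes the SQL Server 2000 sp_lock result set.
CREATE TABLE #locks (
    spid     int,
    dbid     int,
    ObjId    int,
    IndId    int,
    Type     nvarchar(32),
    Resource nvarchar(32),
    Mode     nvarchar(32),
    Status   nvarchar(32)
)

-- Capture the sp_lock output so it can be filtered like a table.
INSERT INTO #locks EXEC master.dbo.sp_lock

-- One row per process with locks in database 8, excluding this session.
SELECT DISTINCT spid
FROM #locks
WHERE dbid = 8
  AND spid <> @@SPID

DROP TABLE #locks
```

Whether the repeated KILL has anything to do with the timeout is unclear; this only removes one possible source of spurious errors at CheckError "5".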

I've modified the SQL Server Agent registry values so that the log and working directory are on a volume with more available free space.

The agent did continue to run jobs after failing to execute the one from the 11th. And then, for some reason, it just stopped executing everything. Other than the errors in the agent log, no other errors were logged in either the job history or the OS event log.

The part that really bugs me is that after restarting the agent service, I can reschedule jobs that were missed on the weekend and they execute just fine. Any ideas?

ysweet
Starting Member

4 Posts

Posted - 2005-03-17 : 11:09:29
We have the same problem. Did you solve it?

jason
Posting Yak Master

164 Posts

Posted - 2005-03-17 : 11:11:50
No, but it only happens on the weekend when I'm not here (of course). I will see if the last changes I made make a difference this weekend.

anilkdanta
Starting Member

25 Posts

Posted - 2011-02-11 : 11:47:52
Hello All,

I got stuck with the same error on one of our 2005 servers. This happened when a job created from a Maintenance Plan for Integrity Checks ran and failed with this error:

[LOG] Unable to read local eventlog (reason: Not enough storage is available to process this command)

Basically it is pointing to a mess in the Windows Event Log.

I cleared the application log; of course, you may want to take a backup of the existing logged events first, in case you need to fish something out of them in future. Then I kicked off the Integrity Checks job, which completed smoothly. The job history for failures writes a long error message highlighting DTS-related stuff, which I guess is the internals of SSIS packages, but it will not give you a clue about what is wrong. Huhh...I am glad the Agent log and Event Viewer are talking about this.

Have fun!