Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 SQL Server Administration (2008)
 When is it safe to restart SQL Agent?

Author  Topic 

Kristen
Test

22859 Posts

Posted - 2014-01-02 : 12:42:56
Do you check for currently running jobs? if so how? (Some simple SQL that I can run to check if a) anything it running and b) if anything is ABOUT to run in the next few minutes, before I can hit the STOP/START button )

Do you similarly do anything before stop/start the SQL service itself?

If we are doing maintenance work on SQL (lets say installing service pack, and then planning to move some database files to a different drive, and so on) we quite often set all jobs to DISABLED so that they don't kick off after rebooting and whilst we are still fiddling with things. We also set SQL Agent Server to NOT auto-start, but we are never certain that something, outside our control, won't restart it before we are ready. Its a pain to set the jobs to Disabled, and back again (as we always have some that are intentionally set to Disabled ...)

How do you handle that situation?

djj55
Constraint Violating Yak Guru

352 Posts

Posted - 2014-01-02 : 12:55:35
Look at the msdb..sysjobhistory and msdb..sysjobs tables.

djj
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2014-01-02 : 13:08:45
Also running packages folder in MSDB and also on executions report in SSMS

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
https://www.facebook.com/VmBlogs
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2014-01-02 : 13:09:42
quote:
Originally posted by djj55

Look at the msdb..sysjobhistory and msdb..sysjobs tables.



I'm not sure but I don't think they tell me anything useful, in this context, do they?

Could be wrong, but I thought History only told me completed jobs? I thought msdb.dbo.SysJobServers was the only source of currently running jobs, or perhaps the undocumented (I think?) xp_sqlagent_enum_jobs

But my question was really a general one about what DBA's check, if anything, before stop/starting SQL Agent and/or SQL Server services, and whether anyone had some code snippet that they rely on for checking.
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2014-01-02 : 13:22:50
quote:
Originally posted by visakh16

Also running packages folder in MSDB and also on executions report in SSMS


Thanks, checking those now
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2014-01-02 : 13:42:23
quote:
Originally posted by visakh16

Also running packages folder in MSDB and also on executions report in SSMS


Never knew that Standard Reports existed in SQL Agent. Thanks for that

However, on the server I want to try it on I get an error. Something else to fix on that crappy site!

I can't find the Running Packages folder, can you give me a clue where that should be pls?

Will there be something in there for simple jobs that are set up in SQL Agent "Jobs"?, or do they have to be a more complicated "package" in SSIS or somesuch?
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2014-01-02 : 13:46:36
We use Microsoft SCOM for our monitoring. Sometimes the SCOM agent would hang/be dead, so we've now got a process to check that the SCOM agent is alive and have a daily report for any that aren't.

I restart the agent without any checking, but I do it at a time that I know no big log jobs are running such as an index rebuild job. If an index rebuild job is running when either the agent or the SQL service is restarted and it was working on a LARGE index, you've got a lengthy crash recovery that's going to happen for that database when the SQL service is brought online. We had this happen a while back, and it took 4 hours for it to finally finish. Though this isn't a system that I support, another system took 9 days to recover. After the first day, they restored the database to another system to get PROD back online. And then they waited to see just how long it would take the original server to finally complete. 9 days! That one was not due to an index rebuild job, but rather it was due to a developer leaving a teeny tiny transaction open before leaving for vacation. Log transactions are sequential, so there were millions of transactions that could not clear in the log due to that previous transaction.



Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2014-01-02 : 14:03:00
Thanks Tara. I'll take the current job-running state of SQL Agent seriously then!

We have jobs that kick off every few minutes. Of those there are a few that find nothing to do 99% of the time, but on odd occasions can run for more significant amounts of time. All the biggies are over night, so I should safely avoid then at any normal time during the day.

I was pondering if we should have a "Suspend all jobs" flag (i.e. a row in a Config table in our Admin database) that all non-abortable jobs could check, then I could just set that (or perhaps a "Don't start after date/time") and then be safe in the knowledge that jobs won't kick off when I want to cycle SQL Agent / SQL Service.

We do, already, have those sorts of flags for all our data integration jobs, as they tend to run for quite a long time pulling data form remote sources, and there are times when we don;t want them to be half way through anything when we reboot etc. We set them when we have strong wind storms that are likely to cause disruption (phone / power cables down) so that those processes are less likely to get stuck mid-flow, so I could lean on the back of that semaphore infrastructure I reckon.

If I manage to construct a simple query I'll post it here in case anyone is interested.
Go to Top of Page

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2014-01-02 : 21:44:17
When I need to temporarily disable all jobs I just disable SQL Agent. But seems this may be useful for you in some instances:
-- copy jobs into temporary table (not a #temp)
use msdb;
GO
SELECT job_id, enabled
INTO tmpJobs;
GO

-- disable all jobs --
UPDATE sysjobs
SET enabled = 0;
GO

-- Do whatever work you need to do --

-- enable jobs --
UPDATE j
SET enables = 1
FROM sysjobs j
JOIN tmpJobs t
On t.job_id = j.job_id
WHERE t.enabled = 1;
GO

-- clean up --
DROP TABLE tmpJobs;
GO
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2014-01-03 : 00:08:32
quote:
Originally posted by russell

When I need to temporarily disable all jobs I just disable SQL Agent. But seems this may be useful for you in some instances:
-- copy jobs into temporary table (not a #temp)
use msdb;
GO
SELECT job_id, enabled
INTO tmpJobs;
GO

-- disable all jobs --
UPDATE sysjobs
SET enabled = 0;
GO

-- Do whatever work you need to do --

-- enable jobs --
UPDATE j
SET enables = 1
FROM sysjobs j
JOIN tmpJobs t
On t.job_id = j.job_id
WHERE t.enabled = 1;
GO

-- clean up --
DROP TABLE tmpJobs;
GO




Doesn't that require system table updates be enabled?

I wrote this 10 years ago so not sure if it works on newer versions (probably does work), but here is what I used in our DR failover scripts: http://weblogs.sqlteam.com/tarad/archive/2003/10/17/325.aspx

Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2014-01-03 : 04:42:11
Thanks both. I wrote a little query last night which told me what was running, and the next scheduled date/time and I picked a time when there was nothing running / coming up for 30 minutes and hit Restart of SQL Agent .... an hour later it still hadn't stopped

Had to make an unscheduled restart of SQL Service. That didn't work, and error message just said "Can't" rather than anything useful ... so I had to reboot the whole server. Very annoying, and weakens my confidence in the product.

All I wanted to do was stop.start SQL Agent to see if the stupid error I was getting "An attempt was made to send an email when no email session has been established" would go away as I had not been getting any Email alerts on Backup Failed, and we had had disk full for a week over the Christmas holiday.

The restart fixed the Emails ... which is even more worrying. How long until they just decide to fail again, and not then have any means of telling me.

The "Save Job's Enabled State" thing will be handy, thanks. I think I will change the "DROP TABLE tmpJobs" to a rename of the table - to include DATE - so that it can be purged later, that way I will also have a note of what was enabled / disabled at that moment in time - which might save my bacon on something else later!

By the by, the "little query" I wrote showed me Next Scheduled Run date and after the job finished that didn't update (i.e. it was past-due) and then at some future point it updated to the actual next run. I haven't looked into it, but it seemed perculiar. Maybe I'm looking at the wrong column somewhere?

-- Jobs that are running NOW! ...
SELECT A.run_requested_Date,
[Elapsed (Sec)] = DATEDIFF(SECOND, A.run_requested_Date, GetDate()),
J.Name
-- , J.Originating_Server,
-- J.job_ID

FROM msdb.dbo.sysjobs_view AS J
JOIN msdb.dbo.sysjobactivity AS A
ON A.job_id = J.job_id
JOIN msdb.dbo.syssessions AS S
ON S.session_id = A.session_id
JOIN
(
SELECT [max_agent_start_date] = MAX(agent_start_date)
FROM msdb.dbo.syssessions
) AS SMax
ON SMax.max_agent_start_date = S.agent_start_date
WHERE run_Requested_date IS NOT NULL
AND stop_execution_date IS NULL
--
-- ... and jobs that are Starting soon!!
SELECT [JobEnabled] = JOB.enabled,
--TODO These show OVERDUE for a while after job completes. I don't know what changes them to next-actual-scheduled-time
[next_run_in (mins)] = DATEDIFF(Minute, GetDate(), msdb.dbo.agent_datetime(SCH.next_run_date, SCH.next_run_time)),
[next_run_date] = CONVERT(varchar(17), msdb.dbo.agent_datetime(SCH.next_run_date, SCH.next_run_time), 113),
[JobName] = JOB.name
-- , JOB.job_id,
-- [CategoryName] = CAT.name,
-- SRV.last_run_outcome,
-- SRV.last_outcome_message,
-- [last_run_date] = msdb.dbo.agent_datetime(SRV.last_run_date, SRV.last_run_time),
-- SRV.last_run_duration,
-- [NotifyOperatorName] = OP.name,
-- OP.email_address,
-- [ScheduleName] = SSCH.name,
-- [ScheduleEnabled] = SSCH.enabled,
-- SSCH.freq_type,
-- SSCH.freq_interval,
-- SSCH.freq_subday_interval,
-- SSCH.freq_subday_type,
-- SSCH.freq_relative_interval,
-- SSCH.freq_recurrence_factor,
-- SSCH.active_start_date,
-- SSCH.active_end_date,
-- SSCH.active_start_time,
-- SSCH.active_end_time

FROM msdb.dbo.sysjobs AS job
LEFT OUTER JOIN msdb.dbo.syscategories AS cat
ON CAT.category_id = JOB.category_id
LEFT OUTER JOIN msdb.dbo.sysoperators AS op
ON OP.id = JOB.notify_page_operator_id
LEFT OUTER JOIN msdb.dbo.sysjobservers AS srv
ON SRV.job_id = JOB.job_id
LEFT OUTER JOIN msdb.dbo.sysjobschedules AS sch
ON SCH.job_id = JOB.job_id
LEFT OUTER JOIN msdb.dbo.sysschedules AS ssch
ON SSCH.schedule_id = SCH.schedule_id
WHERE JOB.enabled = 1
AND msdb.dbo.agent_datetime(SCH.next_run_date, SCH.next_run_time) < DATEADD(Minute, 60, GetDate())
ORDER BY SCH.next_run_date, SCH.next_run_time
, JobName

Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2014-01-03 : 15:30:26
Kristen, is the server up-to-date as far as SQL service packs AND hotfixes? How about Windows? I would probably open a case with Microsoft regarding the service not being able to be stopped and having to do a reboot. Maybe there's something amiss that can be corrected.

Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2014-01-03 : 17:57:54
Tara, sysjobs isn't "protected" as a system table.
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2014-01-03 : 18:24:12
quote:
Originally posted by russell

Tara, sysjobs isn't "protected" as a system table.



Gotcha. Thank you.

Tara Kizer
SQL Server MVP since 2007
http://weblogs.sqlteam.com/tarad/
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2014-01-05 : 06:31:44
We, now, have some time scheduled to apply latest patches etc. I'll take stock once that is done as to whether we still have problems, or not. Hopefully "not" and that will be the fix.

In a separate thread I have reported that all jobs stopped working at midnight with a security error (regardless of whether the job owner was DOMAIN/UserName, my SQL login, or sysadm/sa).

But at least it emailed me, this time, that each job had failed ... including all the ones that run every minute or two!!!
Go to Top of Page
   

- Advertisement -