Yesterday morning I received notice that a SQL Server job for an application I support that normally ran fairly regularly was not working. This job was not kicked off by a manual schedule, but was called from elsewhere.
I was able to run the job manually with no problem at all. Nevertheless, the job did not start running “automatically” as expected. After some digging, I found out that the job was kicked off in a database trigger on INSERT.
The section of code that kicked off the job looked like this:
SET @JOB_NAME = N'My job name';
IF NOT EXISTS(
select 1
from msdb.dbo.sysjobs_view job
inner join msdb.dbo.sysjobactivity activity on job.job_id = activity.job_id
where
activity.run_Requested_date is not null
and activity.stop_execution_date is null
and job.name = @JOB_NAME
)
BEGIN
PRINT 'Starting job ''' + @JOB_NAME + '''';
EXEC msdb.dbo.sp_start_job @JOB_NAME;
END
ELSE
BEGIN
PRINT 'Job ''' + @JOB_NAME + ''' is already started ';
END
So this code should only return a row (and prevent job execution if the job is already running. The job was not running, and yet a row came back saying that the job had started two days before and never finished.
It just so happened that the SQL Server cluster had to be rebooted at just that time. The job appeared to be running, but was not.
The first solution: Restart the SQL Server Agent. This had no effect on the sysjobactivity table.
The second solution: Restart the entire SQL Server instance. This was not done, due to critical operations at work.
What else could be done?
Manually updating the table to reflect that the job was no longer running!
DECLARE @job_name VARCHAR(2000) = 'Your job name';
DECLARE @job_id uniqueidentifier =
(SELECT job_id
FROM msdb.dbo.sysjobs
WHERE NAME LIKE @job_name);
UPDATE msdb.dbo.sysjobactivity
SET stop_execution_date = GETDATE()
WHERE job_id = @job_id
AND start_execution_date =
(SELECT MAX(start_execution_date)
FROM msdb.dbo.sysjobactivity
WHERE job_id = @job_id
AND run_Requested_date IS NOT NULL
AND stop_execution_date IS NULL);
EXEC sp_stop_job @job_id = @job_id;
Running this block of code enters a time (now) for job execution end, and also manually sends a job stop command in case the job had started in the meantime.
After this, the trigger worked as expected!