When Overlapping CRON Jobs Attack…

We recently had an issue, which I suspect was caused by overlapping CRON jobs. By that I mean a CRON job had not completed its run by the time it was scheduled to run again.

CRON

If you’ve used UNIX/Linux you’ve probably scheduled a task using CRON. We’ve got loads of CRON jobs on some of our systems. The problem with CRON is it doesn’t care about overlapping jobs. If you schedule something to run every 10 minutes, but the task takes 30 minutes to complete, you will get overlapping runs. In some situations this can degrade performance to the point where each run gets progressively longer, meaning there are more and more overlaps. Eventually things can go bang!

Fortunately there is a really easy solution to this. Just use “flock”.

Let’s say we have a job that runs every 10 minutes.

*/10 * * * * /u01/scripts/my_job.sh > /dev/null 2>&1

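We can use flock to protect it by providing a lock file. The job can only run if it can lock the file. The “-n” flag tells flock to give up straight away, rather than wait, if the lock is already held, so the overlapping run is simply skipped.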

*/10 * * * * /usr/bin/flock -n /tmp/my_job.lockfile /u01/scripts/my_job.sh > /dev/null 2>&1

In one simple move we have prevented overlapping jobs.
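If you want to see the behaviour for yourself before touching the crontab, something like the following rough sketch should do it. The lock file name and the sleep times are made up for the demo, and it assumes the util-linux "flock" command is on your PATH.

```shell
#!/bin/sh
# Demo only: LOCK is a hypothetical path, not one of the real job lock files.
LOCK=/tmp/flock_demo.lockfile

# First "run": grab the lock and hold it for 2 seconds in the background.
flock -n "$LOCK" sleep 2 &
sleep 1   # give the first run time to acquire the lock

# Second "run": overlaps with the first, so flock -n fails immediately
# and the command is never started.
if flock -n "$LOCK" echo "second run started"; then
  RESULT="ran"
else
  RESULT="skipped"
fi
echo "overlapping run: $RESULT"

# Wait for the first run to finish, which releases the lock.
wait

# Third "run": the lock is free again, so the command executes.
flock -n "$LOCK" echo "third run started"
```

The second attempt is skipped because the first one still holds the lock. Once the first one finishes, the next attempt runs normally, which is exactly what we want from the CRON job.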

Remember, each job will need a separate lock file. In the following example we have three separate scripts, so we need three separate lock files.

*/10 * * * * /usr/bin/flock -n /tmp/my_job1.lockfile /u01/scripts/my_job1.sh > /dev/null 2>&1
*/10 * * * * /usr/bin/flock -n /tmp/my_job2.lockfile /u01/scripts/my_job2.sh > /dev/null 2>&1
*/10 * * * * /usr/bin/flock -n /tmp/my_job3.lockfile /u01/scripts/my_job3.sh > /dev/null 2>&1
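If you would rather keep the crontab entries short, another option is to take the lock inside the script itself. What follows is a minimal sketch of that pattern, not the way our scripts are actually written. The file descriptor number (9) is an arbitrary choice, and the "echo" is just a placeholder for the real work.

```shell
#!/bin/sh
# Sketch: the script takes its own lock instead of relying on the crontab line.
LOCKFILE=/tmp/my_job.lockfile

# Open the lock file on file descriptor 9 (the number is an arbitrary choice).
exec 9>"$LOCKFILE"

# Try to lock it without waiting. If a previous run still holds the lock,
# report it and exit without doing any work.
if ! flock -n 9; then
  echo "previous run still in progress, skipping" >&2
  exit 0
fi

# Placeholder for the real work of the job.
echo "job running"

# The lock is released automatically when the script exits and fd 9 is closed.
```

With this approach the script is protected however it gets started, including when someone runs it by hand while the scheduled run is still going. Each script still needs its own lock file.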

Oracle Scheduler (DBMS_SCHEDULER)

The Oracle Scheduler (DBMS_SCHEDULER) doesn’t suffer from overlapping jobs. The previous run must be complete before the next run can happen. If we have a really slow bit of code that takes 30 minutes to run, it is safe to schedule it to run every 10 minutes, even though it may seem a little stupid.

begin
  dbms_scheduler.create_job (
    job_name        => 'slow_job',
    job_type        => 'plsql_block',
    job_action      => 'begin my_30_min_procedure; end;',
    start_date      => systimestamp,
    repeat_interval => 'freq=minutely; interval=10; bysecond=0;',
    enabled         => true);
end;
/

The Oracle Scheduler also has a bunch of other features that CRON doesn’t have. See here.

Conclusion

I’m not a massive fan of CRON. For many database tasks I think the Oracle Scheduler is far superior. If you are going to use CRON, please use it safely. 🙂

Cheers

Tim…