Job Management

New in version 0.9.7.

Since Salt executes jobs running on many systems, Salt needs to be able to manage jobs running on many systems.

The Minion proc System

Salt Minions maintain a proc directory in the Salt cachedir. The proc directory maintains files named after the executed job ID. These files contain the information about the current running jobs on the minion and allow for jobs to be looked up. This is located in the proc directory under the cachedir, with a default configuration it is under /var/cache/salt/proc.

Functions in the saltutil Module

Salt 0.9.7 introduced a few new functions to the saltutil module for managing jobs. These functions are:

  1. running Returns the data of all running jobs that are found in the proc directory.
  2. find_job Returns specific data about a certain job based on job id.
  3. signal_job Allows for a given jid to be sent a signal.
  4. term_job Sends a termination signal (SIGTERM, 15) to the process controlling the specified job.
  5. kill_job Sends a kill signal (SIGKILL, 9) to the process controlling the specified job.

These functions make up the core of the back end used to manage jobs at the minion level.

The jobs Runner

A convenience runner front end and reporting system has been added as well. The jobs runner contains functions to make viewing data easier and cleaner.

The jobs runner contains a number of functions...

active

The active function runs saltutil.running on all minions and formats the return data about all running jobs in a much more usable and compact format. The active function will also compare jobs that have returned and jobs that are still running, making it easier to see what systems have completed a job and what systems are still being waited on.

# salt-run jobs.active

lookup_jid

When jobs are executed the return data is sent back to the master and cached. By default it is cached for 24 hours, but this can be configured via the keep_jobs option in the master configuration. Using the lookup_jid runner will display the same return data that the initial job invocation with the salt command would display.

# salt-run jobs.lookup_jid <job id number>

list_jobs

Before finding a historic job, it may be required to find the job id. list_jobs will parse the cached execution data and display all of the job data for jobs that have already, or partially returned.

# salt-run jobs.list_jobs

Scheduling Jobs

In Salt versions greater than 0.12.0, the scheduling system allows incremental executions on minions or the master. The schedule system exposes the execution of any execution function on minions or any runner on the master.

Scheduling can be enabled by multiple methods:

  • schedule option in either the master or minion config files. These require the master or minion application to be restarted in order for the schedule to be implemented.
  • Minion pillar data. Schedule is implemented by refreshing the minion's pillar data, for example by using saltutil.refresh_pillar.
  • The schedule state or schedule module

Note

The scheduler executes different functions on the master and minions. When running on the master the functions reference runner functions, when running on the minion the functions specify execution functions.

A scheduled run has no output on the minion unless the config is set to info level or higher. Refer to minion logging settings.

Specify maxrunning to ensure that there are no more than N copies of a particular routine running. Use this for jobs that may be long-running and could step on each other or otherwise double execute. The default for maxrunning is 1.

States are executed on the minion, as all states are. You can pass positional arguments and provide a yaml dict of named arguments.

The below example will schedule the command state.apply httpd test=True every 3600 seconds (every hour):

schedule:
  job1:
    function: state.apply
    seconds: 3600
    args:
      - httpd
    kwargs:
      test: True

This next example will schedule the command state.apply httpd test=True every 3600 seconds (every hour) splaying the time between 0 and 15 seconds:

schedule:
  job1:
    function: state.apply
    seconds: 3600
    args:
      - httpd
    kwargs:
      test: True
    splay: 15

Finally, the next example will schedule the command state.apply httpd test=True every 3600 seconds (every hour) splaying the time between 10 and 15 seconds:

schedule:
  job1:
    function: state.apply
    seconds: 3600
    args:
      - httpd
    kwargs:
      test: True
    splay:
      start: 10
      end: 15

New in version 2014.7.0.

Frequency of jobs can also be specified using date strings supported by the python dateutil library. This requires python-dateutil to be installed on the minion.

For example, this will schedule the command state.apply httpd test=True at 5:00pm localtime on the minion.

schedule:
  job1:
    function: state.apply
    args:
      - httpd
    kwargs:
      test: True
    when: 5:00pm

To schedule the command state.apply httpd test=True at 5pm on Monday, Wednesday, and Friday, and 3pm on Tuesday and Thursday, use the following:

schedule:
  job1:
    function: state.apply
    args:
      - httpd
    kwargs:
      test: True
    when:
        - Monday 5:00pm
        - Tuesday 3:00pm
        - Wednesday 5:00pm
        - Thursday 3:00pm
        - Friday 5:00pm

Time ranges are also supported. For example, the below configuration will schedule the command state.apply httpd test=True every 3600 seconds (every hour) between the hours of 8am and 5pm. The range parameter must be a dictionary with the date strings using the dateutil format.

schedule:
  job1:
    function: state.apply
    seconds: 3600
    args:
      - httpd
    kwargs:
      test: True
    range:
        start: 8:00am
        end: 5:00pm

Note

Using time ranges requires python-dateutil to be installed on the minion.

New in version 2014.7.0.

The scheduler also supports ensuring that there are no more than N copies of a particular routine running. Use this for jobs that may be long-running and could step on each other or pile up in case of infrastructure outage.

The default for maxrunning is 1.

schedule:
  long_running_job:
      function: big_file_transfer
      jid_include: True

run_on_start

New in version 2015.5.0.

By default, any job scheduled based on the startup time of the minion will run the scheduled job when the minion starts up. Sometimes this is not the desired situation. Using the run_on_start parameter set to False will cause the scheduler to skip this first run and wait until the next scheduled run.

schedule:
  job1:
    function: state.sls
    seconds: 3600
    run_on_start: False
    args:
      - httpd
    kwargs:
      test: True

States

schedule:
  log-loadavg:
    function: cmd.run
    seconds: 3660
    args:
      - 'logger -t salt < /proc/loadavg'
    kwargs:
      stateful: False
      shell: /bin/sh

Highstates

To set up a highstate to run on a minion every 60 minutes set this in the minion config or pillar:

schedule:
  highstate:
    function: state.apply
    minutes: 60

Time intervals can be specified as seconds, minutes, hours, or days.

Runners

Runner executions can also be specified on the master within the master configuration file:

schedule:
  run_my_orch:
    function: state.orchestrate
    hours: 6
    splay: 600
    args:
      - orchestration.my_orch

The above configuration is analogous to running salt-run state.orch orchestration.my_orch every 6 hours.

Scheduler With Returner

The scheduler is also useful for tasks like gathering monitoring data about a minion, this schedule option will gather status data and send it to a MySQL returner database:

schedule:
  uptime:
    function: status.uptime
    seconds: 60
    returner: mysql
  meminfo:
    function: status.meminfo
    minutes: 5
    returner: mysql

Since specifying the returner repeatedly can be tiresome, the schedule_returner option is available to specify one or a list of global returners to be used by the minions when scheduling.