Task Manager for Tarantool Enterprise

The task manager module lets you automate several types of background jobs:

  • periodical: launched according to a cron-like schedule;
  • continuous: kept running at all times;
  • single_shot: launched manually by the operations team.

You get the following features out-of-the-box:

  • configurable schedule of periodic tasks;
  • guarding and restarting continuous tasks;
  • stored log of task launches;
  • API and UI for launching tasks and observing launch results.

Task manager comes with several built-in cluster roles:

  • task.roles.scheduler, a module which allows you to configure launchable tasks;
  • task.roles.runner, a cluster-aware stateless task runner;
  • task.roles.storage, a cluster-aware dedicated task contents storage;
  • plugin-based API which allows you to provide your own storage module (e.g. distributed, or external to the Tarantool cluster), or your own runner module, providing more tooling for your needs.

Basic usage (single-node application)

  1. Embed the following in the instance file:
...
local task = require('task')
...
cartridge.cfg({
   roles = {'my_role', ...}
})
task.init_webui()
  2. Add dependencies on the task scheduler, runner, and storage roles to your role:
return {
    ...
    dependencies = {
        'task.roles.storage',
        'task.roles.scheduler',
        'task.roles.runner'
    }
}
  3. Add a tasks section to your cluster configuration:
tasks:
  my_task:
    kind: periodical
    func_name: my_module.my_task
    schedule: "0 * * * * *"
  4. That’s it! The my_task function will be launched every minute.
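
The configuration above refers to my_module.my_task by name. A minimal sketch of such a module follows. It assumes the task manager finds the function by requiring my_module and calling its my_task field; that lookup mechanism, the launches counter, and the return value are illustrative assumptions, not something this document confirms.

```lua
-- my_module.lua: hypothetical module holding the task function.
local M = {}

-- Counts how many times the task has run (illustration only).
M.launches = 0

-- The function named by `func_name: my_module.my_task`.
-- Any configured args would arrive as ordinary function arguments.
function M.my_task(...)
    M.launches = M.launches + 1
    return true
end

-- Make the module visible to require('my_module') in this self-contained
-- sketch; a real module file would simply end with `return M`.
package.loaded['my_module'] = M
```

Keep the function short and safe to run repeatedly, since a periodical task is invoked again on every schedule tick.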

Advanced usage (multi-node installation)

  1. Embed the following in the instance file:
...
local task = require('task')
cartridge.cfg({
   roles = {
    ...
    'task.roles.scheduler',
    'task.roles.storage',
    'task.roles.runner'
   }
})

task.init_webui()
  2. Enable the task scheduler role on a dedicated node in your cluster (after deployment). In a big cluster, don’t set up more than one replica set with the scheduler.
  3. Enable the task storage role on a dedicated node in your cluster (after deployment), possibly on the same node as the task scheduler. In a big cluster, don’t set up more than one replica set with the storage.
  4. Enable the task runner role on dedicated stateless nodes in your cluster (after deployment), as many as you need.

Advanced usage (sharded storage)

  1. Embed the following in the instance file:
...
local task = require('task')
cartridge.cfg({
   roles = {
    ...
    'task.roles.sharded.scheduler',
    'task.roles.sharded.storage',
    'task.roles.sharded.runner'
   }
})

task.init_webui()
  2. Enable the task scheduler role on a dedicated node in your cluster (after deployment). In a big cluster, don’t set up more than one replica set with the scheduler.
  3. Enable the task storage role on the nodes of some vshard group (or on all storage nodes). Enable the built-in Cartridge vshard-storage role on these nodes as well.
  4. Enable the task runner role on dedicated stateless nodes in your cluster (after deployment), as many as you need.

Tasks configuration

Tasks are configured via the scheduler cluster role. An example of valid role configuration:

tasks:
    my_reload:
      kind: periodical
      func_name: my_module.cache_reload
      schedule: "0 * * * * *"
      time_to_resolve: 180
    my_flush:
      kind: single_shot
      func_name: my_module.cache_flush
      args:
        - some_string1
        - some_string2
    push_metrics:
      kind: continuous
      func_name: my_module.push_metrics
      pause_sec: 30
  • Every task must have a unique name (the subsection name in the config).
  • Each task must have a kind: periodical, continuous, or single_shot.
  • Each task must have a func_name: the name of the function (preceded by the name of its module) which will be invoked.
  • Each task may have time_to_resolve: a timeout after which a running task is considered lost (failed).
  • Each task may have args: an array of arguments which will be passed to the function (to allow basic parametrization).
  • Periodical tasks must also have a schedule conforming to the ccronexpr format (basically, cron with a seconds field).
  • Continuous tasks may have pause_sec: the pause between launches (60 seconds by default).
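
To illustrate how func_name and args fit together, here is a minimal sketch of a dispatcher that splits func_name at the last dot, requires the module, and calls the function with the configured args. This is not the task manager's actual implementation; resolve_task, run_task, and the demo module body are hypothetical and exist only for illustration.

```lua
-- Hypothetical demo module standing in for my_module from the config above.
package.loaded['my_module'] = {
    cache_flush = function(a, b)
        return a .. ',' .. b
    end,
}

-- Split "module.path.func" at the last dot and look the function up.
local function resolve_task(func_name)
    local module_name, field = func_name:match('^(.+)%.([^.]+)$')
    if module_name == nil then
        error('func_name must look like "module.function": ' .. func_name)
    end
    local fn = require(module_name)[field]
    if type(fn) ~= 'function' then
        error('no such function: ' .. func_name)
    end
    return fn
end

-- Call the resolved function with the args array from the task config.
local function run_task(task_cfg)
    local fn = resolve_task(task_cfg.func_name)
    local unpack = table.unpack or unpack -- Lua 5.2+ vs Lua 5.1/LuaJIT
    return fn(unpack(task_cfg.args or {}))
end

local result = run_task({
    func_name = 'my_module.cache_flush',
    args = {'some_string1', 'some_string2'},
})
print(result) -- prints some_string1,some_string2
```

Under this assumption, the my_flush task from the example configuration is equivalent to calling my_module.cache_flush('some_string1', 'some_string2').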

You may set up a default task config for your application in the task.init() call:

task.init({
    default_config = {
        my_task = {
            kind = 'single_shot',
            func_name = 'dummy_task.dummy',
        }
    }
})

The default config is applied if no tasks are set in the clusterwide config. task.init() should be called prior to cartridge.cfg().

Advanced usage

Running a task via API

Everything visible in the UI is also available via the API. You can look up the requests in the UI or in the Cartridge GraphQL schema.

curl -w "\n" -X POST http://127.0.0.1:8080/admin/api --fail -d@- <<'QUERY'
{"query": "mutation { task { start(name: \"my_reload\") } }"}
QUERY
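
The request body is a JSON document, so the double quotes inside the GraphQL query have to be escaped. A minimal sketch of building that body in plain Lua follows; start_task_payload is a hypothetical helper, it assumes the task name itself contains no quotes or backslashes, and in a real application a JSON encoder would be a safer choice than manual escaping.

```lua
-- Build the JSON body for the task start mutation shown above.
-- start_task_payload is a hypothetical helper, not part of the task module.
-- Assumption: task_name contains no double quotes or backslashes.
local function start_task_payload(task_name)
    local query = string.format(
        'mutation { task { start(name: "%s") } }', task_name)
    -- Escape backslashes first, then double quotes, so the query can be
    -- embedded in a JSON string literal.
    local escaped = query:gsub('\\', '\\\\'):gsub('"', '\\"')
    return '{"query": "' .. escaped .. '"}'
end

print(start_task_payload('my_reload'))
```

The printed line can be sent as the POST body to the /admin/api endpoint used in the curl example.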

Supplying custom runner and storage

Embed the following in your instance file:

task.init({
    runner = 'my_tasks.my_runner',
    storage = 'my_tasks.my_storage'
})
...
cartridge.cfg{...}
task.init_webui()

Be sure to call task.init() prior to cartridge.cfg, so that the custom options are in place by the time role initialization starts.

You may then enable only the task scheduler role and handle storage and runners yourself.

Writing your own runner and storage

The runner module must expose an api member with the following functions:

  • stop_task

The storage module must expose an api member with the following functions:

  • select
  • get
  • delete
  • put
  • complete
  • take
  • cancel
  • wait
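
As a starting point, the required module shape can be sketched as follows. Only the function names come from the lists above; the argument lists, return values, and the make_stub_api and missing_api_functions helpers are assumptions, so every function is a stub that must be replaced with a real implementation.

```lua
-- Function names taken from the lists above.
local RUNNER_API = {'stop_task'}
local STORAGE_API = {
    'select', 'get', 'delete', 'put', 'complete', 'take', 'cancel', 'wait',
}

-- Build an `api` table whose functions are placeholders.
-- Real signatures are not specified here, so each stub just raises an error.
local function make_stub_api(names)
    local api = {}
    for _, name in ipairs(names) do
        api[name] = function()
            error(name .. ' is not implemented')
        end
    end
    return api
end

-- Hypothetical custom modules, e.g. my_tasks.my_runner and my_tasks.my_storage.
local my_runner = {api = make_stub_api(RUNNER_API)}
local my_storage = {api = make_stub_api(STORAGE_API)}

-- Hypothetical helper: list the required functions a module fails to expose.
local function missing_api_functions(module, names)
    local missing = {}
    for _, name in ipairs(names) do
        if type(module.api) ~= 'table'
                or type(module.api[name]) ~= 'function' then
            table.insert(missing, name)
        end
    end
    return missing
end

assert(#missing_api_functions(my_runner, RUNNER_API) == 0)
assert(#missing_api_functions(my_storage, STORAGE_API) == 0)
```

A check like missing_api_functions can run at startup to catch a forgotten function before the module is passed to task.init().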

For more details, refer to the built-in runner and storage documentation.