Gathering workflow data
This section describes how to gather data about workflow jobs with the gather_job_data.py script.
Writing options
Gathered data can be written to a CSV or JSON file, selected with the --format [csv|json] option.
There is no default format: without --format, no output file is produced.
$ ./multivac/gather_job_data.py --format csv
The resulting output will look like this:
job_id,job_name,branch,commit_sha,status,queued_at,started_at,completed_at,platform,runner_label,runner_name,runner_version,failure_type
7952018403,fedora_34 (gc64),master,b7cb1421c322d93dc2893ad9e827a5b4d00e265f,success,2022-08-22T12:48:45Z,2022-08-22T12:48:51Z,2022-08-22T12:58:47Z,amd64,['ubuntu-20.04-self-hosted'],ghacts-tarantool-8-16-n5,2.295.0,
7952018262,fedora_34,master,b7cb1421c322d93dc2893ad9e827a5b4d00e265f,success,2022-08-22T12:26:26Z,2022-08-22T12:26:38Z,2022-08-22T12:35:47Z,amd64,['ubuntu-20.04-self-hosted'],,2.295.0,
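For post-processing, the CSV can be loaded with Python's standard csv module. A minimal sketch, assuming the data has been saved to a file named jobs.csv (an assumed name, not a path produced by the script):

import csv
from collections import Counter

# A minimal sketch: count jobs per status in the gathered CSV.
# "jobs.csv" is an assumed file name; adjust to your actual output location.
with open("jobs.csv", newline="") as f:
    rows = list(csv.DictReader(f))

print(Counter(row["status"] for row in rows))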
The same data in JSON format:
$ ./multivac/gather_job_data.py --format json
{
    "7918651226": {
        "job_id": 7918651226,
        "job_name": "centos_8 (gc64)",
        "status": "success",
        "queued_at": "2022-08-19T13:01:30Z",
        "started_at": "2022-08-19T13:01:41Z",
        "completed_at": "2022-08-19T13:08:35Z",
        "runner_label": [
            "ubuntu-20.04-self-hosted"
        ],
        "platform": "amd64",
        "commit_hash": "02fae15a3adb8ea450ebbe3c250a4846cf1cca69",
        "branch": "master",
        "runner_name": "ghacts-shared-8-16-n10",
        "runner_version": "2.295.0"
    },
    "7918651223": {
        "job_id": 7918651223,
        "job_name": "opensuse_15_2 (gc64)",
        "status": "failure",
        "queued_at": "2022-08-19T13:01:30Z",
        "started_at": "2022-08-19T13:01:44Z",
        "completed_at": "2022-08-19T13:08:30Z",
        "runner_label": [
            "ubuntu-20.04-self-hosted"
        ],
        "platform": "amd64",
        "commit_hash": "02fae15a3adb8ea450ebbe3c250a4846cf1cca69",
        "branch": "master",
        "runner_name": "ghacts-shared-8-16-n3",
        "runner_version": "2.295.0",
        "failure_type": "testrun_test_failed"
    }
}
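The JSON output maps job IDs to job attributes, so it is easy to post-process. A minimal sketch that prints each job's run time, assuming the data has been saved to jobs.json (an assumed name):

import json
from datetime import datetime

# A minimal sketch: print each job's run time from the gathered JSON.
# "jobs.json" is an assumed file name; adjust to your actual output location.
with open("jobs.json") as f:
    jobs = json.load(f)  # dict: job ID (str) -> job attributes

for job_id, job in jobs.items():
    started = datetime.fromisoformat(job["started_at"].replace("Z", "+00:00"))
    completed = datetime.fromisoformat(job["completed_at"].replace("Z", "+00:00"))
    print(job_id, job["job_name"], job["status"], completed - started)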
You can also store the data in InfluxDB (see the InfluxDB connector):
$ multivac/gather_job_data.py --format influxdb
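Purely as an illustration of what storing one job record in InfluxDB 2.x can look like, here is a sketch using the official influxdb-client package; this is not Multivac's own connector, and the URL, token, org and bucket values are placeholders:

from datetime import datetime, timezone
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

# Illustration only, not Multivac's connector: write one job record to InfluxDB.
# URL, token, org and bucket are placeholders.
client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")
write_api = client.write_api(write_options=SYNCHRONOUS)

point = (
    Point("workflow_job")
    .tag("job_name", "fedora_34 (gc64)")
    .tag("branch", "master")
    .field("status", "success")
    .time(datetime(2022, 8, 22, 12, 58, 47, tzinfo=timezone.utc))
)
write_api.write(bucket="multivac", record=point)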
Limiting and filtering workflows
Workflows which were skipped or cancelled won’t be processed.
To gather data from a number of the most recent workflows, use --latest:
$ ./multivac/gather_job_data.py --latest 1000
To gather data for the last N days or N hours, use --since:
$ # see data for the last week (7 days)
$ ./multivac/gather_job_data.py --since 7d
$ # see data for the last 12 hours
$ ./multivac/gather_job_data.py --since 12h
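As a sketch of how such values can be interpreted (not the script's actual implementation), 7d and 12h translate into a cutoff timestamp like this:

from datetime import datetime, timedelta, timezone

# A sketch of turning a "--since"-style value such as "7d" or "12h"
# into a cutoff timestamp; not gather_job_data.py's actual code.
def since_to_cutoff(value: str) -> datetime:
    units = {"d": "days", "h": "hours"}
    amount, unit = int(value[:-1]), value[-1]
    return datetime.now(timezone.utc) - timedelta(**{units[unit]: amount})

print(since_to_cutoff("7d"))   # jobs newer than this timestamp would be kept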
Detecting workflow failure reasons
Multivac can detect types of workflow failures and calculate detailed statistics. A detailed description of the known failure reasons can be found in Types of detected workflow failures.
$ ./multivac/gather_job_data.py --latest 1000 --failure-stats
total 20
package_building_error 5
unknown 4
testrun_test_failed 3
telegram_bot_error 2
integration_vshard_test_failed 1
luajit_error 1
testrun_test_hung 1
git_repo_access_error 1
dependency_autoreconf 1
tap_test_failed 1
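A similar summary can be rebuilt from the JSON output. A sketch, assuming a jobs.json file produced with --format json as in the earlier example:

import json
from collections import Counter

# A sketch: rebuild a failure summary from the gathered JSON output.
# Assumes a "jobs.json" file produced with --format json (assumed file name).
with open("jobs.json") as f:
    jobs = json.load(f)

stats = Counter(
    job.get("failure_type", "unknown")
    for job in jobs.values()
    if job["status"] == "failure"
)
print("total", sum(stats.values()))
for failure, count in stats.most_common():
    print(failure, count)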
The --watch-failure option takes a failure name and returns a list of jobs where that failure has been detected,
along with links to the workflow runs on GitHub and the matching log lines:
$ ./multivac/gather_job_data.py --latest 1000 --watch-failure testrun_test_failed
7008229080 memtx_allocator_based_on_malloc https://github.com/tarantool/tarantool/runs/7008229080?check_suite_focus=true
2022-06-22T16:27:25.7389940Z * fail: 1
6936376158 osx_12 https://github.com/tarantool/tarantool/runs/6936376158?check_suite_focus=true
2022-06-17T13:11:18.6461930Z * fail: 1
6933185565 fedora_34 (gc64) https://github.com/tarantool/tarantool/runs/6933185565?check_suite_focus=true
2022-06-17T09:24:50.6543965Z * fail: 1
This is useful when working on failure reasons that are not yet detected (reported as unknown):
$ ./multivac/gather_job_data.py --latest 1000 --watch-failure unknown
6966228368 freebsd-13 https://github.com/tarantool/tarantool/runs/6966228368?check_suite_focus=true
None
6947333557 freebsd-12 https://github.com/tarantool/tarantool/runs/6947333557?check_suite_focus=true
None
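The same kind of filtering can also be done on the JSON output yourself. A sketch that lists jobs with a given failure type (jobs.json is an assumed file name, and the run URL pattern mirrors the --watch-failure links above):

import json

# A sketch: list jobs with a given failure type from the gathered JSON output.
# "jobs.json" is an assumed file name; the URL pattern follows the
# --watch-failure links shown above.
wanted = "testrun_test_failed"

with open("jobs.json") as f:
    jobs = json.load(f)

for job_id, job in jobs.items():
    if job.get("failure_type") == wanted:
        url = f"https://github.com/tarantool/tarantool/runs/{job_id}?check_suite_focus=true"
        print(job_id, job["job_name"], url)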