Gathering workflow data
This section describes how to gather data about workflow jobs with the gather_job_data.py script.
Writing options
Gathered data can be written to a CSV or JSON file, selected with the --format [csv|json] option.
There is no default format: without --format, no output file is produced.
$ ./multivac/gather_job_data.py --format csv
The resulting output will look like this:
job_id,job_name,branch,commit_sha,status,queued_at,started_at,completed_at,platform,runner_label,runner_name,runner_version,failure_type
7952018403,fedora_34 (gc64),master,b7cb1421c322d93dc2893ad9e827a5b4d00e265f,success,2022-08-22T12:48:45Z,2022-08-22T12:48:51Z,2022-08-22T12:58:47Z,amd64,['ubuntu-20.04-self-hosted'],ghacts-tarantool-8-16-n5,2.295.0,
7952018262,fedora_34,master,b7cb1421c322d93dc2893ad9e827a5b4d00e265f,success,2022-08-22T12:26:26Z,2022-08-22T12:26:38Z,2022-08-22T12:35:47Z,amd64,['ubuntu-20.04-self-hosted'],,2.295.0,
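For post-processing, the CSV can be loaded with Python's standard csv module. A minimal sketch, assuming the data has been saved to a file named jobs.csv (an assumed name, not a path produced by the script):

import csv
from collections import Counter

# A minimal sketch: count jobs per status in the gathered CSV.
# "jobs.csv" is an assumed file name; adjust to your actual output location.
with open("jobs.csv", newline="") as f:
    rows = list(csv.DictReader(f))

print(Counter(row["status"] for row in rows))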
The same data in JSON format:
$ ./multivac/gather_job_data.py --format json
{
    "7918651226": {
        "job_id": 7918651226,
        "job_name": "centos_8 (gc64)",
        "status": "success",
        "queued_at": "2022-08-19T13:01:30Z",
        "started_at": "2022-08-19T13:01:41Z",
        "completed_at": "2022-08-19T13:08:35Z",
        "runner_label": [
            "ubuntu-20.04-self-hosted"
        ],
        "platform": "amd64",
        "commit_hash": "02fae15a3adb8ea450ebbe3c250a4846cf1cca69",
        "branch": "master",
        "runner_name": "ghacts-shared-8-16-n10",
        "runner_version": "2.295.0"
    },
    "7918651223": {
        "job_id": 7918651223,
        "job_name": "opensuse_15_2 (gc64)",
        "status": "failure",
        "queued_at": "2022-08-19T13:01:30Z",
        "started_at": "2022-08-19T13:01:44Z",
        "completed_at": "2022-08-19T13:08:30Z",
        "runner_label": [
            "ubuntu-20.04-self-hosted"
        ],
        "platform": "amd64",
        "commit_hash": "02fae15a3adb8ea450ebbe3c250a4846cf1cca69",
        "branch": "master",
        "runner_name": "ghacts-shared-8-16-n3",
        "runner_version": "2.295.0",
        "failure_type": "testrun_test_failed"
    }
}
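The JSON output maps job IDs to job attributes, so it is easy to post-process. A minimal sketch that prints each job's run time, assuming the data has been saved to jobs.json (an assumed name):

import json
from datetime import datetime

# A minimal sketch: print each job's run time from the gathered JSON.
# "jobs.json" is an assumed file name; adjust to your actual output location.
with open("jobs.json") as f:
    jobs = json.load(f)  # dict: job ID (str) -> job attributes

for job_id, job in jobs.items():
    started = datetime.fromisoformat(job["started_at"].replace("Z", "+00:00"))
    completed = datetime.fromisoformat(job["completed_at"].replace("Z", "+00:00"))
    print(job_id, job["job_name"], job["status"], completed - started)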
You can also store the data in InfluxDB (see the InfluxDB connector):
$ multivac/gather_job_data.py --format influxdb
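Purely as an illustration of what storing one job record in InfluxDB 2.x can look like, here is a sketch using the official influxdb-client package; this is not Multivac's own connector, and the URL, token, org and bucket values are placeholders:

from datetime import datetime, timezone
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

# Illustration only, not Multivac's connector: write one job record to InfluxDB.
# URL, token, org and bucket are placeholders.
client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")
write_api = client.write_api(write_options=SYNCHRONOUS)

point = (
    Point("workflow_job")
    .tag("job_name", "fedora_34 (gc64)")
    .tag("branch", "master")
    .field("status", "success")
    .time(datetime(2022, 8, 22, 12, 58, 47, tzinfo=timezone.utc))
)
write_api.write(bucket="multivac", record=point)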
Limiting and filtering workflows
Workflows which were skipped or cancelled won’t be processed.
To gather data from a number of the most recent workflows, use --latest:
$ ./multivac/gather_job_data.py --latest 1000
To gather data for the last N days or N hours, use --since:
$ # see data for the last week (7 days)
$ ./multivac/gather_job_data.py --since 7d
$ # see data for the last 12 hours
$ ./multivac/gather_job_data.py --since 12h
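As a sketch of how such values can be interpreted (not the script's actual implementation), 7d and 12h translate into a cutoff timestamp like this:

from datetime import datetime, timedelta, timezone

# A sketch of turning a "--since"-style value such as "7d" or "12h"
# into a cutoff timestamp; not gather_job_data.py's actual code.
def since_to_cutoff(value: str) -> datetime:
    units = {"d": "days", "h": "hours"}
    amount, unit = int(value[:-1]), value[-1]
    return datetime.now(timezone.utc) - timedelta(**{units[unit]: amount})

print(since_to_cutoff("7d"))   # jobs newer than this timestamp would be kept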
Detecting workflow failure reasons
Multivac can detect types of workflow failures and calculate detailed statistics. A detailed description of the known failure reasons can be found in Types of detected workflow failures.
$ ./multivac/gather_job_data.py --latest 1000 --failure-stats
total 20
package_building_error 5
unknown 4
testrun_test_failed 3
telegram_bot_error 2
integration_vshard_test_failed 1
luajit_error 1
testrun_test_hung 1
git_repo_access_error 1
dependency_autoreconf 1
tap_test_failed 1
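A similar summary can be rebuilt from the JSON output. A sketch, assuming a jobs.json file produced with --format json as in the earlier example:

import json
from collections import Counter

# A sketch: rebuild a failure summary from the gathered JSON output.
# Assumes a "jobs.json" file produced with --format json (assumed file name).
with open("jobs.json") as f:
    jobs = json.load(f)

stats = Counter(
    job.get("failure_type", "unknown")
    for job in jobs.values()
    if job["status"] == "failure"
)
print("total", sum(stats.values()))
for failure, count in stats.most_common():
    print(failure, count)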
The --watch-failure option takes a failure name and returns a list of jobs where that failure has been detected,
along with links to the workflow runs on GitHub and the matching log lines:
$ ./multivac/gather_job_data.py --latest 1000 --watch-failure testrun_test_failed
7008229080 memtx_allocator_based_on_malloc https://github.com/tarantool/tarantool/runs/7008229080?check_suite_focus=true
2022-06-22T16:27:25.7389940Z * fail: 1
6936376158 osx_12 https://github.com/tarantool/tarantool/runs/6936376158?check_suite_focus=true
2022-06-17T13:11:18.6461930Z * fail: 1
6933185565 fedora_34 (gc64) https://github.com/tarantool/tarantool/runs/6933185565?check_suite_focus=true
2022-06-17T09:24:50.6543965Z * fail: 1
This is useful when working on failure reasons that are not yet detected (reported as unknown):
$ ./multivac/gather_job_data.py --latest 1000 --watch-failure unknown
6966228368 freebsd-13 https://github.com/tarantool/tarantool/runs/6966228368?check_suite_focus=true
None
6947333557 freebsd-12 https://github.com/tarantool/tarantool/runs/6947333557?check_suite_focus=true
None
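The same kind of filtering can also be done on the JSON output yourself. A sketch that lists jobs with a given failure type (jobs.json is an assumed file name, and the run URL pattern mirrors the --watch-failure links above):

import json

# A sketch: list jobs with a given failure type from the gathered JSON output.
# "jobs.json" is an assumed file name; the URL pattern follows the
# --watch-failure links shown above.
wanted = "testrun_test_failed"

with open("jobs.json") as f:
    jobs = json.load(f)

for job_id, job in jobs.items():
    if job.get("failure_type") == wanted:
        url = f"https://github.com/tarantool/tarantool/runs/{job_id}?check_suite_focus=true"
        print(job_id, job["job_name"], url)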