Pipelines dashboard: custom pipeline metrics

The pipelines dashboard currently displays 3 metrics per pipeline:

  • Speed,
  • Reliability,
  • Builds per week

I would love to have a way to choose which metrics are displayed per pipeline, and to be able to display custom metrics such as code coverage for unit tests.

Hi @pbarrau! Thanks for this, it’s an interesting idea. I’m really curious as to what information you’d personally prioritise here, in addition to the aforementioned code coverage metric.

Hi Eleanor,

Test coverage is what I’d need the most. It would let me see at a glance which projects don’t have enough tests, and publicly shame the owning team :smiley:

Now, we could imagine many other metrics worth monitoring: the time to perform a specific step in the pipeline, the size of the artifacts, …

Here is a proposal on how this could work:

I believe we could leverage the meta-data functionality to define custom build metrics. Something like this:

# pipeline.yaml

steps:
  - label: Run tests
    command: run_tests.sh

# Proposed section: each entry maps a build meta-data key to a dashboard metric
statistics:
  - label: Code coverage
    unit: percent
    meta-data: code_coverage

# run_tests.sh

# <perform tests + collect coverage metric into $COVERAGE>

buildkite-agent meta-data set "code_coverage" "$COVERAGE"
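
As a rough illustration of how the "collect coverage metric" placeholder might be filled in, here is a minimal Python sketch. It assumes the test run produces a coverage.py-style text report whose last row starts with TOTAL and ends with a percentage. The helper names `parse_total_coverage` and `set_build_metric` are hypothetical. Only the `buildkite-agent meta-data set` call comes from the proposal itself.

```python
import re
import shutil
import subprocess


def parse_total_coverage(report_text: str) -> float:
    """Return the percentage on the TOTAL row of a coverage.py-style text report."""
    for line in report_text.splitlines():
        # Assumed row shape: "TOTAL   <stmts>   <miss>   <cover>%"
        match = re.match(r"^TOTAL\s.*?(\d+(?:\.\d+)?)%\s*$", line)
        if match:
            return float(match.group(1))
    raise ValueError("no TOTAL row found in coverage report")


def set_build_metric(key: str, value: float) -> bool:
    """Store the value as build meta-data; return False when no agent is on PATH."""
    if shutil.which("buildkite-agent") is None:
        return False  # e.g. running locally, outside a Buildkite job
    subprocess.run(
        ["buildkite-agent", "meta-data", "set", key, f"{value:g}"],
        check=True,
    )
    return True


if __name__ == "__main__":
    # Hypothetical report text; a real script would read the test runner's output.
    sample_report = (
        "Name      Stmts   Miss  Cover\n"
        "-----------------------------\n"
        "app.py      100     10    90%\n"
        "TOTAL       120     15  87.5%\n"
    )
    coverage = parse_total_coverage(sample_report)
    set_build_metric("code_coverage", coverage)
```

Keeping the parsing separate from the agent call means the same script can be run locally, where `buildkite-agent` is usually not installed, without failing.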

Agent wait time would be huge! Average job time would also be pretty useful.

Job retries (basically https://buildkite.com/organizations/org/reports/job-retries) could also be really convenient there.

I could even see the number of skipped builds being useful.

Also, org-wide metrics would be really helpful. The first that come to mind are pending/running builds and the number of pending steps. This might be more of a separate feature request, but an org-wide equivalent of the buildkite.com/applied-intuition/org/builds?state=running page would be great (adding it here in case it makes sense in a sidebar :) ).

I would like to see pipeline-specific metrics such as the total time taken by each step. This would really help us know where we need to improve.

@kogulan thanks for the message and welcome to the community!

Have you explored the GraphQL API, or the REST API? Job times, retries, and so on can be calculated using the API response.
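
As a rough sketch of that calculation, the Python below aggregates run time, agent wait time, and retries per step from a list of builds that has already been fetched and decoded from JSON. The field names (`jobs`, `type`, `name`, `runnable_at`, `started_at`, `finished_at`, `retried`) are assumptions about the shape of the REST API's build objects and should be checked against the API documentation. The function names are hypothetical.

```python
from datetime import datetime
from statistics import mean


def parse_ts(value):
    """Parse an ISO 8601 timestamp such as 2024-01-01T10:00:05.000Z; None stays None."""
    if value is None:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def job_metrics(builds):
    """Aggregate average run time, average agent wait time, and retries per step name."""
    per_step = {}
    for build in builds:
        for job in build.get("jobs", []):
            if job.get("type") != "script":
                continue  # assumed: only command jobs carry run timestamps
            started = parse_ts(job.get("started_at"))
            finished = parse_ts(job.get("finished_at"))
            runnable = parse_ts(job.get("runnable_at"))
            if started is None or finished is None:
                continue  # job never ran or is still running
            name = job.get("name") or "(unnamed)"
            stats = per_step.setdefault(name, {"run": [], "wait": [], "retried": 0})
            stats["run"].append((finished - started).total_seconds())
            if runnable is not None:
                # Time between the job becoming runnable and an agent starting it
                stats["wait"].append((started - runnable).total_seconds())
            if job.get("retried"):
                stats["retried"] += 1
    return {
        name: {
            "avg_run_seconds": mean(s["run"]),
            "avg_wait_seconds": mean(s["wait"]) if s["wait"] else None,
            "retried_jobs": s["retried"],
        }
        for name, s in per_step.items()
    }
```

The function takes plain dictionaries, so it works the same whether the builds come from the REST API, the GraphQL API reshaped into this form, or a saved JSON file.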

Cheers,

Ben