Agent Dashboard Sorting (version, job count, status)

cb-dr · October 14, 2020, 12:11am

It would be very useful for folks with hundreds/thousands of agents to have sorting capabilities on the Agent Dashboard ( buildkite.com/organizations/:org/agents ).

Sorting by version.
Sorting by OS.
Sorting by job count.
Sorting by status (connected).

sj26 · October 14, 2020, 12:25am

These suggestions are great!

At this sort of cardinality I can see more filtering and sorting being important. Are there particular problems you’re trying to solve with these options?

cb-dr · October 14, 2020, 12:30am

Running hundreds/thousands of agents and wanting to be able to passively review the state of them.

As the team responsible for owning/managing thousands of agents, we obviously have no shortage of active monitoring that alerts on anomalies (too few agents, spikes in volume, et cetera). It won’t be perfect, though, and will need continuous adjustment.

A view of the overall health of agents is probably what we really want here, more so than sorting, but sorting helps solve the same problem if it is simpler to implement.

Do we have any agents behind on versions? Why? How many?
Are we running any unexpected OS versions? Did our OS bump miss any hosts?
Any hosts not taking jobs for some reason?
Any anomalies in status that our monitoring didn’t pick up, or that we need to add to a monitor?

An argument could be made to instead export all of this to an Observability product and build this out there.
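
For anyone taking the export route, the questions above can be roughly sketched as a summary over an exported agent list. This is a minimal illustration, not a built-in feature: it assumes each agent record is a dict with `name`, `version`, `connection_state`, and `job` keys (names chosen to resemble what an agent listing might return, so check them against your actual export), and the sample agents and version strings are made up.

```python
from collections import Counter


def parse_version(version):
    """Turn '3.25.0' into (3, 25, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def summarize_agents(agents, expected_version):
    """Summarize exported agent records for a passive health review.

    Assumed record shape (hypothetical): name, version, connection_state, job.
    """
    expected = parse_version(expected_version)
    return {
        # How many agents run each version?
        "by_version": Counter(a["version"] for a in agents),
        # How many agents are in each connection state?
        "by_state": Counter(a["connection_state"] for a in agents),
        # Which agents are behind the version we expect?
        "behind": sorted(
            a["name"] for a in agents if parse_version(a["version"]) < expected
        ),
        # Which connected agents have no job in this snapshot?
        "idle": sorted(
            a["name"]
            for a in agents
            if a["connection_state"] == "connected" and a.get("job") is None
        ),
    }


# Made-up sample data for illustration only.
agents = [
    {"name": "a-1", "version": "3.25.0", "connection_state": "connected", "job": {"id": "j1"}},
    {"name": "a-2", "version": "3.22.1", "connection_state": "connected", "job": None},
    {"name": "a-3", "version": "3.25.0", "connection_state": "lost", "job": None},
]
summary = summarize_agents(agents, expected_version="3.25.0")
print(summary["by_version"], summary["by_state"], summary["behind"], summary["idle"])
```

Note that `idle` only reflects a single snapshot; an agent with no job at that moment is not necessarily failing to take jobs, so this is better tracked over time in whatever observability tool receives the export.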

sj26 · October 14, 2020, 12:37am

Yeah, we’ve had lots of folks do this sort of work via Datadog or similar platforms, and there’s ongoing work to improve it. We also have an AWS EventBridge integration which allows ingesting a lot of this stuff into the AWS family of tools for analysis. That might be the best way for now. But this feedback is wonderful, and will help us figure out better built-in tools.

cb-dr · October 14, 2020, 12:40am

Makes sense - we will keep going down that route / improving that path at this time.

Thank you.
