Distributing work evenly

johnf · April 12, 2021, 10:57pm

We’re testing devices and have one agent per channel in our test environment. When multiple builds are generated for a particular agent population and pipeline, it looks like that work isn’t getting evenly distributed across the eligible agents. For example, say we have 10 agents with a foo=true metadata tag and continuously ensure that there are always at least 3 build requests targeted at those agents. I was expecting the builds to be evenly distributed (over time) across all 10 agents, but the actual result is that certain agents tend to get the work.

Am I completely off base here? If not, is there some sort of workaround? I’m kind of thinking I’ll need to do something like make an API call in one pipeline to identify the agent population and then use that info to schedule builds for each matching agent in a separate pipeline (after adding something to ensure that I can target each of those requests to a specific hostname.)
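A rough sketch of that workaround, in case it helps anyone reading later. It assumes the agent list has already been fetched (for example from the REST API's agent listing) and that each record carries a `hostname` field and a `meta_data` list of `key=value` strings. It also assumes every agent has been given a matching `hostname` tag so a step can target it. The function name, sample hostnames, and `./run-tests.sh` are made up for illustration.

```python
import json


def steps_for_matching_agents(agents, tag, command):
    """Build one command step per agent that carries `tag`.

    Each step is pinned to a single machine through an assumed
    `hostname` agent tag, so every matching agent gets exactly one job.
    """
    steps = []
    for agent in agents:
        if tag in agent.get("meta_data", []):
            steps.append({
                "label": f"test on {agent['hostname']}",
                "command": command,
                # Assumption: agents were started with a hostname=<name> tag.
                "agents": {"hostname": agent["hostname"]},
            })
    return {"steps": steps}


# Hypothetical sample data shaped like agent records.
sample_agents = [
    {"hostname": "rack-01", "meta_data": ["foo=true", "queue=default"]},
    {"hostname": "rack-02", "meta_data": ["queue=default"]},
    {"hostname": "rack-03", "meta_data": ["foo=true"]},
]

pipeline = steps_for_matching_agents(sample_agents, "foo=true", "./run-tests.sh")
print(json.dumps(pipeline, indent=2))
```

The printed JSON is a pipeline definition with one step per matching agent, which the first pipeline could then upload as the steps for the second.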

Thanks

John

jess · April 12, 2021, 11:56pm

Hi @johnf,

The assignment algorithm will distribute work to whichever agents are applicable, with a slight bias toward agents which recently completed a successful build. This is to aid in the use of already warmed caches for places where that’s important.

I don’t believe this bias should skew the distribution unreasonably for the workload you describe, unless the jobs are particularly short, in which case I could see work going to the agents which accepted jobs earlier. If you send a link to the build through to support@buildkite.com, I could take a deeper peek and see if it’s behaving unexpectedly.
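To illustrate how a preference for recently finished agents can concentrate work, here is a toy model. It is not the actual assignment algorithm: it assumes a strict preference for the idle agent that finished most recently (which exaggerates a "slight bias"), one job finishing per tick, and a replacement job queued immediately. All names and numbers are hypothetical.

```python
def simulate(num_agents=10, concurrent_jobs=3, rounds=30):
    """Count jobs per agent under a strict 'most recently finished' preference."""
    last_finished = {a: -1 for a in range(num_agents)}  # -1 means never ran a job
    job_counts = {a: 0 for a in range(num_agents)}
    busy = []  # agents currently running a job, oldest job first

    for tick in range(rounds):
        # Once the target concurrency is reached, the oldest job finishes.
        if len(busy) == concurrent_jobs:
            done = busy.pop(0)
            last_finished[done] = tick

        # A new job is queued; pick the idle agent that finished most
        # recently, breaking ties in favour of the lowest agent id.
        idle = [a for a in range(num_agents) if a not in busy]
        chosen = max(idle, key=lambda a: (last_finished[a], -a))
        busy.append(chosen)
        job_counts[chosen] += 1

    return job_counts


counts = simulate()
print(counts)
```

Under these assumptions only three of the ten agents ever receive jobs, because the agent that just went idle is always the preferred candidate for the next job.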

johnf · April 19, 2021, 9:15pm

Hi Jess,

Here’s an example:

https://buildkite.com/juullabs/charge-and-temp-monitoring-test/builds/42#784eb7db-c1cc-428b-98ba-e9f4b63dacf4

All of the jobs listed on this page are running against the same agent. The config=fg tag used to assign the work is typically pretty common. It varies over time, but it’s probably reasonable to say that there are generally at least a dozen agents with config=fg running in our stress rack. I think the build traffic in the rack is relatively light, and jobs can run anywhere from just a few minutes to 30 minutes or so.

Jason · April 21, 2021, 12:01am

Hey @johnf, hmm, is that the correct build? I am only seeing one job there.

Did you want to send us a message through to support@buildkite.com and we can take a look at this for you?
