Distributing work evenly

We’re testing devices and have one agent per channel in our test environment. When multiple builds are generated for a particular agent population and pipeline, it looks like the work isn’t being distributed evenly across the eligible agents. For example, say we have 10 agents with a foo=true metadata tag and continuously ensure that there are always at least 3 build requests targeted at those agents. I was expecting the builds to be evenly distributed (over time) across all 10 agents, but the actual result is that certain agents tend to get the work.

Am I completely off base here? If not, is there some sort of workaround? I’m thinking I’ll need to make an API call in one pipeline to identify the agent population, then use that info to schedule a build for each matching agent in a separate pipeline (after adding something to ensure that I can target each of those requests at a specific hostname).
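Here is a rough Python sketch of the workaround I have in mind. It assumes the agent list has already been fetched from the Buildkite REST API, where each agent carries its tags in a `meta_data` list of `key=value` strings along with a `hostname` field. The function names, the sample data, and the `TARGET_HOSTNAME` environment variable are all hypothetical; the second pipeline would need its own targeting rule that reads that variable.

```python
def agents_with_tag(agents, tag):
    """Return connected agents whose meta_data list contains the given tag."""
    return [
        a for a in agents
        if tag in a.get("meta_data", [])
        and a.get("connection_state", "connected") == "connected"
    ]


def build_requests(agents, branch="main", commit="HEAD"):
    """Make one create-build payload per agent, pinned via an env var."""
    return [
        {
            "commit": commit,
            "branch": branch,
            "message": f"Targeted run on {a['hostname']}",
            # TARGET_HOSTNAME is a hypothetical variable; the pipeline would
            # have to use it in its own agent targeting rules.
            "env": {"TARGET_HOSTNAME": a["hostname"]},
        }
        for a in agents
    ]


# Sample data shaped like the agents API response (values are made up).
sample_agents = [
    {"hostname": "rack-01", "meta_data": ["foo=true"], "connection_state": "connected"},
    {"hostname": "rack-02", "meta_data": ["foo=false"], "connection_state": "connected"},
    {"hostname": "rack-03", "meta_data": ["foo=true"], "connection_state": "connected"},
]

payloads = build_requests(agents_with_tag(sample_agents, "foo=true"))
for p in payloads:
    print(p["env"]["TARGET_HOSTNAME"])
```

Each payload would then be sent to the REST API's create-build endpoint for the second pipeline, giving one build per matching agent.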

Thanks

John

Hi @johnf,

The assignment algorithm will distribute work to whichever agents are applicable, with a slight bias toward agents that recently completed a successful build. This helps builds reuse already-warmed caches in places where that’s important.

I don’t believe that bias should unreasonably skew the distribution of the workload you describe, unless the jobs are particularly short, in which case I could see it distributing to agents that accepted the work earlier. I could take a deeper look and see whether it’s behaving unexpectedly if you send a link to the build to support@buildkite.com.
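To illustrate how a recency preference can concentrate work when traffic is light, here is a toy Python model. It is not the actual assignment algorithm: it assumes every agent is idle when a job arrives and treats the bias as a strict preference for the agent that finished most recently, which exaggerates a "slight" bias on purpose.

```python
def simulate(num_agents, num_jobs):
    """Assign jobs one at a time, preferring the most recently finished agent."""
    last_finished = [0] * num_agents  # time each agent last completed a job
    counts = [0] * num_agents         # jobs handled per agent
    for t in range(1, num_jobs + 1):
        # Every agent is idle when the job arrives (light traffic), so the
        # recency preference alone decides who gets it. Ties go to the
        # lowest-numbered agent.
        chosen = max(range(num_agents), key=lambda i: last_finished[i])
        counts[chosen] += 1
        last_finished[chosen] = t  # job completes before the next one arrives
    return counts


counts = simulate(10, 30)
print(counts)
```

In this simplified model a single agent ends up with every job, because it is always both idle and the most recent finisher. With overlapping jobs, the busy agent is unavailable and the work spreads out more.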

Hi Jess,

Here’s an example:

https://buildkite.com/juullabs/charge-and-temp-monitoring-test/builds/42#784eb7db-c1cc-428b-98ba-e9f4b63dacf4

All of the jobs listed on this page are running against the same agent. The config=fg tag used to assign the work is typically pretty common. It varies over time, but it’s reasonable to say that there are generally at least a dozen agents with config=fg running in our stress rack. I think the build traffic in the rack is relatively light, and jobs can run anywhere from just a few minutes to 30 minutes or so.

Hey @johnf, hmm, is that the correct build? I’m only seeing one job there.

Would you like to send us a message at support@buildkite.com so we can take a look at this for you?