Priorities with spawn config

We have two bare-metal machines running 10 instances each to get good pipeline speeds.
We use the spawn option (see "Buildkite Agent configuration v3" in the Buildkite Documentation) to start 10 instances on each host. We would like to apply priorities so that jobs are scheduled equally on both machines.

Are there ways to let the spawn implementation assign increasing priorities to each spawned instance?

We had an issue in the past where one host couldn't spawn new OS threads or processes because of disk space issues. The agents didn't time out, as they were already running and could still reach the Buildkite backend. They couldn't spawn new tasks, but they kept picking up jobs. For some reason, no jobs were scheduled on the second, working host.
We hope we can prevent such issues by assigning priorities to our agent instances.

Hi @Valkum, you can use the agent priority value to distribute jobs across both machines running the agents. Have you already reviewed this article on agent prioritization and this one on job prioritization? Please let me know whether they help with your use case, as they provide more detail on the configuration. Thanks

Hi @Valkum, I also wanted to add that my last comment means you would no longer use spawn for this use case, because when spawn and priority are used together, all the spawned agents get the same priority. In that case the agents on one or both of your machines would share a priority, which leaves you in the same position as now. Given your use case, you can run multiple buildkite-agent start --priority X commands instead. This can be implemented in several ways, e.g. a one-line command loop or via the Buildkite configuration file. More details on how this can be implemented can be found in the document shared earlier.
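
A minimal sketch of the command loop mentioned above, assuming ten agents per host and priorities 1 through 10. The AGENT_COUNT and DRY_RUN variables and the start_agents function are illustrative names, not Buildkite settings; only the --priority flag comes from this thread. With DRY_RUN=1 (the default here) the commands are printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: start N agents, each with its own priority.
# AGENT_COUNT and DRY_RUN are illustrative variables, not Buildkite settings.
AGENT_COUNT="${AGENT_COUNT:-10}"
DRY_RUN="${DRY_RUN:-1}"

start_agents() {
  i=1
  while [ "$i" -le "$AGENT_COUNT" ]; do
    if [ "$DRY_RUN" = "1" ]; then
      # Print the command instead of running it.
      echo "buildkite-agent start --priority $i"
    else
      # Run each agent in the background so the loop can continue.
      buildkite-agent start --priority "$i" &
    fi
    i=$((i + 1))
  done
  # Wait for the background agents (returns immediately in dry-run mode).
  wait
}

start_agents
```

Setting DRY_RUN=0 would start the agents for real. Under systemd, one template instance per priority is probably a better fit than background processes.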

What solution do you have in mind using the buildkite configuration file? We currently utilize systemd to start the agent (which then spawns N instances). We can clearly work something out using systemd (see below) but we would like to keep using spawn.

[Unit]
Description="Agent instance #%i"

[Service]
Type=simple
ExecStart=..... %i

[Unit]
Description=buildkite-agents
Requires=buildkite-agent@1.service buildkite-agent@2.service buildkite-agent@3.service

[Install]
WantedBy=multi-user.target

Hi @Valkum, could you let me know how you spin up the agents with spawn, so we can get a better understanding of your use case? As mentioned earlier, you could also run buildkite-agent start --priority X on the command line when starting each agent, but we will be able to say more once we see how you start the agents with spawn.

We have a systemd unit:

[Unit]
Description=Buildkite Agent
Documentation=https://buildkite.com/agent
After=syslog.target
After=network.target

[Service]
Type=simple
User=buildkite-agent
Environment=HOME=/var/lib/buildkite-agent
ExecStart=/usr/bin/buildkite-agent start
RestartSec=5
Restart=on-failure
RestartForceExitStatus=SIGPIPE
TimeoutStartSec=10
TimeoutStopSec=0
KillMode=process

[Install]
WantedBy=multi-user.target

and the following content in our buildkite-agent.cfg

name="%hostname-%spawn"
spawn=10
...

Hi @Valkum, thanks for sharing this. I believe you can start the agent with buildkite-agent start --spawn 10 --spawn-with-priority, as an example. This spawns 10 agent processes and assigns each one a priority matching its spawn number. With 2 machines each running 10 agents via this start command, work will be dispatched to the agents with priority 10 first, then priority 9, and so on. This balances the scheduled jobs equally across the machines.
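
To illustrate the dispatch order described above, here is a small sketch that lists the agents of two hosts in descending priority order. The host names hostA and hostB and the list_agents function are made up for the example; it only sorts names and does not talk to Buildkite.

```shell
#!/bin/sh
# Illustrative only: print the agents of two hosts (hostA/hostB are made-up
# names), each spawned with priorities 1..10.
list_agents() {
  for host in hostA hostB; do
    i=1
    while [ "$i" -le 10 ]; do
      # Columns: priority, agent name (following the %hostname-%spawn pattern).
      echo "$i $host-$i"
      i=$((i + 1))
    done
  done
}

# Highest priority first. Ties are ordered by name here only to keep the
# output stable, not because Buildkite is known to order them this way.
list_agents | sort -k1,1nr -k2,2
```

The list starts with the priority-10 agent of each host, followed by the two priority-9 agents, so consecutive jobs alternate between the machines as long as both have idle agents. How Buildkite breaks ties between agents of equal priority is not covered in this thread.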

Thanks! Works perfectly.
Just for future readers: This also works in the config file with spawn-with-priority=true
It might be worth adding this to the "Buildkite Agent configuration v3" page of the Buildkite Documentation.
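
For reference, the config-file variant could look like the fragment below. It combines the two settings from our buildkite-agent.cfg quoted earlier with the spawn-with-priority option; the remaining settings in the file are omitted.

```
# buildkite-agent.cfg (fragment)
name="%hostname-%spawn"
spawn=10
spawn-with-priority=true
```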

Hey @Valkum

Thanks for pointing out that this is missing from our docs page on the Agent configuration. We will get that fixed!

Cheers,
Tom