Limit jobs an agent can handle?

I would love to be able to restrict which jobs run on an agent. I can add agent tags, but adding tags only expands the set of jobs the agent will accept. Suppose I have two agents, A and B, with tags a and a, b, respectively. Now I want agent B to run only jobs with agent tag b. I will not be able to restrict agent B to those jobs without modifying agent A to include some negative test for tag a. Is there any way to restrict the jobs accepted by an agent without modifying all the other agents?
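
To make the scenario concrete, here is a rough model of tag matching. It is a simplification and an assumption about the behaviour described above, not Buildkite's implementation: `agent_accepts` is a hypothetical helper, tags are written as key=value pairs (a=true, b=true), and only exact matches are considered.

```shell
# Simplified model: an agent accepts a job when every tag the job
# asks for is present among the agent's own tags.
agent_accepts() {
  local agent_tags=",$1," job_tags="$2" tag
  local IFS=','
  for tag in $job_tags; do
    case "$agent_tags" in
      *",$tag,"*) ;;      # the agent has this tag, keep checking
      *) return 1 ;;      # a required tag is missing
    esac
  done
  return 0
}

# describe <agent name> <agent tags> <job tags>
describe() {
  if agent_accepts "$2" "$3"; then
    echo "agent $1 accepts job [$3]"
  else
    echo "agent $1 rejects job [$3]"
  fi
}

describe A "a=true" "a=true"
describe B "a=true,b=true" "a=true"
describe A "a=true" "b=true"
describe B "a=true,b=true" "b=true"
```

In this model agent B accepts jobs tagged a as well as jobs tagged b, which is exactly the overlap in question.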

I found two prior issues/feature requests that almost capture my issue, but I can’t quite tell what they are asking for.

Hi @john.soo

We are currently working on a feature (Clusters) that addresses exactly this kind of use case; it is also referenced in one of the forum links you added to your post. We do not have an exact ETA at the moment, but it should be available this year.

So currently there is no simple way apart from modifying agents or having dedicated agents for specific jobs.

I am really glad something to address this is coming, but I didn’t really see any details about what the feature would add. Is there some kind of example agent configuration available? I ask because provisioning a new cluster just for this feels very heavy to me.

Clusters allow you to allocate sets of agents to run specific kinds of jobs within your organization. At a high level (the feature is still being worked on), you will define a logical entity in Buildkite called a Cluster, and that cluster will contain pipelines and agents with different queues.

Builds from the pipelines in that cluster will then target only agents that are part of that cluster. I hope this gives an overview of the feature and helps with your question.

Hm. Well my concern is that I want to have jobs in the same build in the same pipeline run on more specific agents. Will having multiple clusters per build/pipeline be supported?

Valid concern. Please allow me to follow up on this and get back to you.

Hi @john.soo

The team is still working on the feature, and at the moment we do not have details on all the specifics that will be available as part of it.

Ok thank you! Consider this just my 2 cents.

Here’s what I thought would work for me. Just as jobs can specify agent tags, agents might specify job tags. I imagine they would work the same way agent tags do, for consistency’s sake. I don’t mean to dictate a solution or anything, but that seems to me to be the missing feature.

Hi @john.soo

Thank you for the inputs.

Thank you @suma, I appreciate the fast response and look forward to what you come up with!

Clusters do not solve this. Is there any chance this can get another review, please?

Hey John, I would like to get more info on your use case. As mentioned in your earlier post:

Suppose I have two agents A and B with tags a and a, b, respectively. Now I want to have agent B only run jobs with agent tag b. I will not be able to restrict the jobs to agent B without modifying agent A to some negative test for tag a.

I am just wondering whether one way to restrict jobs to an agent is to give them low priorities, so that agents with tags a, b will only accept those jobs if there are no more jobs allocated for agent A.

We can’t use priorities, unfortunately. We already use priorities (or plan to very soon) for their intended purpose: actual prioritization.

Moreover, priorities would not be enough, because some agents simply can’t run the jobs meant for other agents and so should not be able to accept them. For us this means macOS vs Linux, but I can see this being very useful for other things: end-to-end tests vs builds, say.

Moreover, the build matrix will not work for us because we do not know up-front which jobs will be scheduled for any particular build. We make heavy use of buildkite-agent upload (it’s a killer feature!) and there is simply no way to know which platform will need to build before the upload. Also, the jobs for each platform will be different, so a matrix, even with adjustments, simply will not work for us.

Without the ability to restrict the kinds of jobs an agent can accept, we have quite a lot of CI capacity issues. We run a lot of agents on a central planning machine and have each job trigger a remote build. We have limited macOS build capacity so it is easy to get into a scenario like the following:

10 agents on the planning machine
2 macOS build machines
8 Linux build machines

build1 uploaded with 4 macOS jobs, 10 linux jobs

job queue: (build1, linux7), (build1, linux8), ...

agent1: (build1, macos1)
agent2: (build1, macos2)
agent3: (build1, macos3)
agent4: (build1, macos4)
agent5: (build1, linux1)
agent6: (build1, linux2)
agent7: (build1, linux3)
agent8: (build1, linux4)
agent9: (build1, linux5)
agent10: (build1, linux6)

builders:
macos1: (build1, macos1)
macos2: (build1, macos2)
linux1: (build1, linux1)
linux2: (build1, linux2)
linux3: (build1, linux3)
linux4: (build1, linux4)
linux5: (build1, linux5)
linux6: (build1, linux6)
linux7: idle
linux8: idle

This situation only gets worse because new builds after build1 can easily tie up all the agents with jobs waiting on the slow macOS builds. What we really need is to reserve some agents for macOS only, so the situation would look like this:

10 agents on the planning machine
2 macOS build machines
8 Linux build machines

build1 uploaded with 4 macOS jobs, 10 linux jobs

job queue: (build1, macos3), (build1, macos4), (build1, linux9), (build1, linux10)

agent1: (build1, macos1)
agent2: (build1, macos2)
agent3: (build1, linux1)
agent4: (build1, linux2)
agent5: (build1, linux3)
agent6: (build1, linux4)
agent7: (build1, linux5)
agent8: (build1, linux6)
agent9: (build1, linux7)
agent10: (build1, linux8)

builders:
macos1: (build1, macos1)
macos2: (build1, macos2)
linux1: (build1, linux1)
linux2: (build1, linux2)
linux3: (build1, linux3)
linux4: (build1, linux4)
linux5: (build1, linux5)
linux6: (build1, linux6)
linux7: (build1, linux7)
linux8: (build1, linux8)

Thanks for the writeup about your use case. I think a combination of using agent tagging together with agent queues should be able to address this.
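
As a rough illustration of the tagging side, the sketch below prints (without running) one start command per agent, with each agent reserved for a single platform. The agent names, the counts, and the os tag key are assumptions chosen to mirror the example above; each real agent would also need its token configured.

```shell
# Dry run: print, but do not execute, one start command per agent.
# print_agent_commands <macOS agent count> <Linux agent count>
print_agent_commands() {
  local macos_count="$1" linux_count="$2" i
  i=1
  while [ "$i" -le "$macos_count" ]; do
    echo "buildkite-agent start --name planner-macos-$i --tags queue=default,os=macos"
    i=$((i + 1))
  done
  i=1
  while [ "$i" -le "$linux_count" ]; do
    echo "buildkite-agent start --name planner-linux-$i --tags queue=default,os=linux"
    i=$((i + 1))
  done
}

print_agent_commands 2 8
```

A step that sets os: macos under agents would then only match the two macOS-tagged agents, leaving the other eight free for Linux jobs.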

On the issue of not knowing which platform is needed at the time of upload, I think this can be handled by building the pipeline dynamically and scripting the agent targets when defining the build steps. Something roughly like this:

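# Read the target platform recorded by an earlier step.
target="$(buildkite-agent meta-data get target_build)"

# Print the pipeline YAML; pipe this script's output to `buildkite-agent pipeline upload`.
cat <<YAML
steps:
- command: echo "Building at $target"
  agents:
    queue: "default"
    os: "$target"
YAML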

where the target_build meta-data is set in a previous step, once the target build is known.
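
For builds that need more than one platform, the same idea can be extended to emit one step per target. This is only a sketch: `render_steps` is a hypothetical helper, and the os tag key and the ./build.sh script are placeholders.

```shell
# render_steps prints pipeline YAML with one step per platform argument.
render_steps() {
  local target
  echo "steps:"
  for target in "$@"; do
    cat <<YAML
  - label: "build ($target)"
    command: "./build.sh $target"
    agents:
      queue: "default"
      os: "$target"
YAML
  done
}

# In an earlier step, once the platform is known:
#   buildkite-agent meta-data set target_build "macos"
# In the upload step:
#   render_steps "$(buildkite-agent meta-data get target_build)" | buildkite-agent pipeline upload

# Local preview of the generated YAML:
render_steps macos linux
```

Because each generated step names its os, a macOS step can only be picked up by agents started with the matching tag.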

Well, I feel foolish! I think tags do do what I was hoping! Did something change? I remember reading the tagging docs many times and getting the impression they did not accomplish what I’d hoped.

Thank you!

That’s great to hear! Nothing changed around tagging, but we did improve our docs.
Let us know if you have any other issues!
