Different parts of our organisation need to be restricted to different permissions within our AWS accounts. Currently those permissions are granted to BK agents via the instance profiles of the EC2 instances on which they run.
The problem is there appears to be no way to restrict usage of agents within a BK organisation.
To achieve hard separation (e.g. between “admin” and “non-admin” AWS activity) we have resorted to creating two BK organisations each with their own agent pool (the agent EC2s having been configured with suitable IAM profiles).
That does not solve all our problems, though: non-admins still need to be able to monitor admin builds, so they need to be in the admin org. Even if we set those users to Read Only on the existing pipelines, merely being in the admin org allows them to create new pipelines, which grants them access to the admin agents…something we cannot allow.
The missing BK feature seems to be either:
- a way to stop users from creating new pipelines (not just being Read Only on existing ones), or
- a way to restrict access to agent queues
The latter is probably the more powerful, but even the first solution would unblock us and might be easier to implement.
Thanks for the write-up @jukleja-tyro. This is something that we’ve been thinking about a lot and have big plans for.
Our current thinking on approach is that we’re going to introduce the concept of “Agent Clusters”. An Agent Cluster will be able to have Agents with unique queues and tags, and be associated with specific Pipelines and Teams that have access to it.
Organizations will be able to have multiple Agent Clusters and we’re planning to have these form the hard boundaries that it sounds like you are looking for!
We’ve still got a fair bit to work out, but we are also planning a move towards public/private keypairs for registering agents, in place of Agent Registration tokens. This would mean you only ever share your Agents’ public keys with Buildkite and we never possess the private keys, which opens lots of future doors for things like Secrets.
We don’t presently publish a roadmap, though we might consider it in future! I’d love to give clearer ideas of timeframes, but it depends on a lot of variables and a small team of folks to do the work. It’s on the cards for this year, as part of our #1 priority, Public Builds.
Hey there, I have an identical use case in mind for this.
We have typical build agents that run on EC2 instances with one instance profile, but are looking to roll out new “ops” nodes that have extra AWS permissions we don’t want typical developers to be able to use, like running Terraform or Packer.
My initial plan is to have a separate agent queue and an agent environment hook that checks $BUILDKITE_BUILD_CREATOR_TEAMS and restricts access that way. However, I don’t think this is viable, since the variable doesn’t get set correctly in builds triggered through a GitHub webhook rather than through the Buildkite UI.
Here’s the template I’d like to be able to use for my hook:
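# Wrap the colon-separated team list in colons, then glob-match the authorized team.
# The wildcards must sit outside the quotes, inside [[ ]], for the pattern to match.
if [[ ":$BUILDKITE_BUILD_CREATOR_TEAMS:" != *":<%= @authorized_team %>:"* ]]; then
  echo "You are not a member of <%= @authorized_team %>! You are not allowed to use the <%= @queue %> queue."
  exit 1
fi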
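For anyone who wants to try the check outside the ERB template, here is a minimal sketch of the same idea as plain shell functions. It assumes $BUILDKITE_BUILD_CREATOR_TEAMS is a colon-separated list of team slugs, as the template above does. The function names team_allowed and check_queue_access are made up for illustration and are not part of the agent.

```shell
#!/usr/bin/env bash
# Sketch only: assumes BUILDKITE_BUILD_CREATOR_TEAMS is a colon-separated
# list of team slugs (e.g. "dev:ops"). Function names are hypothetical.

# team_allowed TEAM LIST
# Succeeds when TEAM is one of the colon-separated entries in LIST.
team_allowed() {
  # Fail closed: an empty team or an empty list never matches.
  [ -n "$1" ] && [ -n "$2" ] || return 1
  # Wrapping both sides in colons stops "ops" from matching "devops".
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# check_queue_access TEAM QUEUE
# Prints a message to stderr and returns non-zero when access is denied.
check_queue_access() {
  if ! team_allowed "$1" "${BUILDKITE_BUILD_CREATOR_TEAMS:-}"; then
    echo "You are not a member of $1! You are not allowed to use the $2 queue." >&2
    return 1
  fi
}
```

In an environment hook this would be called as something like `check_queue_access ops ops-queue || exit 1`. Because an empty list is rejected, webhook-triggered builds where the variable is not set would be refused rather than let through, which may or may not be what you want.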
Alternatively, I think this could work if there’s a way to query the teams the current pipeline is associated with.
… and it produced the teams that are associated with the pipeline. That said, we are still working on the Agent Clusters feature that Lox mentioned above and will hopefully have it ready for testing in a couple of months.
Thanks for the tip! I’m glad there’s a way to get the teams now, but is there any chance this could be included in the REST API, too? Having a simple array of team names would be much easier to handle in a quick bash hook than having to parse out a GraphQL response.
Using GraphQL does add a little extra complexity, though. Since a Buildkite agent access token doesn’t have access to GraphQL, we’ll need to pass in an additional token safely. It also doesn’t seem ideal that this token will have full access to my organization.
It would be much more convenient to have a $BUILDKITE_PIPELINE_TEAMS already available or be able to hit the REST API with the buildkite access token.
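In the meantime, a rough sketch of what the GraphQL route could look like from a bash hook, without needing a JSON parser. Everything here is an assumption on my part rather than a confirmed recipe: the query shape (pipeline → teams → edges → node → team → slug) should be checked against the GraphQL explorer, GRAPHQL_TOKEN is a hypothetical variable holding an API access token with GraphQL access, and the function names are made up. The slug extraction only works because the query asks for no other field named "slug".

```shell
#!/usr/bin/env bash
# Sketch only. Assumptions: the GraphQL field names below, and a hypothetical
# GRAPHQL_TOKEN variable holding an API access token with GraphQL access.

# pipeline_teams_query ORG_SLUG/PIPELINE_SLUG
# Prints the JSON request body for the (assumed) pipeline teams query.
pipeline_teams_query() {
  printf '{"query":"{ pipeline(slug: \\"%s\\") { teams(first: 50) { edges { node { team { slug } } } } } }"}' "$1"
}

# extract_team_slugs
# Reads a JSON response on stdin and prints every "slug" value, one per line.
extract_team_slugs() {
  grep -o '"slug"[[:space:]]*:[[:space:]]*"[^"]*"' | sed 's/.*"\([^"]*\)"$/\1/'
}

# fetch_pipeline_teams
# Queries the GraphQL endpoint for the current pipeline and prints its team slugs.
fetch_pipeline_teams() {
  curl -sS -X POST "https://graphql.buildkite.com/v1" \
    -H "Authorization: Bearer ${GRAPHQL_TOKEN}" \
    -H "Content-Type: application/json" \
    -d "$(pipeline_teams_query "${BUILDKITE_ORGANIZATION_SLUG}/${BUILDKITE_PIPELINE_SLUG}")" \
    | extract_team_slugs
}
```

A hook could then do something like `fetch_pipeline_teams | grep -qx "ops" || exit 1`. This still leaves the token-scope concern above unsolved; it only removes the need to parse the response by hand.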
Seems like there are new objects named Cluster and ClusterQueue in the GraphQL API now. Any ETA on the rollout of those features? The “Agent Clusters” idea sounds awesome and should help restrict access to agent queues.
I asked about limiting the jobs an agent can handle in the thread “Limit jobs an agent can handle?”. Will a single pipeline be able to schedule to different clusters? I am really unclear what it means to “restrict access to agents” here.
I believe the current idea is that a single pipeline will be associated with a cluster and can only run there. But this feature is still a work in progress and subject to change.
We’ll share more about it once we’ve nailed down exactly how it will work.