Agent maintenance: target every agent matching tag?

Hi, I’m using a scheduled job to do periodic maintenance on agents, such as deleting downloaded dependencies. This is the sort of state that would be slow and wasteful to recreate from scratch on every single build, but it becomes a problem if it churns and grows indefinitely.

Previously this cleanup was implemented as a cron job local to each agent, but if it ran at the same time as a build it would cause spurious failures. The current solution is a cleanup pipeline on its own branch that targets every single agent by name, with OS-specific cleanup steps for each one. Each agent will finish what it’s doing, run the cleanup job, then be ready for the next job.

This cleanup pipeline is a maintenance headache as agents come and go. Is there any way I can use Buildkite’s tools to make every online agent matching a particular tag run a set of steps? Or otherwise solve this problem differently? Thanks!

Hi @thombles!

Welcome to the community! :blush:

That’s a tricky question! It’s not possible at the moment to achieve exactly that, but it is something the team has discussed internally.
Alternatively, what other users do is one or a combination of the following:

  • Schedule a build that targets the agents, e.g. a build that runs every 6/12/24 hours to clean up all resources: Scheduled Builds | Buildkite Documentation

  • Create a step at the end of the pipeline that runs at the end of every build even if the other steps fail (except if the user initiates a cancellation), like:

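    steps:
      # Stand-in for a build step that fails.
      - command: exit 1
        key: "a"

      - command: echo "do this thing first"
        key: "b"
        depends_on: "a"

      # Cleanup step: still runs when the step it depends on has failed.
      - command: echo "clean up the things"
        label: "clean up task"
        depends_on: "b"
        allow_dependency_failure: true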
    
  • Use a pre-exit hook (but this runs on every step) or an agent-shutdown hook (but that only runs on agent shut down): Buildkite Agent Hooks v3 | Buildkite Documentation

  • Some tricky combination of GraphQL, hostname tags (which we discourage :sweat_smile:) and a dynamically uploaded pipeline.
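
To give a rough idea of the dynamic part of that last option, here is a minimal sketch in Python. It assumes you already have a list of online agents (for example from a GraphQL query, which is not shown here) and that each agent carries a `hostname` tag you can target. The function name `build_cleanup_pipeline`, the agent list, and the cleanup commands are all hypothetical placeholders, so treat this as a starting point and not a supported recipe.

```python
import json


def build_cleanup_pipeline(agents, commands_by_os):
    """Return a pipeline dict with one cleanup step pinned to each agent.

    agents: list of dicts with "hostname" and "os" keys (hypothetical shape).
    commands_by_os: maps an OS name to the cleanup command for that OS.
    """
    steps = []
    for agent in agents:
        steps.append({
            "label": f"clean up {agent['hostname']}",
            # Pick the OS-specific cleanup command for this agent.
            "command": commands_by_os[agent["os"]],
            # Assumption: agents expose a `hostname` tag that steps can target.
            "agents": {"hostname": agent["hostname"]},
        })
    return {"steps": steps}


if __name__ == "__main__":
    # Placeholder data; in practice this list would come from your own lookup.
    online_agents = [
        {"hostname": "linux-agent-1", "os": "linux"},
        {"hostname": "windows-agent-1", "os": "windows"},
    ]
    cleanup_commands = {
        "linux": "./cleanup.sh",        # hypothetical script
        "windows": ".\\cleanup.ps1",    # hypothetical script
    }
    # Print the pipeline as JSON so it can be handed to a pipeline upload step.
    print(json.dumps(build_cleanup_pipeline(online_agents, cleanup_commands), indent=2))
```

The idea would be to run a script like this in the first step of the scheduled build and pipe its output into `buildkite-agent pipeline upload`, so the list of agents no longer has to be maintained by hand in the pipeline file.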

Hope this helps!

Cheers