Restart a step on a different agent

Hello Buildkite team!

Just a feature request:

As a software developer checking the CI results of the branch I’m working on, without necessarily having access to more advanced/admin configuration, I would like to be able to restart a job for a step in a pipeline on a different agent than the one it just ran on.

This is because the failure is sometimes caused by the current state of the machine on which the tests ran, so it’s specific to that machine’s agent. Examples include a full disk or insufficient memory.

When I restart a job, it always restarts on the same agent it just ran on.

The other option I have is to restart the full build for the whole pipeline. But that would take a lot of time, as some of the steps take several minutes, and the specific step I care about could still end up on the same agent/machine that is having problems.

We have a similar need, but for different reasons. In our case, we have some jobs that deploy resources to machines in China.

The Great Firewall, however, sometimes decides that the dynamic IP address assigned to our cloud build agent isn’t allowed to talk to China. Thus, retries on this machine will never work, but running the same job from a different machine would.

Right now, the best answer we have is to shut down the cloud agent that’s blacklisted and start up a new instance that gets a new IP, and just hope that one isn’t also blacklisted. This usually works, but is obviously far from ideal.