Is there a way to know whether a run is a retry?

My company currently splits our test suite (RSpec + Cypress) across a number of runners, so each one runs, say, 10 tests. Sometimes we hit flakes, where a single test fails. We have automatic retry logic that re-runs all of the tests in that runner, but it would be great if we could retry only the tests that failed.

From looking around, it doesn’t seem like Buildkite supports this out of the box, and so I’m exploring how I could potentially do this with code. I see that with the List Builds for a Pipeline API, we do have access to information about whether a job was a retry, and from which job this was (see below for an example from one of our builds):

"retries_count": 1,
"retry_source": {
   "job_id": "018ecacc-253b-4901-b63a-6cae18db9aed",
   "retry_type": "automatic"
},

So, my question is: when we’re within a file that’s being run as a Command step, is there a way for us to know/pass in information about a) whether this is a retry, and most importantly, b) what the retry_source is? If we had this information, I could use the Buildkite API to grab the source job’s logs, parse which tests failed, and then run just those tests.

Is this possible right now?

I guess a related question would be: do we have any control over how automatic retries work at all? As in, is the Command step re-run when a job is retried, giving us access to potentially change things, or is the job input cached and re-run without actually stepping into the Command step script?

Welcome to the Buildkite Forum, @Isaac!

a) Yes, within a Command step, you can determine if the current job is a retry by checking the BUILDKITE_RETRY_COUNT environment variable, which Buildkite sets automatically.
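As a minimal sketch of that check, assuming your Command step invokes a Python script (the helper names `retry_count` and `is_retry` are made up for illustration; only `BUILDKITE_RETRY_COUNT` comes from Buildkite):

```python
import os


def retry_count(env=os.environ):
    """Return the retry count Buildkite reports for this job, or 0 if unset."""
    raw = env.get("BUILDKITE_RETRY_COUNT", "0")
    try:
        return int(raw)
    except ValueError:
        # Treat an unexpected value as "not a retry" rather than crashing.
        return 0


def is_retry(env=os.environ):
    """True when the current job is a retry of an earlier job."""
    return retry_count(env) > 0
```

Passing the environment in as an argument keeps the check easy to exercise outside of a real agent, for example `is_retry({"BUILDKITE_RETRY_COUNT": "1"})`.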

b) The retry_source isn’t directly exposed as an environment variable. However, your script can call the Buildkite API to fetch the current job’s details (for example, by finding the job that matches BUILDKITE_JOB_ID in the build’s job list) and read retry_source from there. You’d then take the source job ID from retry_source and use it to fetch that job’s log for further analysis.
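Here is a rough sketch of how that lookup could be scripted. It assumes a few things that are not part of the thread: that you supply a REST API token yourself in a variable I've called `BUILDKITE_API_TOKEN` (Buildkite does not set this for you), that the build response contains a `jobs` array whose entries carry the same `retry_source` shape you pasted, and that the job log endpoint returns JSON with a `content` field. The function names are hypothetical.

```python
import json
import os
import urllib.request

API = "https://api.buildkite.com/v2"


def find_retry_source(build, job_id):
    """Return the source job ID for job_id, or None if it is not a retry."""
    for job in build.get("jobs", []):
        if job.get("id") == job_id:
            # retry_source may be missing or null on a first attempt.
            source = job.get("retry_source") or {}
            return source.get("job_id")
    return None


def api_get(path, token):
    """GET a REST API path and decode the JSON response."""
    req = urllib.request.Request(
        f"{API}/{path}",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def fetch_source_log(env=os.environ):
    """Return the log text of the job this one retries, or None."""
    token = env["BUILDKITE_API_TOKEN"]  # assumption: a secret you provide
    build_path = "organizations/{}/pipelines/{}/builds/{}".format(
        env["BUILDKITE_ORGANIZATION_SLUG"],
        env["BUILDKITE_PIPELINE_SLUG"],
        env["BUILDKITE_BUILD_NUMBER"],
    )
    build = api_get(build_path, token)
    source_id = find_retry_source(build, env["BUILDKITE_JOB_ID"])
    if source_id is None:
        return None
    # Assumption: the log endpoint responds with JSON holding "content".
    log = api_get(f"{build_path}/jobs/{source_id}/log", token)
    return log.get("content")
```

Keeping `find_retry_source` separate from the HTTP calls means you can check the lookup against a saved API response before wiring it into a pipeline.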

To summarize, while Buildkite doesn’t provide a built-in method for identifying retry_source within the environment, it is possible to script this functionality using the API to achieve the desired behaviour.
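For the "parse which tests failed" part, one possible approach is sketched below. It assumes the log contains RSpec's usual "Failed examples:" section, where each failure is printed as a line starting with `rspec ./path/to_spec.rb:LINE`, and that the raw log may include ANSI colour codes and Buildkite timestamp markers that need stripping first. The marker pattern and the function name `failed_rspec_examples` are my assumptions, so check them against one of your real logs.

```python
import re

# SGR colour sequences such as "\x1b[31m".
ANSI_COLOUR = re.compile(r"\x1b\[[0-9;]*m")
# Assumption: Buildkite timestamp markers look like "\x1b_bk;t=<millis>\x07".
BK_TIMESTAMP = re.compile(r"\x1b_bk;t=\d+\x07")
# RSpec prints one "rspec <file>:<line> # description" line per failure.
FAILED_EXAMPLE = re.compile(r"^rspec (\S+)", re.MULTILINE)


def failed_rspec_examples(log_text):
    """Return the file:line locations RSpec listed as failed examples."""
    cleaned = BK_TIMESTAMP.sub("", ANSI_COLOUR.sub("", log_text))
    return FAILED_EXAMPLE.findall(cleaned)
```

The returned locations can be passed straight to `rspec` as arguments on the retry, falling back to the full test list if the result is empty.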

Hope this information clears up any doubts you have!