Please could you implement build retries on other agents, to handle the case where a BuildKite agent goes away because the agent or its machine is shut down, or the machine is put to sleep?
In distributed processing systems it is common to retry a task 4 times on 4 different hosts before declaring it actually failed, because many such failures are caused by temporary issues or by machines dying. In my case the machine is shut down or put to sleep, because I run BuildKite agents on my laptop.
This would also make BuildKite more suitable for use on Kubernetes, where pods get evicted, and on cloud platforms, where preemptible instances can be killed at short notice without enough time for builds to finish cleanly. Both cases produce false build failures and red "failed" badges on projects, which shouldn't happen but currently do (that is how I came to raise this ticket).
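
To make the request concrete, here is a rough sketch of the retry policy I have in mind: try the job on up to 4 distinct agents, and only retry when the agent itself disappears. This is not BuildKite code. The names `AgentLost`, `run_with_retries`, and the `job` callable are hypothetical and only illustrate the idea; a genuine build failure (the job returning `False`) is reported immediately and not retried.

```python
from typing import Callable, Iterable


class AgentLost(Exception):
    """Hypothetical signal: the agent running the job went away mid-build."""


def run_with_retries(
    job: Callable[[str], bool],
    agents: Iterable[str],
    max_attempts: int = 4,
) -> bool:
    """Run `job` on up to `max_attempts` distinct agents.

    Returns the job's own result as soon as an agent completes it.
    Returns False if every attempted agent was lost.
    """
    tried = set()
    for agent in agents:
        if len(tried) >= max_attempts:
            break  # retry budget exhausted
        if agent in tried:
            continue  # never retry on an agent that already went away
        tried.add(agent)
        try:
            return job(agent)  # real pass/fail result: stop here
        except AgentLost:
            continue  # agent went away: move on to the next one
    return False
```

With this policy, a laptop agent going to sleep would only fail the build if the next three agents were also lost.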