Is buildkite-agent intended to be used on preemptible instances?

I am running buildkite-agent inside a Google Kubernetes Engine cluster on a preemptible node pool. When a node goes down (or when I simulate this by killing the pod), the agent stays in Buildkite's agent list for a very long time. I know there is a health-check-addr option, but it is not viable for agents running inside private clusters.
Is there any way to set some kind of timeout which would be tracked on the Buildkite API side?

I suppose the perfect solution would be an option like eviction-id, which tells the API that when a new agent spawns, the old one with the same eviction-id should be stopped by the Buildkite API.

Hi Andrey,

I am not familiar with this, but I think it is OK to use the agent on preemptible instances. There is similar work that adds scripting around the agent's lifecycle. Here is an example of such scripting from our Elastic CI Stack for AWS:
https://github.com/buildkite/elastic-ci-stack-for-aws/pull/737

I found that Google has an article on terminating with grace:
https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-terminating-with-grace

I cannot see a way to have Kubernetes send a double SIGTERM to the agent, but you could implement a preStop hook that sends the first SIGTERM and then sleeps for 20 seconds. Once that completes, Kubernetes will send the second SIGTERM, which will force the job to stop so the agent can deregister.
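A preStop hook along those lines might look like the following pod spec fragment. This is only a sketch: the container name, image tag, grace period, and 20-second sleep are assumptions, and it assumes the agent process runs as PID 1 in the container.

```yaml
# Hypothetical pod spec fragment -- names and values are illustrative.
spec:
  # Give the agent enough time to wind down before Kubernetes SIGKILLs it.
  terminationGracePeriodSeconds: 60
  containers:
    - name: buildkite-agent
      image: buildkite/agent:3
      lifecycle:
        preStop:
          exec:
            # First SIGTERM asks the agent to stop gracefully; the sleep
            # holds the hook open so that when Kubernetes then sends its
            # own SIGTERM, the agent treats it as the second signal,
            # forces the job to stop, and deregisters.
            command: ["/bin/sh", "-c", "kill -TERM 1; sleep 20"]
```

The sleep duration should be shorter than terminationGracePeriodSeconds, otherwise Kubernetes will SIGKILL the container before the second SIGTERM has any effect.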

I hope this helps!

Cheers,
Juanito

Is this a suggestion, or does this option already exist to fix the issue?

Sorry, for now on GCP you would need to implement scripting similar to https://github.com/buildkite/elastic-ci-stack-for-aws/pull/737.