Provide termination reason to agent-shutdown hook

solemnwarning · May 14, 2025, 1:09am

Hello,

I have a cluster of agents on physical machines and VMs which are started on demand. I use the disconnect-after-idle-timeout option to stop the agent process when there is no work, and OS-specific methods of varying hackyness to shut the machine down after that.

It would be nice if the agent process termination reason (e.g. signal, timeout, something else…) were provided to the agent-shutdown hook. Then I could simply have a hook which schedules a shutdown of the host system when the agent exits because the idle timeout expired, while manually stopping the agent (e.g. for maintenance) would skip the system shutdown.
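
Roughly, such a hook could look like the sketch below. The environment variable name and its `idle-timeout` value are hypothetical placeholders (the agent does not provide anything like them today), and the `shutdown -h +1` call assumes a Linux host.

```python
#!/usr/bin/env python3
"""Sketch of an agent-shutdown hook that powers off the host only after an idle timeout."""
import os
import subprocess

# HYPOTHETICAL: the agent does not set this variable today; the name is made up.
REASON_VAR = "EXAMPLE_AGENT_SHUTDOWN_REASON"
# HYPOTHETICAL: value the agent might report when the idle timeout expired.
IDLE_REASON = "idle-timeout"


def should_power_off(reason):
    """Power off only when the agent stopped because it was idle."""
    return reason == IDLE_REASON


def main(environ=os.environ, run=subprocess.run):
    """Schedule a host shutdown if the (hypothetical) reason is the idle timeout.

    Returns True when a shutdown was scheduled, False otherwise.
    """
    reason = environ.get(REASON_VAR, "")
    if should_power_off(reason):
        # Assumes Linux: halt the machine one minute from now.
        run(["shutdown", "-h", "+1"], check=False)
        return True
    # Signal, manual stop, or unknown reason: leave the host running.
    return False


if __name__ == "__main__":
    main()
```

With this shape, stopping the agent by hand for maintenance leaves the variable at some other value (or unset), so the host stays up.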

Thanks

Priya · May 14, 2025, 3:58am

Hey @solemnwarning

Welcome to the Buildkite community!

Could you share how you’re currently running your agents? Are they part of a self-managed fleet, or are you using our Elastic CI Stack on AWS?

At the moment, the shutdown reason isn’t available to pass into the agent-shutdown hook. However, the agent logs do emit the message "All agents have been idle for N seconds. Disconnecting" before shutting down. Would it be possible for you to monitor this log line and trigger a scheduled shutdown of the host accordingly?
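
As a minimal sketch of that log-monitoring approach: the script below scans a saved agent log after the agent process has exited and schedules a host shutdown only if the idle message appears. It assumes the message wording is exactly as quoted above with N as a whole number, that the agent output is written to a file you pass as the first argument, and that the host is Linux (`shutdown -h +1`).

```python
#!/usr/bin/env python3
"""Sketch: shut the host down only if the agent log shows an idle disconnect."""
import re
import subprocess
import sys

# Assumption: the log line matches the wording quoted in this thread, with N as digits.
IDLE_PATTERN = re.compile(r"All agents have been idle for \d+ seconds\. Disconnecting")


def saw_idle_disconnect(lines):
    """Return True if any log line contains the idle-disconnect message."""
    return any(IDLE_PATTERN.search(line) for line in lines)


def main(lines, run=subprocess.run):
    """Schedule a host shutdown when the log shows the agent stopped due to idleness.

    Returns True when a shutdown was scheduled, False otherwise.
    """
    if saw_idle_disconnect(lines):
        # Assumes Linux: halt the machine one minute from now.
        run(["shutdown", "-h", "+1"], check=False)
        return True
    return False


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: run this after the agent exits, passing the path to its log file.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        main(log_file)
```

Running it from whatever wrapper starts the agent, once the agent process has exited, keeps manual stops from triggering a shutdown, since those do not log the idle message.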

Just to clarify, the disconnect-after-idle-timeout setting only takes effect once all agents on a host are idle and ready to disconnect.

Let us know what you think!

Cheers,
Priya

solemnwarning · May 14, 2025, 10:08pm

Hi Priya,

The agents are part of a self-hosted cluster. I could write something to monitor the log for that specific message, but that falls under the “varying hackyness” bucket I’ve already got.

I’ve prototyped a change to the agent which adds some details of the shutdown reason when running the agent-shutdown hook. Hopefully it can be developed into a proper feature: “[PROTOTYPE] Inform agent-shutdown hook of reason for agent shutdown.” (Pull Request #3315, buildkite/agent on GitHub)

Priya · May 14, 2025, 11:00pm

Hey @solemnwarning

Appreciate your PR! We’ll share it with our team and follow up in the discussion on the PR.

Cheers,
Priya
