I would love to have something like an “External” job type. In the UI it would be represented similarly to a Command type, but the difference is that the job isn’t scheduled via agents.
Instead, the job can be accepted/updated/finished via the GraphQL API.
This could allow external operations, such as callbacks from an “eventually consistent system” like a Kube deploy, to influence the Buildkite pipeline without having to “hog” an actual agent that would just sit there in a tight loop asking whether the work has completed.
Think of a command step today that does something like:
trigger_external_operation()
while external_operation_not_completed():
    wait
In these situations, some external system is already responsible for observing the task. It would be wasteful to have an agent just sit there in a loop and repeatedly ask “are you done yet?”
Additionally, there are situations where the Buildkite agents only have indirect access to the external system in question. Consider a use case where the Buildkite agent just pushes some Kubernetes manifests to a GitOps repo, and now has to wait for the GitOps engine (something like Argo CD) to reconcile the change to a target cluster.
Buildkite agents may not actually have access to query Argo CD directly.
Having a step like this External Job Step could allow a system like Argo CD to asynchronously notify Buildkite of progress. It could “accept” the job once work starts, push arbitrary log output, and also finalize a job with an exit code.
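To make the accept / log / finalize flow a bit more concrete, here is a rough sketch of the lifecycle as a small in-memory state machine in Python. Everything in it is hypothetical: `ExternalJob`, `JobState`, and the method names are made up for illustration and do not correspond to anything in the Buildkite GraphQL API or Agent API today. It only shows which transitions such a job type might allow.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class JobState(Enum):
    # Hypothetical states for the proposed External job type.
    SCHEDULED = "scheduled"  # waiting for the external system to accept it
    RUNNING = "running"      # accepted; log output may be pushed
    FINISHED = "finished"    # finalized with an exit code


@dataclass
class ExternalJob:
    """Illustrative model only; not a real Buildkite type."""
    job_id: str
    state: JobState = JobState.SCHEDULED
    log: List[str] = field(default_factory=list)
    exit_code: Optional[int] = None

    def accept(self) -> None:
        # The external system (e.g. a GitOps engine) signals that work started.
        if self.state is not JobState.SCHEDULED:
            raise RuntimeError(f"cannot accept a job in state {self.state.value}")
        self.state = JobState.RUNNING

    def append_log(self, chunk: str) -> None:
        # Arbitrary log output is only accepted while the job is running.
        if self.state is not JobState.RUNNING:
            raise RuntimeError("log output is only accepted while running")
        self.log.append(chunk)

    def finish(self, exit_code: int) -> None:
        # Finalize the job; a non-zero exit code would fail the step.
        if self.state is not JobState.RUNNING:
            raise RuntimeError("only a running job can be finished")
        self.exit_code = exit_code
        self.state = JobState.FINISHED

    @property
    def passed(self) -> bool:
        return self.state is JobState.FINISHED and self.exit_code == 0


# Example: the calls an external system might make over the job's lifetime.
job = ExternalJob(job_id="deploy-1")
job.accept()
job.append_log("sync started\n")
job.append_log("sync succeeded\n")
job.finish(0)
```

In a real implementation each of these calls would presumably be an authenticated API request from the external system rather than a local method call, and no agent would be occupied between `accept` and `finish`.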
I realize that this would potentially conflate the duties between the Agent API and the GraphQL API. Maybe this could be implemented on the Agent API instead of GraphQL?
And because I love mockups, here’s a mockup ;-)