Buildkite clusters and wait/block step - unsupported?

I am using the experimental Buildkite Clusters feature, and it doesn’t seem to support a simple “wait” step.

This is the entirety of the pipeline I’m trying to run (I’ve defined it directly within the pipeline settings, but it also fails when run via a pipeline upload):

steps:
  - label: ':wave: Hello world 1'
    command:
      - echo "Hello world"
    agents:
      queue: default
  - label: ':wave: Hello world 2'
    command:
      - echo "Hello world"
    agents:
      queue: default
  - wait
  - label: ':wave: Goodbye world'
    command:
      - echo "Goodbye world"
    agents:
      queue: default

I have created agents listening to the default queue within the cluster, and I also have an agent listening to the default queue outside the cluster (i.e. no cluster).

This pipeline runs fine when the pipeline is set to target “No cluster”. However, when I change it to point to my Buildkite cluster and run the pipeline, Buildkite fails it immediately, without even letting the job get queued (i.e. the run attempt doesn’t show up in the list of builds). Instead, Buildkite’s UI shows the error No queue specified.

If I remove the - wait line, it runs fine within the Buildkite cluster (although obviously without the wait).

Leaving the wait line in, and changing the pipeline back to “No cluster” lets this step work immediately.

I’ve tried changing it to say

- wait:
  agents:
    queue: default

but this is rejected because agents is not a valid attribute for a wait step.

On a related note, I’ve found that the block step does not work either. Replacing the wait above with:

- block:
  blocked_state: running
  key: block-step
  if: build.env("SKIP_INPUT") != "TRUE"

also results in a No queue specified error.

Note that the queue name “default” doesn’t have anything to do with this; I discovered this issue when testing with a queue that was not named “default”. It also has nothing to do with the steps being defined directly in the pipeline settings rather than in pipeline.yml; when I have Buildkite upload the same pipeline via a build step, it runs the upload step but reports it as a failure, with the build output showing the same error No queue specified.

Do Buildkite clusters support wait or block steps? If not, when will they be supported? Is the only workaround to specify step dependencies explicitly (with no workaround for block steps)?
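
For reference, the explicit-dependency workaround I have in mind would look roughly like the sketch below. The key values (hello-1, hello-2) are made up for illustration, and I have not confirmed that this avoids the error in a cluster:

```yaml
steps:
  - label: ':wave: Hello world 1'
    key: hello-1            # hypothetical key, used only for depends_on
    command:
      - echo "Hello world"
    agents:
      queue: default
  - label: ':wave: Hello world 2'
    key: hello-2            # hypothetical key, used only for depends_on
    command:
      - echo "Hello world"
    agents:
      queue: default
  - label: ':wave: Goodbye world'
    command:
      - echo "Goodbye world"
    # Stands in for the wait step: run only after both steps above finish
    depends_on:
      - hello-1
      - hello-2
    agents:
      queue: default
```

This only replaces wait; it does nothing for a block step, which still needs to pause the build for input.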

Hello @mbarrien!

Hope you are well and welcome to the Buildkite Community Forum! :wave:

This looks like a case where the cluster’s default queue has the Default queue option disabled. When this value is disabled, builds for pipelines that are assigned to that cluster and target the default queue cannot be scheduled (assuming the cluster has only that default queue and no others). Would you be able to check within that cluster whether the default queue has this option disabled?

To answer the question of whether wait and block steps are supported for pipelines that are part of a cluster: they definitely are!
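
As a rough sketch of what that looks like in practice, assuming the cluster has a queue with the Default queue option enabled: steps that name no queue (which includes wait, since it cannot carry an agents attribute) are expected to fall back to that default queue, so a pipeline like the one below should be schedulable. This is an illustration under that assumption, not something verified against this particular cluster:

```yaml
steps:
  - label: ':wave: Hello world'
    command:
      - echo "Hello world"
    # No agents/queue here: assumed to use the cluster's default queue
  - wait
  - label: ':wave: Goodbye world'
    command:
      - echo "Goodbye world"
```

Steps that do set agents with a queue should continue to target the queue they name.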

There was no default queue enabled in this particular cluster.

I see what happened now, and this now turns into a feature request.

I created the cluster and queues via Terraform. When you create a cluster from scratch in Terraform, it does not create the default queue (unlike creating the cluster in the UI, which does). I explicitly created the default queue via Terraform too, but creating it via Terraform does not set the flag that allows it to be the default.

The feature request: Buildkite’s Terraform provider does not seem to have any way to set the default queue. I filed an issue on GitHub for this: “Add support for setting cluster default queue”, Issue #377 in buildkite/terraform-provider-buildkite. Unfortunately, this gap makes it difficult for me to automate this completely.

Compounding this issue is that when you don’t have a default queue, there is no indicator in the UI that the concept of a default queue for a cluster even exists. All the documentation only talks about the name “default” but never this checkbox. On top of that, this checkbox is hidden 4 clicks deep (Click on cluster, click on Queues, click on queue name, click on edit), which made this impossible to discover if you didn’t know this concept existed! I’d love for this to be at the cluster settings level rather than in the queue setting.

Thanks @mbarrien - all makes sense there!

My team (in a great stroke of fortune) manages Buildkite’s Terraform Provider - and I actually wrote the integration for Cluster Queues - so this is very well known!

Right now, the Cluster Queue resource doesn’t have the ability to set default queues, as that’s needed in the respective Cluster queue GraphQL mutation. Happy for you to file an issue there, as the team and I will be able to triage it and work towards investigating it and scoping the work involved in a potential solution.

Cheers :slightly_smiling_face: