Uploaded step in waiting_failed state

Hi everyone,

One of my pipeline steps uploads a pipeline with a single step that sends a Slack notification in case the step fails (see Setting up notifications within a pipeline step - #10 by john.soo).

The strange part is that the notify step sometimes ends up in waiting_failed state while it succeeds in other builds… Any idea what could cause this?

Hey @bunert :waving_hand:

By any chance, do you have any depends_on defined that are not met? Would you mind sharing your pipeline YAML (with any sensitive information masked)?

hey @lukasz-buildkite

We don’t have any explicit depends_on defined. The initial pipeline contains a step running a script:

steps:
  - label: "..."
    if: build.branch == "..."
    agents: ...
    env: ...
    command:
      - some initial commands...
      - ./script.sh
      - other commands...

The script.sh checks a return code and, if the return code is not 0, runs:

notify_failure() {
    buildkite-agent pipeline upload <<YAML
steps:
  - label: "..."
    command: "exit 1"
    notify:
      - slack:
          channels:
            - "#channel"
          message: "slack message..."
YAML
}
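
For context, the call site could look roughly like the sketch below. This is not the actual script: run_checked is a hypothetical wrapper name, and the only thing taken from the post is that notify_failure is called when a return code is not 0.

```shell
# Hypothetical wrapper (run_checked is an assumed name, not from the real script):
# run a command and upload the notify step only when it fails.
run_checked() {
    rc=0
    "$@" || rc=$?
    if [ "$rc" -ne 0 ]; then
        # Log to stderr so the failure is visible in the job log.
        echo "command failed with exit code $rc, uploading notify step" >&2
        # notify_failure is the function shown above.
        notify_failure
    fi
    return "$rc"
}
```

It would be used as `run_checked some_command arg1 arg2`, where some_command stands in for whatever the script actually runs.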

What seems strange to me is that sometimes it works and sometimes it doesn’t… What I noticed is that the build where it worked has the environment variable BUILDKITE_SOURCE="webhook" while the build where it was stuck has BUILDKITE_SOURCE="schedule".

Hey @bunert :waving_hand:

Thanks for providing more details regarding the pipeline. Interesting!
My hunch would be that the branch conditional is not met in all cases.
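
One way to check this would be to log the build context at the start of the script and compare the output between a working and a stuck build. A minimal sketch, assuming the standard BUILDKITE_SOURCE and BUILDKITE_BRANCH environment variables are set by the agent (log_build_context is just an illustrative name):

```shell
# Illustrative helper (the name is an assumption): print the build source and
# branch so webhook-triggered and scheduled builds can be compared in the log.
log_build_context() {
    echo "source=${BUILDKITE_SOURCE:-unset} branch=${BUILDKITE_BRANCH:-unset}"
}
```

Calling `log_build_context` before the pipeline upload would show whether scheduled builds run on a different branch than the one the conditional expects.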

We’d recommend reaching out to us via email at support@buildkite.com and providing a link to the problematic pipeline, so we can examine it in more detail.

Thanks!