The more precise dependencies of the DAG are awesome, and we’ve been excited to try them. However, we’ve found what seems to be a bug.
Minimal Example
Given:
dag: true
steps:
  - label: "test it"
    command: "true"
    depends_on:
      - "not-known"
A build using this pipeline sits in Running forever, even though no steps are running at all and so there is no chance of any further progress being made.
Background
We have a trigger step whose configuration is generated dynamically, so we have to add it with a pipeline upload. We still want to wait for the trigger step to succeed, which with the DAG means depending on it (we set the same ID for the step we upload dynamically, no matter what). However, the upload step itself depends on earlier testing, so it may not run, and in that case the step that the later step depends on is never created.
Simplified significantly, it looks something like:
dag: true
steps:
  - id: "run-tests"
    command: "true"
  - id: "upload-step"
    command: >-
      echo '{"steps":[{"id":"dynamic-step","command":"true"}]}' | buildkite-agent pipeline upload
    depends_on:
      - "run-tests"
  - id: "after-trigger"
    command: "true"
    depends_on:
      - "dynamic-step"
If run-tests fails (e.g. command: "false"), upload-step never runs, so after-trigger depends on an ID that doesn’t exist. The build then sits in Running forever, with run-tests failed, upload-step not run, dynamic-step not existing and after-trigger waiting.
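To make the dangling reference concrete, here is a minimal Python sketch that lists depends_on targets which no statically declared step provides. It is only an illustration: unknown_dependencies is a hypothetical helper, not part of buildkite-agent, and it assumes the pipeline has already been parsed into a plain dict with steps identified by id as in the example above.

```python
# Hypothetical helper, not part of buildkite-agent: report depends_on targets
# that no statically declared step provides.
def unknown_dependencies(pipeline):
    steps = pipeline.get("steps", [])
    # IDs that exist before any dynamic upload has happened.
    known = {step["id"] for step in steps if "id" in step}
    missing = {}
    for step in steps:
        deps = step.get("depends_on", [])
        if isinstance(deps, str):  # tolerate a single string as well as a list
            deps = [deps]
        unknown = [dep for dep in deps if dep not in known]
        if unknown:
            name = step.get("id") or step.get("label") or "<unnamed>"
            missing[name] = unknown
    return missing


# The simplified pipeline from this post, written as a dict.
pipeline = {
    "steps": [
        {"id": "run-tests", "command": "true"},
        {
            "id": "upload-step",
            "command": "buildkite-agent pipeline upload",
            "depends_on": ["run-tests"],
        },
        {"id": "after-trigger", "command": "true", "depends_on": ["dynamic-step"]},
    ],
}

print(unknown_dependencies(pipeline))  # {'after-trigger': ['dynamic-step']}
```

A check like this cannot tell whether dynamic-step will be uploaded later. It only shows that, until the upload happens, after-trigger points at an ID the build does not contain.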
Attempted workaround
A workaround I tried was to have the after-trigger step also depend on an additional earlier step, e.g.
  - id: "after-trigger"
    command: "true"
    depends_on:
      - "run-tests"
      - "dynamic-step"
or
  - id: "after-trigger"
    command: "true"
    depends_on:
      - "upload-step"
      - "dynamic-step"
Or both.
The thinking was that if some of its dependencies fail (or otherwise don’t run), the pipeline would register that after-trigger can never run and so fail the whole build, but it seems the unknown dependency still “wins”.
For instance, with the second adjustment above, the build still sticks at Running.
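The behaviour I expected from the workaround can be written down as a toy model. This is a sketch of the expectation only, not of how Buildkite actually schedules steps: never_runnable is a hypothetical function, and it assumes a failed or blocked dependency blocks its dependents while an unknown ID is simply treated as still pending.

```python
# Toy model of the expected semantics (an assumption, not Buildkite's scheduler):
# a step can never run once any of its dependencies has failed or is itself blocked.
def never_runnable(steps, failed):
    """steps maps step id -> list of dependency ids; failed is a set of step ids."""
    blocked = set()
    changed = True
    while changed:  # repeat until no new step becomes blocked
        changed = False
        for step_id, deps in steps.items():
            if step_id in blocked or step_id in failed:
                continue
            # Unknown ids are in neither set, so they count as pending here.
            if any(dep in failed or dep in blocked for dep in deps):
                blocked.add(step_id)
                changed = True
    return blocked


# Second adjustment: after-trigger also depends on upload-step.
steps = {
    "run-tests": [],
    "upload-step": ["run-tests"],
    "after-trigger": ["upload-step", "dynamic-step"],
}

print(sorted(never_runnable(steps, failed={"run-tests"})))
# ['after-trigger', 'upload-step']
```

Under this model the extra dependency is enough to mark after-trigger as blocked, so the build could finish as failed instead of waiting on dynamic-step.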