Ways of DRYing up pipeline YAML

(Continuing a Slack thread)

We have many repositories, each with 0-many pipelines spread across various YAML files.

Within a single pipeline file, we lean on YAML anchors and aliases to reduce repetition in the steps array. A simplified example:

anchors:
  linux_builders: &linux_builders
    agents:
      - "agent_count=8"
      - "boot_disk_size_gb=400"
      - "capable_of_building=platform"
      - "environment=${CI_ENVIRONMENT:-production}"
      - "experiment_normalised_upload_paths=true"
      - "machine_type=quad"
      - "node_stability=${CI_NODE_STABILITY:-interruptible}"
      - "permission_set=builder"
      - "platform=linux"
      - "queue=${CI_LINUX_BUILDER_QUEUE:-v4-20-08-10-093823-bk14145-93d4c873-d}"
      - "scaler_version=2"
      - "working_hours_time_zone=none"
  windows_builders: &windows_builders
    agents:
      ...
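  auto_retry_reasons:
    bk_system_error: &bk_system_error
      exit_status: -1
      limit: 3
    ...
    # Merge keys (<<) only accept mappings, so the retry list sits under retry/automatic.
    standard_auto_retries: &standard_auto_retries
      retry:
        automatic:
          - <<: *bk_system_error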

steps:
  - label: build foo
    <<: [*linux_builders, *standard_auto_retries]
    command: echo hi there

  - label: build bar
    <<: [*linux_builders, *standard_auto_retries]
    command: echo hi there

  ... tens of steps ...

(Our pipelines are more complex than the above)

This works all right within a single pipeline, but it

  • suffers because some parts of the pipeline are defined as arrays, which are harder to work with in YAML: a merge key can override individual keys of a dictionary, but an array can only be replaced wholesale
  • doesn’t let us DRY up several pipelines within the same repository that each contain the same boilerplate
  • doesn’t let us pull the boilerplate into dynamically generated pipelines very easily
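
To illustrate the first point, here is a minimal sketch of the array limitation. The `timeout_in_minutes` override and the `machine_type=octo` value are illustrative assumptions, not taken from our real pipelines.

```yaml
anchors:
  linux_builders: &linux_builders
    timeout_in_minutes: 30
    agents:
      - "platform=linux"
      - "machine_type=quad"

steps:
  - label: build foo
    <<: *linux_builders
    # Fine: an explicit key overrides the single merged mapping key.
    timeout_in_minutes: 60
    command: echo hi there

  - label: build baz
    <<: *linux_builders
    # To change only machine_type, the whole agents array must be restated.
    # YAML has no syntax for "take the merged array, then replace one element".
    agents:
      - "platform=linux"
      - "machine_type=octo"   # hypothetical value
    command: echo hi there
```

The merge is also shallow: any key set explicitly on the step replaces the anchored value for that key entirely, so nested structures cannot be adjusted piecemeal either.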

This means engineers will cargo-cult more, and understand less. We’ve definitely seen that happen.

It would be great if BK itself (agent and CLI) supported something that solved these problems.

We can clearly work around them in user space with templating or composition, but then the native tools can no longer work directly with our pipeline definitions, and we have to maintain a translation layer.

Personally, I like the kustomize.io approach of hierarchical patching because, unlike cruder text templating, it keeps the configuration as structured, typed data. Really, though, I wanted to find out whether BK would welcome contributions here or already has something in the works. Future-proofing and backwards compatibility are obviously key concerns, so I thought I’d ask.
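
For comparison, a patch-based approach can address individual array elements. The sketch below is a JSON Patch (RFC 6902) document written as YAML, which is one of the patch formats kustomize accepts. It assumes the fully resolved pipeline from the example above, where `machine_type=quad` is the sixth entry (index 5) of the first step's agents array. The replacement values are made up, and as far as I know nothing in the BK agent or CLI applies such patches today, so a user-space tool would have to.

```yaml
# JSON Patch (RFC 6902) operations, expressed as YAML.
# Assumption: applied to the pipeline after anchors and merge keys are resolved.

# Replace one element of the first step's agents array, addressed by index.
- op: replace
  path: /steps/0/agents/5
  value: "machine_type=octo"   # hypothetical value

# Append an element to the same array ("-" means end of array in RFC 6902).
- op: add
  path: /steps/0/agents/-
  value: "extra_tag=example"   # hypothetical agent tag
```

Index-based paths are brittle if the array is reordered, which is part of why the choice of patch semantics matters for future-proofing.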

I’d love some form of overlaying approach between triggering/triggered pipelines. Maybe not for the currently existing portions since debugging rendering issues might be a pain…

…but a special top-level “includes” node that overlays the current config on the triggering config (and so on), as a place to explicitly carry “general” configuration and context into downstream pipelines, would be awesome!

The patching tool from kustomize.io looks super cool :sunglasses:

This is a neat idea, and it’s generated deeper discussion internally around organization-level reuse and sharing of pipeline configs, but it will require more research and investigation at this stage.

Thanks for the feedback!