Docker-compose cache-from directive not working with v6 of the AWS stack

Hi.

I raised this issue on your Slack channel a couple of months ago, where I think it was assigned to an engineer, but it still doesn’t seem to be fixed, so I’m raising it on these forums in case it has been forgotten or lost.

We have a Buildkite step like so that works fine on version 5 of the AWS Elastic Stack:

  - label: ':docker: Docker build :php:'
    agents:
      docker: 'true'
      queue: build
    key: build_docker_php_image
    plugins:
      - artifacts#v1.9.0:
          download: "public/admin/config/mandrill/*.html"
      - docker-compose#v4.14.0:
          build: php
          cache-from: $ECR_REPO_URL:php-$BUILDKITE_BRANCH
          image-name: php-$DOCKER_BUILD_TAG
          image-repository: $ECR_REPO_URL
      - ecr#v2.7.0:
          login: true
    <<: *automatic_retry

The pipeline also defines the following for the variables referenced in that step:

env:
  DOCKER_BUILD_TAG: $BUILDKITE_BRANCH-$BUILDKITE_BUILD_NUMBER
  ECR_REPO_URL: 759931498410.dkr.ecr.ap-southeast-2.amazonaws.com/$BUILDKITE_PIPELINE_SLUG

When switching to v6 (now on v6.7.1) the pipeline fails with the following error:

Running plugin docker-compose command hook
$ /var/lib/buildkite-agent/plugins/github-com-buildkite-plugins-docker-compose-buildkite-plugin-v4-14-0/hooks/command
/var/lib/buildkite-agent/plugins/github-com-buildkite-plugins-docker-compose-buildkite-plugin-v4-14-0/hooks/../commands/build.sh: line 79: cache_from__759931498410_dkr_ecr_ap_southeast_2_amazonaws_com/megatron: invalid variable name
🚨 Error: The command exited with status 1

The Buildkite agent version is the same in both stacks: v3.55.0.

I have also tried specifying the cli-version: 2 option for the docker-compose plugin, with no change to the result.

If I remove the cache-from directive then the step passes (but without the benefit of caching to speed things up).

Hey Jim!

Welcome to the community!
Sorry for the delay 🙏

Thanks for the detailed explanation of the problem; it helped me reproduce and debug the issue. There are a couple of things going on here:

  1. The syntax of the cache-from property is app:image-repo:tag:grouping, for example app:index.docker.io/myorg/myrepo/myapp:latest, as described in the plugin's README (GitHub: buildkite-plugins/docker-compose-buildkite-plugin).
    In this case the value is missing the service name at the front. It should be: php:$ECR_REPO_URL:php-$BUILDKITE_BRANCH

  2. The Elastic Stack v6 uses bash v5.2.15, and Elastic Stack v5 uses bash v4.2.46. The older version was more permissive and didn’t throw an error when an indirect expansion used a variable name containing invalid characters, which is why your builds on stack v5 didn’t fail.

In your case, because the cache-from value is not in the expected format, the repository URL is misinterpreted as the service portion of the option value. The / in that URL is not a valid character in a variable name, which is where the invalid variable name error comes from.

The bug here is the older bash not failing when it should have. Validating the option is also complicated because its syntax has evolved over time (it can have anywhere from 2 to 4 fields). We should probably be more explicit in the documentation or add some further validation for it.

If you update the cache-from property and pass the service, it should work as expected 🙂
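
For reference, the plugin section of the step from the original post might look like the sketch below once the service name is prepended. Only the cache-from line differs from the question; the service name php is assumed to match the one passed to build, and the fragment is untested here.

```yaml
    plugins:
      - artifacts#v1.9.0:
          download: "public/admin/config/mandrill/*.html"
      - docker-compose#v4.14.0:
          build: php
          # Format is service:image-repo:tag, so the service (php) comes first.
          cache-from: php:$ECR_REPO_URL:php-$BUILDKITE_BRANCH
          image-name: php-$DOCKER_BUILD_TAG
          image-repository: $ECR_REPO_URL
      - ecr#v2.7.0:
          login: true
```

With the service in the first field, the repository URL (and its / characters) is no longer the part of the value treated as the service name.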

Awesome. This was indeed the problem. Thanks for your help. This can be marked as resolved.

