Jim
July 19, 2024, 6:49am
Hi.
I upgraded our Buildkite agent from 3.74.1 to 3.75.0, and the Docker images it then built for our apps caused an outage: the container runs as an unprivileged user, while the files in the image were owned by root and no longer readable by anyone else.
Previously the permissions on the files were 0644, but with the new agent they are 0600.
e.g.
Run on an old image produced by 3.74.1:
www-data@483f8a0bcd01:~/html$ ls -l public/admin/config/mandrill/offplatform-confirmation.html
-rw-r--r-- 1 root root 41291 Jul 19 05:03 public/admin/config/mandrill/offplatform-confirmation.html
Run on a new image produced by 3.75.0:
www-data@27cc42d4220f:~/html$ ls -l public/admin/config/mandrill/offplatform-confirmation.html
-rw------- 1 root root 41291 Jul 19 05:49 public/admin/config/mandrill/offplatform-confirmation.html
We’ll fix our pipelines up, but mentioning this as it may catch out others.
Hi @Jim ,
Thanks for bringing this to our attention. Are you able to send a build URL to support@buildkite.com so we can investigate further?
Cheers!
Jim
July 19, 2024, 7:40am
A bit more info here.
We had a step producing artifacts like so:
- label: 'Upload mjml html to artifacts'
  agents:
    docker: 'true'
    queue: build
  artifact_paths:
    - "public/admin/config/mandrill/*.html"
  command: .buildkite/mjml-to-html.sh
  key: upload_mjml_html
  plugins:
    - docker-compose#v4.16.0:
        cli-version: 2
        collapse-run-log-group: true
        dependencies: false
        run: node
    - ecr#v2.7.0:
        login: true
And then downloading them here:
- label: ':docker: Docker build :php:'
  agents:
    docker: 'true'
    queue: build
  key: build_docker_php_image
  plugins:
    - artifacts#v1.9.2:
        download: "public/admin/config/mandrill/*.html"
    - docker-compose#v4.16.0:
        cli-version: 2
        push:
          - nginx_php_fpm:$ECR_REPO_URL:$BUILDKITE_BRANCH-nginx-php-fpm
          - nginx_php_fpm:$ECR_REPO_URL:$DOCKER_BUILD_TAG-nginx-php-fpm
          - php_cli:$ECR_REPO_URL:$BUILDKITE_BRANCH-php-cli
          - php_cli:$ECR_REPO_URL:$DOCKER_BUILD_TAG-php-cli
    - ecr#v2.7.0:
        login: true
The Dockerfile used by the build step uses COPY . /var/www/html/ as one of its steps.
Jim
July 19, 2024, 7:57am
I guess it was the change here:
buildkite:main
← buildkite:artifact-integrity
opened 07:19AM - 08 Jul 24 UTC
### Description
While investigating the artifact download process a bit more, I discovered some fairly low-hanging fruit in the existing downloader.
### Context
Fixes #2774 by making one or another artifact the "winner", and printing a SHA256 sum of the content that it downloaded.
https://coda.io/d/_dHnUHNps1YO#View-3-of-Escalations-Table_tu3KF/r437&view=modal
### Changes
- Write to a temp file and move it to the destination name when ready
- Check the error return from Close
- Print a SHA256 sum of the content
- Add ability to verify SHA256 checksum (not yet used)
- Make error strings more Gothic
### Testing
- [x] Tests have run locally (with `go test ./...`). Buildkite employees may check this if the pipeline has run automatically.
- [x] Code is formatted (with `go fmt ./...`)
It used to use os.Create() to create the file, which applies the process umask to a mode of 0666 to determine the file's permissions. The new version uses os.CreateTemp(), which creates the temporary file with mode 0600 so that only the owning user can read or write it. The file keeps that mode when it is moved to its final name, hence the loss of the group and world permission bits on the final file.
Hello, @Jim! Thank you for pointing out the change that could be the cause of this issue. We will take a closer look there.
Jim
July 22, 2024, 1:19am
I see that you have now merged a fix that will be in a future release of the agent.
Thanks.
buildkite:main
← buildkite:artifact-integrity
opened 12:08AM - 22 Jul 24 UTC
### Description
The change from `os.Create` to `os.CreateTemp` changed the file permissions of the created file (from 0o666 to 0o600). This was unintentional.
### Context
#2878
### Testing
- [x] Tests have run locally (with `go test ./...`). Buildkite employees may check this if the pipeline has run automatically.
- [x] Code is formatted (with `go fmt ./...`)