Deploying Pixi environments with Linux containers
Deploying Pixi environments¶
We now know how to create Pixi workspaces with environments that support CUDA-enabled code. However, unless your production machine learning environment is a lab desktop with GPUs and plenty of disk [1] where you can install Pixi and run your code directly, we still need a way to get our Pixi environments onto our production machines.
There is one very straightforward solution:
- Track your Pixi manifest and Pixi lock files alongside your analysis code in a version control system (e.g. Git).
- Clone your repository to the machine that you want to run on.
- Install Pixi onto that machine.
- Install the locked Pixi environment that you want to use.
- Execute your code in the installed environment.
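As shell commands, that workflow is roughly the following sketch (the repository URL, environment name, and script path are all placeholders):

```shell
# Clone the repository that version controls the manifest, lock file, and code
git clone https://github.com/<your GitHub username>/<your repository>.git
cd <your repository>

# Install Pixi onto the machine (the official install script; see pixi.sh)
curl -fsSL https://pixi.sh/install.sh | sh

# Install the exact locked environment (here a hypothetical one named 'gpu')
pixi install --locked --environment gpu

# Execute code inside the installed environment
pixi run --environment gpu python app/main.py
```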
That’s a nice and simple story, and it can work! However, in most realistic scenarios the worker compute nodes that execute code share resource pools of storage and memory and are restricted to small allotments of both. CUDA binaries are relatively large files, and the memory and storage needed just to unpack them can easily exceed the standard 2 GB memory limit on most high-throughput computing (HTC) facility worker nodes. This approach also requires direct access to the public internet, or for you to set up an S3 object store behind your compute facility’s firewall with all of your conda packages mirrored into it. In many scenarios, public internet access at HTC and high-performance computing (HPC) facilities is limited to a select “allow list” of websites, or it may be fully restricted for users.
Building Linux containers with Pixi environments¶
A more standard and robust way of distributing computing environments is the use of Linux container technology — like Docker or Apptainer.
Resources on Linux containers
Linux containers are a full topic unto themselves and we won’t cover them in this lesson. If you’re not familiar with Linux containers, here are introductory resources:
- Reproducible Computational Environments Using Containers: Introduction to Docker, a The Carpentries Incubator lesson
- Introduction to Docker and Podman by the High Energy Physics Software Foundation
- Reproducible computational environments using containers: Introduction to Apptainer, a The Carpentries Incubator lesson
Docker is a very common Linux container runtime and container image builder. We can use docker build to build a Linux container image from a Dockerfile instruction file. Luckily, to install Pixi environments into Docker container images there is effectively only one Dockerfile recipe that needs to be written, and it can then be reused across projects.
FROM ghcr.io/prefix-dev/pixi:noble AS build
WORKDIR /app
COPY . .
ENV CONDA_OVERRIDE_CUDA=<cuda version>
RUN pixi install --locked --environment <environment>
RUN echo "#!/bin/bash" > /app/entrypoint.sh && \
pixi shell-hook --environment <environment> -s bash >> /app/entrypoint.sh && \
echo 'exec "$@"' >> /app/entrypoint.sh
FROM ghcr.io/prefix-dev/pixi:noble AS final
WORKDIR /app
COPY --from=build /app/.pixi/envs/<environment> /app/.pixi/envs/<environment>
COPY --from=build /app/pixi.toml /app/pixi.toml
COPY --from=build /app/pixi.lock /app/pixi.lock
# The ignore files are needed for 'pixi run' to work in the container
COPY --from=build /app/.pixi/.gitignore /app/.pixi/.gitignore
COPY --from=build /app/.pixi/.condapackageignore /app/.pixi/.condapackageignore
COPY --from=build --chmod=0755 /app/entrypoint.sh /app/entrypoint.sh
COPY ./app /app/src
EXPOSE <PORT>
ENTRYPOINT [ "/app/entrypoint.sh" ]
Dockerfile walkthrough
Let’s step through this to understand what’s happening. Dockerfiles (intentionally) look very shell-script-like, so we can read most of it as if we were typing the commands directly into a shell (e.g. Bash).
- The Dockerfile assumes it is being built from a version control repository where any code that it will need to execute later exists under the repository's app/ directory, and where the Pixi workspace's pixi.toml manifest file and pixi.lock lock file exist at the top level of the repository.
- The entire repository contents are COPYed from the container build context into the /app directory of the container build.
WORKDIR /app
COPY . .
- It is not reasonable to expect that the container image build machine contains GPUs. To have Pixi still be able to install an environment that uses CUDA when the __cuda virtual package is not present, set the override environment variable CONDA_OVERRIDE_CUDA.
ENV CONDA_OVERRIDE_CUDA=<cuda version>
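The same override works outside of containers too, e.g. when installing the environment on a GPU-less CI runner or login node (the CUDA version and environment name below are illustrative placeholders; match them to your workspace):

```shell
# Pretend a CUDA 12.9 driver is present so the __cuda virtual package
# is satisfied even though this machine has no GPU
export CONDA_OVERRIDE_CUDA=12.9

# Pixi can now solve and install the locked CUDA-enabled environment
pixi install --locked --environment gpu
```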
- The Dockerfile uses a multi-stage build: it first installs the target environment <environment>, and then creates an ENTRYPOINT script using pixi shell-hook to automatically activate the environment when the container image is run.
RUN pixi install --locked --environment <environment>
RUN echo "#!/bin/bash" > /app/entrypoint.sh && \
pixi shell-hook --environment <environment> -s bash >> /app/entrypoint.sh && \
echo 'exec "$@"' >> /app/entrypoint.sh
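The resulting /app/entrypoint.sh will look roughly like the sketch below: pixi shell-hook emits the shell code that activates the environment (the exact export lines depend on the environment; those shown are illustrative), and the final exec "$@" hands control to whatever command the container is run with.

```shell
#!/bin/bash
# Activation lines generated by 'pixi shell-hook' (illustrative sketch;
# the real output may set additional environment variables)
export PATH="/app/.pixi/envs/<environment>/bin:$PATH"
export CONDA_PREFIX="/app/.pixi/envs/<environment>"
# Run the command passed to the container inside the activated environment
exec "$@"
```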
- The next stage of the build starts from a fresh container image and then COPYs the installed environment and files from the build stage into the production container image. This can reduce the total size of the final container image if additional build tools were installed in the build stage that aren't required at runtime in production.
FROM ghcr.io/prefix-dev/pixi:noble AS final
WORKDIR /app
COPY --from=build /app/.pixi/envs/<environment> /app/.pixi/envs/<environment>
COPY --from=build /app/pixi.toml /app/pixi.toml
COPY --from=build /app/pixi.lock /app/pixi.lock
# The ignore files are needed for 'pixi run' to work in the container
COPY --from=build /app/.pixi/.gitignore /app/.pixi/.gitignore
COPY --from=build /app/.pixi/.condapackageignore /app/.pixi/.condapackageignore
COPY --from=build --chmod=0755 /app/entrypoint.sh /app/entrypoint.sh
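To see whether the multi-stage split actually saved space, the layer sizes of a built image can be inspected (the image name and tag are placeholders):

```shell
# Show each layer of the final image and its size
docker history ghcr.io/<your GitHub username>/<your repository>:latest

# Compare total image sizes in the local image store
docker images
```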
- Code that is specific to application purposes (e.g. environment diagnostics) is COPYed from the repository into the final container image as well
COPY ./app /app/src
- Any ports that need to be exposed for I/O are exposed
EXPOSE <PORT>
- The ENTRYPOINT script is set to activate the environment
ENTRYPOINT [ "/app/entrypoint.sh" ]
With this Dockerfile the container image can then be built with docker build.
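For example, from the directory containing the Dockerfile (the image name and tag are placeholders):

```shell
# Build the container image from the Dockerfile in the current directory,
# tagging it for the GitHub Container Registry
docker build -f Dockerfile -t ghcr.io/<your GitHub username>/<your repository>:latest .
```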
Automation with GitHub Actions workflows¶
In the personal GitHub repository that we've been working in, create a GitHub Actions workflow directory
mkdir -p .github/workflows
and then add the following workflow file as .github/workflows/docker.yaml
name: Docker Images

on:
  push:
    branches:
      - main
    tags:
      - 'v*'
    paths:
      - 'cuda-exercise/pixi.toml'
      - 'cuda-exercise/pixi.lock'
      - 'cuda-exercise/Dockerfile'
      - 'cuda-exercise/.dockerignore'
      - 'cuda-exercise/app/**'
  pull_request:
    paths:
      - 'cuda-exercise/pixi.toml'
      - 'cuda-exercise/pixi.lock'
      - 'cuda-exercise/Dockerfile'
      - 'cuda-exercise/.dockerignore'
      - 'cuda-exercise/app/**'
  release:
    types: [published]
  workflow_dispatch:

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

permissions: {}

jobs:
  docker:
    name: Build and publish images
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Docker meta
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: |
            ghcr.io/${{ github.repository }}
          # generate Docker tags based on the following events/attributes
          tags: |
            type=raw,value=noble-cuda-12.9
            type=raw,value=latest
            type=sha

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Login to GitHub Container Registry
        if: github.event_name != 'pull_request'
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Test build
        id: docker_build_test
        uses: docker/build-push-action@v6
        with:
          context: cuda-exercise
          file: cuda-exercise/Dockerfile
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          pull: true

      - name: Deploy build
        id: docker_build_deploy
        uses: docker/build-push-action@v6
        with:
          context: cuda-exercise
          file: cuda-exercise/Dockerfile
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          pull: true
          push: ${{ github.event_name != 'pull_request' }}
This will build your Dockerfile in GitHub Actions CI into a linux/amd64 platform Docker container image and then deploy it to the GitHub Container Registry (ghcr) associated with your repository.
When your container image has finished building, you can view it and its tags on GitHub at: https://github.com/<your GitHub username>/reproducible-ml-scipy-2025/pkgs/container/reproducible-ml-scipy-2025
Using the containerized environment on Brev¶
To verify that things are visible to other computers, install the Linux container utility crane
with pixi global
pixi global install crane
└── crane: 0.20.5 (installed)
└─ exposes: crane
and then use crane ls to list all of the tags for the particular container image in your container registry
crane ls ghcr.io/<your GitHub username>/reproducible-ml-scipy-2025
Our Brev environment already has Docker installed, so we can now pull down the Docker container image using docker pull.
docker pull ghcr.io/<your GitHub username>/reproducible-ml-scipy-2025:sha-<the commit sha short>
Once that is finished, we can confirm that the container image exists in our local image store with
docker images
We can then use docker run to start and attach to a container instance with our software environment active inside of it
docker run \
--rm \
-ti \
--gpus all \
-v $PWD:/work \
ghcr.io/<your GitHub username>/reproducible-ml-scipy-2025:sha-<the commit sha short> bash
docker run walkthrough
Let's step through the docker run command to understand what's happening.
- --rm: Automatically remove the container and its associated anonymous volumes when it exits
- -t, --tty: Allocate a pseudo-TTY (for interactive use)
- -i, --interactive: Keep STDIN open even if not attached (for interactive use)
- --gpus: GPU devices to add to the container (all to pass all GPUs)
- -v, --volume: Bind mount a volume from <path on host machine>:<path inside Docker container>
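As a quick check that the GPUs are actually visible inside the container, the same image can also be run non-interactively with nvidia-smi as the command (this assumes the NVIDIA Container Toolkit is set up on the host, as it is on Brev):

```shell
# Run nvidia-smi inside the container; the entrypoint script activates the
# environment and then execs the command we pass in
docker run --rm --gpus all \
    ghcr.io/<your GitHub username>/reproducible-ml-scipy-2025:sha-<the commit sha short> \
    nvidia-smi
```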
This is a valid and effective way to deploy Pixi environments to production machines.