
Reproducible Machine Learning Workflows for Scientists with Pixi

Authors
Affiliations
University of Wisconsin-Madison
Prefix.dev
NVIDIA

Abstract

Scientific researchers need reproducible software environments for complex applications that run across heterogeneous computing platforms. Modern open source tools such as Pixi make all dependencies automatically reproducible while offering a high-level interface well suited to researchers.

This tutorial will provide a practical introduction to using Pixi to easily create scientific and AI/ML environments that benefit from hardware acceleration, across multiple machines and platforms. The focus will be on applications using the PyTorch and JAX Python machine learning libraries with CUDA enabled, as well as deploying these environments to production settings in Linux container images.
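As a small illustration of what "benefit from hardware acceleration" means in practice, the following sketch reports whether PyTorch and JAX are importable in the active environment and whether each can see an accelerator. It is not taken from the tutorial materials: the function name `accelerator_report` and the shape of the returned dictionary are assumptions made for this example, and it deliberately tolerates either library being absent.

```python
import importlib.util


def accelerator_report():
    """Describe which ML libraries are importable and what hardware they see.

    Hypothetical helper for illustration only; not part of Pixi, PyTorch, or JAX.
    """
    report = {}

    # PyTorch: torch.cuda.is_available() is True only when the installed build
    # has CUDA support and a usable GPU and driver are present.
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["torch"] = {
            "version": torch.__version__,
            "cuda_available": torch.cuda.is_available(),
        }
    else:
        report["torch"] = None  # library not installed in this environment

    # JAX: default_backend() names the platform JAX selected (for example
    # "cpu" or "gpu"), and devices() lists the devices on that backend.
    if importlib.util.find_spec("jax") is not None:
        import jax

        report["jax"] = {
            "version": jax.__version__,
            "backend": jax.default_backend(),
            "devices": [str(d) for d in jax.devices()],
        }
    else:
        report["jax"] = None  # library not installed in this environment

    return report


if __name__ == "__main__":
    for name, info in accelerator_report().items():
        print(name, "->", info if info is not None else "not installed")
```

Saved under a hypothetical name such as `check_accel.py`, the script could be run inside a Pixi workspace with `pixi run python check_accel.py`, on each machine where the environment is installed, to confirm that the CUDA-enabled builds were actually resolved.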

Keywords: reproducible, machine learning, pixi, python, scipy

Taught at SciPy 2025 as a tutorial on Monday, July 7th, 2025

SciPy Logistical Information

Rough Outline

0:00 – 0:20 (20 min):

0:20 – 0:40 (20 min):

0:40 – 1:00 (20 min):

1:00 – 1:15 (15 min):

1:15 – 1:30 (15 min):

1:30 – 1:50 (20 min):

1:50 – 2:20 (30 min):

2:20 – 2:35 (15 min):

2:35 – 2:55 (20 min):

2:55 – 3:15 (20 min):

3:15 – 3:25 (10 min):

3:25 – 3:45 (20 min):

3:45 – 4:00 (15 min):

Acknowledgments

This tutorial was supported by the US Research Software Sustainability Institute (URSSI) via grant G-2022-19347 from the Sloan Foundation, Prefix.dev, and NVIDIA.