Reproducible Machine Learning Workflows for Scientists with Pixi

Authors
Affiliations
University of Wisconsin-Madison
prefix.dev GmbH
NVIDIA

Abstract

Scientific researchers need reproducible software environments for complex applications that can run across heterogeneous computing platforms. Modern open-source tools like Pixi provide automatic reproducibility for all dependencies while offering a high-level interface well suited to researchers.

This tutorial will provide a practical introduction to using Pixi to easily create scientific and AI/ML environments that benefit from hardware acceleration, across multiple machines and platforms. The focus will be on applications using the PyTorch and JAX Python machine learning libraries with CUDA enabled, as well as deploying these environments to production settings in Linux container images.

Keywords: reproducible, machine learning, pixi, python, scipy

Taught at SciPy 2025 as a tutorial on Monday, July 7th, 2025.

DOI

Tutorial recording on YouTube

SciPy Logistical Information

Rough Outline

00:00 – 00:05 (5 min):

00:05 – 00:15 (10 min):

00:15 – 00:30 (15 min):

00:30 – 01:00 (30 min):

01:00 – 01:40 (40 min):

01:40 – 01:55 (15 min):

01:55 – 02:35 (40 min):

02:35 – 02:45 (10 min):

02:45 – 03:10 (25 min):

03:10 – 03:30 (20 min):

03:30 – 04:00 (30 min):

This tutorial was supported by the US Research Software Sustainability Institute (URSSI) via grant G-2022-19347 from the Sloan Foundation, prefix.dev GmbH, NVIDIA, and the University of Wisconsin–Madison Data Science Institute.

References
  1. Matthew Feickert, Ruben Arts, & John Kirkham. (2025). matthewfeickert-talks/reproducible-ml-for-scientists-with-pixi-scipy-2025: SciPy 2025 Tutorial. Zenodo. doi:10.5281/zenodo.16320203