
General Hardware Acceleration with CUDA

Authors
Affiliations
University of Wisconsin-Madison
prefix.dev GmbH
NVIDIA

So far we’ve focused on machine learning examples, but CUDA hardware-accelerated workflows extend far beyond AI/ML.

CuPy Example

Perhaps one of the best-known CUDA-accelerated array programming libraries is CuPy, which is designed to have APIs that are highly compatible with NumPy and SciPy, so that people can think in the common Scientific Python idioms while still leveraging CUDA.

Constructing the workspace

CuPy is distributed on PyPI and on conda-forge, so we can create a Pixi workspace that supports its CUDA requirements and then add CuPy to it.

The workspace gives us access to CuPy’s hardware acceleration, as shown in this example from the CuPy documentation:

cupy-example.py
import numpy as np
import cupy as cp

# Array APIs are the same though operating on different hardware devices
x_cpu = np.array([1, 2, 3])
x_gpu = cp.array([1, 2, 3])

# Compute norms for both arrays
l2_cpu = np.linalg.norm(x_cpu)
l2_gpu = cp.linalg.norm(x_gpu)

print(f"NumPy array norm {l2_cpu} on device: {x_cpu.device}")
print(f"CuPy array norm {l2_gpu} on device: {x_gpu.device}")
pixi run python cupy-example.py
NumPy array norm 3.7416573867739413 on device: cpu
CuPy array norm 3.7416573867739413 on device: <CUDA Device 0>
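
Because the CuPy and NumPy APIs line up so closely, the same function can often run on either device. The following is a minimal sketch of that idea, not something taken from the CuPy documentation: the `xp` alias and the `normalize` helper are names made up for this illustration, and the fallback to NumPy when no usable CUDA device is found is an assumption added so the script also runs on a CPU-only machine.

```python
import numpy as np

# Pick the array module: CuPy if a CUDA device is usable, else NumPy.
# The fallback is an assumption of this sketch, not part of the tutorial.
try:
    import cupy as cp

    if cp.cuda.runtime.getDeviceCount() < 1:
        raise RuntimeError("no CUDA device found")
    xp = cp
except Exception:
    xp = np


def normalize(x):
    """Scale a vector to unit L2 norm with whichever array module is active."""
    return x / xp.linalg.norm(x)


x = xp.array([1.0, 2.0, 3.0])
unit = normalize(x)

# Copy the result back to host memory when it lives on the GPU
unit_host = cp.asnumpy(unit) if xp is not np else unit

print(f"Array module: {xp.__name__}")
print(f"Unit vector on host: {unit_host}")
```

Only the line that selects `xp` and the final copy back to the host know which device is in use; `normalize` itself is written once in the shared idiom.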

CuDF Example

There are other CUDA-accelerated libraries for scientific Python as well. NVIDIA has created the RAPIDS collection of data science libraries for running end-to-end data science pipelines fully on GPUs with CUDA. One of these libraries is CuDF, a high-level Python library for manipulating DataFrames on the GPU with Pandas-like idioms.

Constructing the workspace

CuDF is not available on conda-forge, but it is available on the Python Package Index (PyPI) as cudf-cu12 and on the rapidsai conda channel on Anaconda.org as cudf. We can install it through either method, but to keep working with conda packages, we’ll create a workspace that installs it from the rapidsai conda channel.

This code snippet, taken from an NVIDIA user guide, shows that CuDF has very similar semantics and API to Pandas:

cudf-example.py
import pandas as pd
import cudf

# 1M Wikipedia pageview counts
data_url = "https://raw.githubusercontent.com/NVIDIA/accelerated-computing-hub/2186298825b85ef38f08e779af7992b8d762289f/gpu-python-tutorial/data/pageviews_small.csv"

# The semantics we know from Pandas
df_cpu = pd.read_csv(data_url, sep=" ")
print(f"Pandas DataFrame:\n {df_cpu.head()}")

# also exist with CuDF
df_gpu = cudf.read_csv(data_url, sep=" ")
print(f"\nCuDF DataFrame:\n {df_gpu.head()}")

# Label columns & drop unused column
df_gpu.columns = ["project", "page", "requests", "x"]
df_gpu = df_gpu.drop("x", axis=1)

# Count number of English pages
print(f"\n# of English:\n {df_gpu[df_gpu.project == 'en'].count()}")
pixi run python cudf-example.py
Pandas DataFrame:
     en.m                   Article_51  1  0
0    ja                       エレファモン  1  0
1   ang                Flocc:Scīrung  1  0
2    en  Panorama_(La_Dispute_album)  1  0
3  fa.m                  جاشوا_جکسون  1  0
4  fa.m                 خانواده_کندی  2  0
[1297577][09:12:17:950636][warning] Auto detection of compression type is supported only for file type buffers. For other buffer types, AUTO compression type assumes uncompressed input.

CuDF DataFrame:
     en.m                   Article_51  1  0
0    ja                       エレファモン  1  0
1   ang                Flocc:Scīrung  1  0
2    en  Panorama_(La_Dispute_album)  1  0
3  fa.m                  جاشوا_جکسون  1  0
4  fa.m                 خانواده_کندی  2  0
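
To make the shared semantics concrete without downloading the pageview file, here is a small sketch that runs the same filter-and-aggregate steps on a handful of rows. The rows are made up for illustration and are not from the real dataset, the `xdf` alias is a name invented here, and the fallback to Pandas when CuDF or a GPU is unavailable is an assumption of the sketch.

```python
import pandas as pd

# Use CuDF when it imports and can allocate on a GPU, else fall back to Pandas.
# The fallback is an assumption of this sketch so it also runs without a GPU.
try:
    import cudf

    cudf.DataFrame({"probe": [0]})  # fails here if no usable GPU
    xdf = cudf
except Exception:
    xdf = pd

# Hypothetical rows in the same shape as the labeled pageview columns
df = xdf.DataFrame(
    {
        "project": ["en", "ja", "en", "fa.m"],
        "page": ["page_a", "page_b", "page_c", "page_d"],
        "requests": [1, 1, 3, 2],
    }
)

# Boolean-mask filtering and aggregation read the same in both libraries
english = df[df.project == "en"]
n_english = len(english)
total_requests = int(english["requests"].sum())

print(f"DataFrame library: {xdf.__name__}")
print(f"English pages: {n_english}, total requests: {total_requests}")
```

When CuDF is active, `df.to_pandas()` copies a DataFrame back to host memory and `cudf.from_pandas(...)` goes the other way, which is useful when only part of a pipeline needs the GPU.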

In the interest of time, we won’t cover CuDF fully today, but there are user guides on how to use CuDF, as seen below.