Updates

2024

The 2024 class will emphasize cutting-edge AI hardware for inference, with the deployment of ML models on recent mobile, desktop, and server systems taking center stage. In particular, we will discuss the SM8550P system on chip (SoC), which is available to students as part of Qualcomm Innovators Development Kits. Depending on availability throughout the semester, we will also cover AMD’s Ryzen AI, NVIDIA’s Grace CPU Superchip and L40S, and Tenstorrent’s Grayskull e75. Additionally, the 2024 class will introduce inference with Meta’s Llama 2 family of large language models as an example application.

2023

The 2023 class will integrate some of the latest features introduced with PyTorch 2. In particular, the new software ecosystem behind torch.compile is a key development for efficient machine learning. Further, the class’s scope will be extended to cover inference on mobile devices. Obtaining high performance on mobile devices typically requires a technique called quantization, which represents weights and activations with low-precision data types such as 8-bit integers instead of floating-point numbers. The targeted system on chip will be the Snapdragon 8 Gen 2, which is used in the latest flagship smartphones. For this, we will use Qualcomm Innovators Development Kits sponsored by Qualcomm.
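
To give a rough idea of what quantization does, the following is a minimal sketch of one common scheme, affine (asymmetric) 8-bit quantization, written in plain Python. The function names quantize_affine and dequantize_affine and the example weights are hypothetical and chosen for illustration only; this is not the toolchain used in the class or on the development kits.

```python
def quantize_affine(values, num_bits=8):
    """Map floats to unsigned integers using a scale and a zero point.

    Illustrative sketch only: real toolchains add calibration,
    per-channel scales, and other refinements.
    """
    qmin, qmax = 0, 2**num_bits - 1
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Fall back to a scale of 1.0 if all values are zero.
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = max(qmin, min(qmax, round(qmin - lo / scale)))
    # Round to the nearest integer step and clamp to the valid range.
    quantized = [
        max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize_affine(quantized, scale, zero_point):
    """Recover approximate floats from the integer representation."""
    return [scale * (q - zero_point) for q in quantized]


# Hypothetical example weights.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
q, scale, zero_point = quantize_affine(weights)
restored = dequantize_affine(q, scale, zero_point)

print("quantized:", q)
print("scale:", scale, "zero point:", zero_point)
print("restored:", restored)
```

The restored values differ from the original weights by a small rounding error on the order of the scale. This loss of precision is the price paid for the smaller memory footprint and the faster integer arithmetic.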