Winter is coming to an end, and our lab has already started preparing for the summer semester. This year we will teach two classes: High Performance Computing (HPC) and Efficient Machine Learning (EML). HPC will be offered for the third time and EML for the second. We are eager to keep both classes up to date by integrating the latest research and technology.

The HPC class will cover in detail the Neoverse V1 microarchitecture, which is used in the Graviton3 server processors. Throughout the summer semester, we will place special emphasis on the newly introduced SVE Bfloat16 vector instructions and their use in fast matrix-matrix multiplication kernels.
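
As a rough illustration of the number format behind these instructions (not of the SVE instructions themselves), the following minimal Python sketch converts FP32 values to Bfloat16 by keeping the upper 16 bits with round-to-nearest-even. The helper names `fp32_to_bf16_bits` and `bf16_bits_to_fp32` are hypothetical, and NaN handling is deliberately left out.

```python
import struct

def fp32_to_bf16_bits(x: float) -> int:
    """Return the 16-bit Bfloat16 pattern of x (hypothetical helper, no NaN handling)."""
    # Reinterpret the value as the 32-bit pattern of an IEEE-754 single.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Rounding bias: 0x7FFF plus the lowest kept bit yields round-to-nearest-even.
    bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + bias) >> 16) & 0xFFFF

def bf16_bits_to_fp32(b: int) -> float:
    """Expand a 16-bit Bfloat16 pattern back to a float by zero-filling the low bits."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]

# Bfloat16 keeps the 8 exponent bits of FP32 but only 7 explicit mantissa bits.
rounded = bf16_bits_to_fp32(fp32_to_bf16_bits(3.14159))
```

Here `rounded` evaluates to 3.140625, which shows how much precision is traded for the smaller storage format. Matrix kernels that take Bfloat16 inputs commonly accumulate in FP32 to limit the impact of this rounding.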

EML will also receive major updates. First, we will integrate some of the latest features of PyTorch 2. In particular, the new software ecosystem behind torch.compile is a key development for efficient machine learning. Second, we will extend the class’s scope to inference on mobile devices. Obtaining high performance on mobile devices typically requires quantization, a technique that represents weights and activations with low-precision data types such as 8-bit integers. Currently, we plan to target the Snapdragon 8 Gen 2 system on chip, which is used in the latest flagship smartphones. Students will have access to Qualcomm Innovators Development Kits sponsored by Qualcomm.
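
To make the idea of quantization concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is only one of several possible schemes and not necessarily the one used in the class or by the Snapdragon toolchain. The function names `quantize_int8` and `dequantize` and the sample weights are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to int8 codes with one shared scale."""
    max_abs = max(abs(v) for v in values)
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero input.
    scale = max_abs / 127.0 if max_abs > 0.0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float values from the int8 codes."""
    return [c * scale for c in codes]

weights = [0.5, -1.27, 0.031, 1.0]  # illustrative values only
codes, scale = quantize_int8(weights)
approx = dequantize(codes, scale)
```

Each reconstructed value in `approx` differs from the original by at most half a quantization step (`scale / 2`). That error is the price for storing and computing with 8-bit integers instead of 32-bit floats.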