Tensor Compilers (M.Sc.)

The seminar meets weekly for 90 minutes and follows a reading-group format: all participants read the paper before each session. One person, either a student or a member of the teaching staff, becomes the expert on the topic, presents it for 30 minutes, and then leads the discussion.

The first five sessions after the kickoff form an introductory block. In these sessions we read important papers on fast matrix-matrix multiplication and fast tensor contractions, as well as a survey of DNN compilers. These papers introduce the basic concepts and terminology needed for the rest of the seminar. We then discuss recent research on tensor compilers.

Student Papers

Each student is required to write a scientific paper on their chosen seminar topic. The paper must be submitted by email four weeks after the topic is presented in the seminar. Use the ACM proceedings template with the sigconf option for your paper. The paper should be between 7 and 10 pages long (excluding references). You may write your paper in English or German.

Supervision

Preparing presentations and writing scientific papers can be challenging. You can always ask for advice! Start early and stay in touch with your advisor!

Two meetings with your advisor are required:

The first meeting should be at least one week before your presentation.
The second meeting should be at least one week before your paper is due.

Generative AI

The use of generative AI in any capacity is strictly prohibited. Write the paper yourself!

Schedule

Date	What?
10/14	Kickoff
10/21	Anatomy of High-Performance Matrix Multiplication
10/28	BLIS: A Framework for Rapidly Instantiating BLAS Functionality
11/04	A Design of a High-Performance GEMM-like Tensor-Tensor Multiplication
11/11	High-Performance Tensor Contraction without Transposition
11/25	The Deep Learning Compiler: A Comprehensive Survey
12/02	The Tensor Algebra Compiler
01/06	Optimal Kernel Orchestration for Tensor Programs with Korch
01/13	A Match Made in Silicon: The Co-Evolution of Systems and AI
01/20	A Tensor Compiler for Unified Machine Learning Prediction Serving
01/27	Immense-Scale Machine Learning
02/03	Herding Llamas

Topics

Anatomy of High-Performance Matrix Multiplication (preprint)
BLIS: A Framework for Rapidly Instantiating BLAS Functionality (paper)
A Design of a High-Performance GEMM-like Tensor-Tensor Multiplication (preprint, paper)
High-Performance Tensor Contraction without Transposition (paper)
The Deep Learning Compiler: A Comprehensive Survey (paper)
A Match Made in Silicon: The Co-Evolution of Systems and AI (keynote)
Immense-Scale Machine Learning: The Big, the Small, and the Not Right at All (recording)
Herding Llamas: A Sneak Peek Into Meta’s Infrastructure for Generative AI (recording)

Choose one of the following papers as your seminar topic. You may also propose your own topic. Topics will be assigned on a first-come, first-served basis.

Ansor: Generating High-Performance Tensor Programs for Deep Learning (paper)
A Code Generator for High-Performance Tensor Contractions on GPUs (paper)
A Tensor Compiler for Unified Machine Learning Prediction Serving (paper)
DLA: Compiler and FPGA Overlay for Neural Network Inference Acceleration (paper)
oneDNN Graph Compiler: A Hybrid Approach for High-Performance Deep Learning Compilation (preprint)
Optimal Kernel Orchestration for Tensor Programs with Korch (paper)
Optimizing Deep Learning Inference via Global Analysis and Tensor Expressions (paper)
The Tensor Algebra Compiler (paper)
Triton: An Intermediate Language and Compiler for Tiled Neural Network Computations (preprint)
TensorIR: An Abstraction for Automatic Tensorized Program Optimization (paper)