# Tensor Computations (M.Sc.)

The seminar will take place weekly in a 1.5-hour time slot. All of our meetings will be face-to-face.

The general format of the seminar is similar to that of a reading group. This means that all participants read the respective paper before attending each session. A single person, either a student or a member of the teaching staff, becomes the expert on the topic. This person presents the topic in 30 minutes and leads the discussion afterwards.

## Student Papers

Each participant is required to write a scientific paper on their selected seminar topic. The paper has to be submitted via email four weeks after the respective topic was discussed in the seminar. Use the ACM proceedings template with the sigconf option for your paper. The length of the paper should be between 7 and 10 pages (excl. references). You may write your paper in either English or German.
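As a starting point, the template setup mentioned above can be sketched as a minimal LaTeX skeleton. It assumes the `acmart` document class, which provides the ACM proceedings template and the `sigconf` option. The title, author details, email address, and the bibliography file name `references` are placeholders, not requirements of the seminar.

```latex
% Minimal skeleton, assuming the acmart class is installed
% (it ships with current TeX distributions).
\documentclass[sigconf]{acmart}

\begin{document}

% Placeholder metadata: replace with your own topic and details.
\title{Your Seminar Topic}
\author{Your Name}
\affiliation{%
  \institution{Your University}
  \country{Germany}}
\email{you@example.org}

% In acmart, the abstract has to appear before \maketitle.
\begin{abstract}
A short summary of the paper.
\end{abstract}

\maketitle

\section{Introduction}
Your text goes here.

% Assumes a BibTeX file named references.bib next to this file.
\bibliographystyle{ACM-Reference-Format}
\bibliography{references}

\end{document}
```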

## Supervision

Preparing presentations and composing scientific papers can be challenging. You may ask for advice at any point in time! Start early and keep in touch with your advisor!

Two meetings with your advisor are mandatory:

- The first meeting should be at least one week before your presentation.
- The second meeting should be at least one week before your paper submission deadline.

## Schedule

| Date | What? |
|---|---|
| 10/16 | Kickoff |
| 10/30 | Anatomy of High-Performance Matrix Multiplication |
| 11/06 | BLIS: A Framework for Rapidly Instantiating BLAS Functionality |
| 11/13 | A Design of a High-Performance GEMM-like Tensor-Tensor Multiplication |
| 11/20 | High-Performance Tensor Contraction without Transposition |
| 12/11 | The Tensor Algebra Compiler |
| 12/18 | A Configurable Cloud-Scale DNN Processor for Real-Time AI |
| 02/05 | Ansor: generating high-performance tensor programs for deep learning |

## Topics

Select any of the following papers as your seminar topic. You may also suggest your own paper. Topics will be given out on a first-come, first-served basis.

- Anatomy of High-Performance Matrix Multiplication (preprint)
- BLIS: A Framework for Rapidly Instantiating BLAS Functionality (paper)
- High-Performance Tensor Contraction without Transposition (paper)
- LazyTensor: combining eager execution with domain-specific compilers (preprint)
- A Design of a High-Performance GEMM-like Tensor-Tensor Multiplication (preprint, paper)
- SpDISTAL: Compiling Distributed Sparse Tensor Computations (paper)
- The Tensor Algebra Compiler (paper)
- HAOTuner: A Hardware Adaptive Operator Auto-Tuner for Dynamic Shape Tensor Compilers (paper)
- GraphTensor: Comprehensive GNN-Acceleration Framework for Efficient Parallel Processing of Massive Datasets (paper)
- An Efficient 2D Method for Training Super-Large Deep Learning Models (paper)
- Dynamic Tensor Linearization and Time Slicing for Efficient Factorization of Infinite Data Streams (paper)
- Optimizing High-Performance Linpack for Exascale Accelerated Architectures (preprint)
- HighLight: Efficient and Flexible DNN Acceleration with Hierarchical Structured Sparsity (preprint)
- A Configurable Cloud-Scale DNN Processor for Real-Time AI (paper)
- Ansor: generating high-performance tensor programs for deep learning (paper)