.. _ch:reading_group:

Reading Group
=============

The lab organizes an informal reading group where we study research contributions of interest to us.
Anyone interested is welcome to join.
Simply `get in touch <https://matrix.to/#/#scalable:uni-jena.de>`__ for details.

2024 Schedule
-------------

+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Date  | Topic                                                                                                                                                            |
+=======+==================================================================================================================================================================+
| 04/25 | Memory-Efficient Fine-Tuning of Compressed                                                                                                                       |
|       |                                                                                                                                                                  |
|       | Large Language Models via sub-4-bit Integer Quantization                                                                                                         |
|       | (`paper <https://proceedings.neurips.cc/paper_files/paper/2023/hash/7183f4fc87598f6c6e947b96714acbd6-Abstract-Conference.html>`__)                               |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 04/18 | Spectre Attacks: Exploiting Speculative Execution                                                                                                                |
|       |                                                                                                                                                                  |
|       | (`paper <https://ieeexplore.ieee.org/document/8835233>`__)                                                                                                       |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 04/11 | The Deep Learning Compiler:                                                                                                                                      |
|       |                                                                                                                                                                  |
|       | A Comprehensive Survey (`paper <https://ieeexplore.ieee.org/document/9222299>`__)                                                                                |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 04/04 | Large Language Models for Compiler Optimization                                                                                                                  |
|       | (`preprint <https://arxiv.org/pdf/2309.07062.pdf>`__)                                                                                                            |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 03/27 | FlashAttention: Fast and Memory-Efficient                                                                                                                        |
|       |                                                                                                                                                                  |
|       | Exact Attention with IO-Awareness                                                                                                                                |
|       | (`paper <https://proceedings.neurips.cc/paper_files/paper/2022/file/67d57c32e20fd0a7a302cb81d36e40d5-Supplemental-Conference.pdf>`__)                            |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 03/20 | Peer Review Session                                                                                                                                              |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 03/14 | TensorIR: An Abstraction for Automatic                                                                                                                           |
|       |                                                                                                                                                                  |
|       | Tensorized Program Optimization (`paper <https://dl.acm.org/doi/abs/10.1145/3575693.3576933>`__)                                                                 |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 03/06 | FP8 Quantization: The Power of the Exponent                                                                                                                      |
|       | (`paper <https://proceedings.neurips.cc/paper_files/paper/2022/file/5e07476b6bd2497e1fbd11b8f0b2de3c-Paper-Conference.pdf>`__)                                   |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 02/28 | Novel adaptive quantization methodology                                                                                                                          |
|       |                                                                                                                                                                  |
|       | for 8-bit floating-point DNN training (`paper <https://link.springer.com/article/10.1007/s10617-024-09282-2>`__)                                                 |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 02/21 | A Tensor Compiler for Unified Machine Learning                                                                                                                   |
|       |                                                                                                                                                                  |
|       | Prediction Serving (`paper <https://www.usenix.org/conference/osdi20/presentation/nakandala>`__)                                                                 |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 02/14 | LoopTune: Optimizing Tensor Computations                                                                                                                         |
|       |                                                                                                                                                                  |
|       | with Reinforcement Learning (`preprint <https://arxiv.org/abs/2309.01825>`__)                                                                                    |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 02/07 | A massively parallel tensor contraction framework                                                                                                                |
|       |                                                                                                                                                                  |
|       | for coupled-cluster computations (`paper <https://doi.org/10.1016/j.jpdc.2014.06.002>`__)                                                                        |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 01/31 | LoopStack: a Lightweight Tensor Algebra Compiler Stack (`preprint <https://arxiv.org/abs/2205.00618>`__)                                                         |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 01/24 | Towards an efficient use of the BLAS library                                                                                                                     |
|       |                                                                                                                                                                  |
|       | for multilinear tensor contractions (`paper <https://www.sciencedirect.com/science/article/pii/S0096300314002902>`__)                                            |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 01/17 | Chapter 5.7: Efficient Processing of Deep Neural Networks (`book <https://link.springer.com/book/10.1007/978-3-031-01766-7>`__)                                  |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+

2023 Schedule
-------------

+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Date  | Topic                                                                                                                                                            |
+=======+==================================================================================================================================================================+
| 12/19 | oneDNN Graph Compiler: A Hybrid Approach for High-Performance                                                                                                    |
|       |                                                                                                                                                                  |
|       | Deep Learning Compilation (`preprint <https://arxiv.org/ftp/arxiv/papers/2301/2301.01333.pdf>`__)                                                                |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 12/12 | Chapter 5.1 - 5.6: Efficient Processing of Deep Neural Networks (`book <https://link.springer.com/book/10.1007/978-3-031-01766-7>`__)                            |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 12/05 | RISC-V Composable Extensions for MX Microscaling Data Formats                                                                                                    |
|       |                                                                                                                                                                  |
|       | for AI Tensors: Part One: Introduction to MX Data                                                                                                                |
|       | (`blog post <https://fpga.org/2023/11/27/risc-v-composable-extensions-for-microscaling-data-formats-for-ai-tensors/>`__)                                         |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 11/28 | Chapter 4: Efficient Processing of Deep Neural Networks (`book <https://link.springer.com/book/10.1007/978-3-031-01766-7>`__)                                    |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 11/22 | Chapter 3: Efficient Processing of Deep Neural Networks (`book <https://link.springer.com/book/10.1007/978-3-031-01766-7>`__)                                    |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 11/14 | HighLight: Efficient and Flexible DNN Acceleration                                                                                                               |
|       |                                                                                                                                                                  |
|       | with Hierarchical Structured Sparsity (`preprint <https://arxiv.org/abs/2305.12718>`__)                                                                          |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 11/07 | Higher-dimensional processing using a photonic tensor core                                                                                                       |
|       |                                                                                                                                                                  |
|       | with continuous-time data (`paper <https://www.nature.com/articles/s41566-023-01313-x>`__)                                                                       |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 11/01 | Toward Matrix Multiplication for Deep Learning Inference                                                                                                         |
|       |                                                                                                                                                                  |
|       | on the Xilinx Versal (`paper <https://ieeexplore.ieee.org/document/10136983>`__)                                                                                 |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 10/24 | Optimizing Direct Convolutions on ARM Multi-Cores (`preprint <https://eprints.whiterose.ac.uk/202768/1/sc23-2.pdf>`__)                                           |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 08/28 | Hot Chips 2023 watch party (`program <https://hotchips.org/advance-program/#conference-day-1-monday-august-28-2023>`__)                                          |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 07/12 | DGEMM on Integer Matrix Multiplication Unit (`preprint <https://arxiv.org/pdf/2306.11975.pdf>`__)                                                                |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 07/05 | A Design of a High-Performance GEMM-like                                                                                                                         |
|       |                                                                                                                                                                  |
|       | Tensor-Tensor Multiplication (`preprint <https://arxiv.org/pdf/1607.00145.pdf>`__, `paper <https://dl.acm.org/doi/10.1145/3157733>`__)                           |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 06/28 | High-Performance Tensor Contraction without Transposition (`paper <https://dl.acm.org/doi/10.1137/16M108968X>`__)                                                |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 06/21 | Can Computers Learn Common Sense? (`article <https://www.newyorker.com/tech/annals-of-technology/can-computers-learn-common-sense>`__)                           |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 06/14 | Dynamo: Amazon's Highly Available Key-value Store (`paper <https://dl.acm.org/doi/pdf/10.1145/1323293.1294281>`__)                                               |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 06/07 | A White Paper on Neural Network Quantization (`white paper <https://arxiv.org/abs/2106.08295>`__)                                                                |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 05/31 | LazyTensor: combining eager execution with                                                                                                                       |
|       |                                                                                                                                                                  |
|       | domain-specific compilers (`preprint <https://arxiv.org/abs/2102.13267>`__)                                                                                      |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 05/24 | Neural Galerkin Scheme with Active Learning for High-Dimensional                                                                                                 |
|       |                                                                                                                                                                  |
|       | Evolution Equations (`preprint <https://arxiv.org/pdf/2203.01360.pdf>`__)                                                                                        |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 05/17 | Architecture and Performance of Devito, a System for Automated                                                                                                   |
|       |                                                                                                                                                                  |
|       | Stencil Computation (`paper <https://dl.acm.org/doi/abs/10.1145/3374916>`__)                                                                                     |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 05/10 | Efficient Design Space Exploration for Sparse Mixed Precision Neural                                                                                             |
|       |                                                                                                                                                                  |
|       | Architectures (`paper <https://dl.acm.org/doi/10.1145/3502181.3531463>`__)                                                                                       |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 05/03 | BLIS: A Framework for Rapidly Instantiating BLAS Functionality (`paper <https://dl.acm.org/doi/pdf/10.1145/2764454>`__)                                          |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 04/26 | Anatomy of High-Performance Matrix Multiplication (`preprint <https://www.cs.utexas.edu/~flame/pubs/GotoTOMS_revision.pdf>`__)                                   |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 04/19 | Harnessing Deep Learning and HPC Kernels via High-Level Loop and                                                                                                 |
|       |                                                                                                                                                                  |
|       | Tensor Abstractions on CPU Architectures (`preprint <https://arxiv.org/abs/2304.12576>`__)                                                                       |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 04/12 | Tensor Contractions Tutorial (`tutorial <https://www.tensors.net/tutorial-1>`__)                                                                                 |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 03/20 | Speculative Vectorisation with Selective Replay (`paper <https://ieeexplore.ieee.org/document/9499938>`__)                                                       |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 03/14 | An Attack on The Speculative Vectorization: Leakage from Higher                                                                                                  |
|       |                                                                                                                                                                  |
|       | Dimensional Speculation (`preprint <https://arxiv.org/abs/2302.01131>`__)                                                                                        |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 03/07 | DLA: Compiler and FPGA Overlay for Neural Network Inference                                                                                                      |
|       |                                                                                                                                                                  |
|       | Acceleration (`paper <https://www.computer.org/csdl/proceedings-article/fpl/2018/851700a411/17D45WgziSO>`__)                                                     |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 02/24 | MLPerf Mobile Inference Benchmark (`preprint <https://arxiv.org/abs/2012.02328>`__)                                                                              |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 02/03 | Massively parallel universal linear transformations using a                                                                                                      |
|       |                                                                                                                                                                  |
|       | wavelength-multiplexed diffractive optical network (`preprint <https://arxiv.org/abs/2208.10362>`__)                                                             |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 01/27 | Efficient Quantized Sparse Matrix Operations on Tensor Cores (`paper <https://dl.acm.org/doi/abs/10.5555/3571885.3571934>`__)                                    |
+-------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+