.. _ch:reading_group:

Reading Group
=============

The lab organizes an informal reading group where we study research contributions of interest to us.
You may join the reading group if you are interested!
Simply `get in touch `__ for details!

2024 Schedule
-------------

+-------+--------------------------------------------------------------------------------+
| Date  | Topic                                                                          |
+=======+================================================================================+
| 04/25 | Memory-Efficient Fine-Tuning of Compressed                                     |
|       |                                                                                |
|       | Large Language Models via sub-4-bit Integer Quantization                       |
|       | (`paper `__)                                                                   |
+-------+--------------------------------------------------------------------------------+
| 04/18 | Spectre Attacks: Exploiting Speculative Execution                              |
|       |                                                                                |
|       | (`paper `__)                                                                   |
+-------+--------------------------------------------------------------------------------+
| 04/11 | The Deep Learning Compiler:                                                    |
|       |                                                                                |
|       | A Comprehensive Survey (`paper `__)                                            |
+-------+--------------------------------------------------------------------------------+
| 04/04 | Large Language Models for Compiler Optimization                                |
|       | (`preprint `__)                                                                |
+-------+--------------------------------------------------------------------------------+
| 03/27 | FlashAttention: Fast and Memory-Efficient                                      |
|       |                                                                                |
|       | Exact Attention with IO-Awareness                                              |
|       | (`paper `__)                                                                   |
+-------+--------------------------------------------------------------------------------+
| 03/20 | Peer Review Session                                                            |
+-------+--------------------------------------------------------------------------------+
| 03/14 | TensorIR: An Abstraction for Automatic                                         |
|       |                                                                                |
|       | Tensorized Program Optimization (`paper `__)                                   |
+-------+--------------------------------------------------------------------------------+
| 03/06 | FP8 Quantization: The Power of the Exponent                                    |
|       | (`paper `__)                                                                   |
+-------+--------------------------------------------------------------------------------+
| 02/28 | Novel adaptive quantization methodology                                        |
|       |                                                                                |
|       | for 8-bit floating-point DNN training (`paper `__)                             |
+-------+--------------------------------------------------------------------------------+
| 02/21 | A Tensor Compiler for Unified Machine Learning                                 |
|       |                                                                                |
|       | Prediction Serving (`paper `__)                                                |
+-------+--------------------------------------------------------------------------------+
| 02/14 | LoopTune: Optimizing Tensor Computations                                       |
|       |                                                                                |
|       | with Reinforcement Learning (`preprint `__)                                    |
+-------+--------------------------------------------------------------------------------+
| 02/07 | A massively parallel tensor contraction framework                              |
|       |                                                                                |
|       | for coupled-cluster computations (`paper `__)                                  |
+-------+--------------------------------------------------------------------------------+
| 01/31 | LoopStack: a Lightweight Tensor Algebra Compiler Stack (`preprint `__)         |
+-------+--------------------------------------------------------------------------------+
| 01/24 | Towards an efficient use of the BLAS library                                   |
|       |                                                                                |
|       | for multilinear tensor contractions (`paper `__)                               |
+-------+--------------------------------------------------------------------------------+
| 01/17 | Chapter 5.7: Efficient Processing of Deep Neural Networks (`book `__)          |
+-------+--------------------------------------------------------------------------------+

2023 Schedule
-------------

+-------+--------------------------------------------------------------------------------+
| Date  | Topic                                                                          |
+=======+================================================================================+
| 12/19 | oneDNN Graph Compiler: A Hybrid Approach for High-Performance                  |
|       |                                                                                |
|       | Deep Learning Compilation (`preprint `__)                                      |
+-------+--------------------------------------------------------------------------------+
| 12/12 | Chapter 5.1 - 5.6: Efficient Processing of Deep Neural Networks (`book `__)    |
+-------+--------------------------------------------------------------------------------+
| 12/05 | RISC-V Composable Extensions for MX Microscaling Data Formats                  |
|       |                                                                                |
|       | for AI Tensors: Part One: Introduction to MX Data                              |
|       | (`blog post `__)                                                               |
+-------+--------------------------------------------------------------------------------+
| 11/28 | Chapter 4: Efficient Processing of Deep Neural Networks (`book `__)            |
+-------+--------------------------------------------------------------------------------+
| 11/22 | Chapter 3: Efficient Processing of Deep Neural Networks (`book `__)            |
+-------+--------------------------------------------------------------------------------+
| 11/14 | HighLight: Efficient and Flexible DNN Acceleration                             |
|       |                                                                                |
|       | with Hierarchical Structured Sparsity (`preprint `__)                          |
+-------+--------------------------------------------------------------------------------+
| 11/07 | Higher-dimensional processing using a photonic tensor core                     |
|       |                                                                                |
|       | with continuous-time data (`paper `__)                                         |
+-------+--------------------------------------------------------------------------------+
| 11/01 | Toward Matrix Multiplication for Deep Learning Inference                       |
|       |                                                                                |
|       | on the Xilinx Versal (`paper `__)                                              |
+-------+--------------------------------------------------------------------------------+
| 10/24 | Optimizing Direct Convolutions on ARM Multi-Cores (`preprint `__)              |
+-------+--------------------------------------------------------------------------------+
| 08/28 | Hot Chips 2023 watch party (`program `__)                                      |
+-------+--------------------------------------------------------------------------------+
| 07/12 | DGEMM on Integer Matrix Multiplication Unit (`preprint `__)                    |
+-------+--------------------------------------------------------------------------------+
| 07/05 | A Design of a High-Performance GEMM-like                                       |
|       |                                                                                |
|       | Tensor-Tensor Multiplication (`preprint `__, `paper `__)                       |
+-------+--------------------------------------------------------------------------------+
| 06/28 | High-Performance Tensor Contraction without Transposition (`paper `__)         |
+-------+--------------------------------------------------------------------------------+
| 06/21 | Can Computers Learn Common Sense? (`article `__)                               |
+-------+--------------------------------------------------------------------------------+
| 06/14 | Dynamo: amazon's highly available key-value store (`paper `__)                 |
+-------+--------------------------------------------------------------------------------+
| 06/07 | A White Paper on Neural Network Quantization (`white paper `__)                |
+-------+--------------------------------------------------------------------------------+
| 05/31 | LazyTensor: combining eager execution with                                     |
|       |                                                                                |
|       | domain-specific compilers (`preprint `__)                                      |
+-------+--------------------------------------------------------------------------------+
| 05/24 | Neural Galerkin Scheme with Active Learning for High-Dimensional               |
|       |                                                                                |
|       | Evolution Equations (`preprint `__)                                            |
+-------+--------------------------------------------------------------------------------+
| 05/17 | Architecture and Performance of Devito, a System for Automated                 |
|       |                                                                                |
|       | Stencil Computation (`paper `__)                                               |
+-------+--------------------------------------------------------------------------------+
| 05/10 | Efficient Design Space Exploration for Sparse Mixed Precision Neural           |
|       |                                                                                |
|       | Architectures (`paper `__)                                                     |
+-------+--------------------------------------------------------------------------------+
| 05/03 | BLIS: A Framework for Rapidly Instantiating BLAS Functionality (`paper `__)    |
+-------+--------------------------------------------------------------------------------+
| 04/26 | Anatomy of High-Performance Matrix Multiplication (`preprint `__)              |
+-------+--------------------------------------------------------------------------------+
| 04/19 | Harnessing Deep Learning and HPC Kernels via High-Level Loop and               |
|       |                                                                                |
|       | Tensor Abstractions on CPU Architectures (`preprint `__)                       |
+-------+--------------------------------------------------------------------------------+
| 04/12 | Tensor Contractions Tutorial (`tutorial `__)                                   |
+-------+--------------------------------------------------------------------------------+
| 03/20 | Speculative Vectorisation with Selective Replay (`paper `__)                   |
+-------+--------------------------------------------------------------------------------+
| 03/14 | An Attack on The Speculative Vectorization: Leakage from Higher                |
|       |                                                                                |
|       | Dimensional Speculation (`preprint `__)                                        |
+-------+--------------------------------------------------------------------------------+
| 03/07 | DLA: Compiler and FPGA Overlay for Neural Network Inference                    |
|       |                                                                                |
|       | Acceleration (`paper `__)                                                      |
+-------+--------------------------------------------------------------------------------+
| 02/24 | MLPerf Mobile Inference Benchmark (`preprint `__)                              |
+-------+--------------------------------------------------------------------------------+
| 02/03 | Massively parallel universal linear transformations using a                    |
|       |                                                                                |
|       | wavelength-multiplexed diffractive optical network (`paper `__)                |
+-------+--------------------------------------------------------------------------------+
| 01/27 | Efficient Quantized Sparse Matrix Operations on Tensor Cores (`paper `__)      |
+-------+--------------------------------------------------------------------------------+