Hall of Fame

Winter 25/26

Raspberry Pi 5 (M=8, N=4, K=4)

This section lists the top-performing implementations of the single-precision GEMM kernel \(C\mathrel{+}=AB^T\) with M=8, N=4, K=4. The matrices are \(A \in \mathbb{R}^{8 \times 4}\), \(B \in \mathbb{R}^{4 \times 4}\), and \(C \in \mathbb{R}^{8 \times 4}\). Matrix \(A\) is stored in column-major format, \(B\) in row-major format, and \(C\) in column-major format. The kernels use ASIMD (Neon) vector instructions.
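To make the storage layout concrete, here is a minimal scalar reference sketch in C. It is not any team's ASIMD kernel. The function name `gemm_ref` and the tightly packed leading dimensions are assumptions for illustration. The sketch takes the stated dimensions at face value (A is M×K, B is K×N) and reads the transpose in the formula as a statement about B's row-major storage, which is also an assumption.

```c
#include <stddef.h>

/* Scalar reference for the update C(i,j) += sum_p A(i,p) * B(p,j).
 * Hypothetical helper, assuming tightly packed operands:
 *   A is M x K, column-major: A(i,p) = a[i + p*m]
 *   B is K x N, row-major:    B(p,j) = b[p*n + j]
 *   C is M x N, column-major: C(i,j) = c[i + j*m]
 */
void gemm_ref(size_t m, size_t n, size_t k,
              const float *a, const float *b, float *c)
{
    for (size_t j = 0; j < n; ++j) {
        for (size_t p = 0; p < k; ++p) {
            /* One element of B scales a whole column of A. */
            float b_pj = b[p * n + j];
            for (size_t i = 0; i < m; ++i) {
                /* Innermost loop walks contiguous memory in A and C. */
                c[i + j * m] += a[i + p * m] * b_pj;
            }
        }
    }
}
```

With this layout the innermost loop reads a column of A and updates a column of C at unit stride, while each element of B is used as a scalar multiplier. A reference like this is mainly useful for checking the output of an optimized kernel.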

Table 1: Sustained FP32 GFLOPS on a Raspberry Pi 5 core (Cortex-A76).

Team                        GFLOPS
Steife Hörnchen             25.4
Pirates of the Assemblian   25.3
The FLOPpers                24.0

Raspberry Pi 5 (M=8, N=4, K=64)

This section lists the top-performing implementations of the single-precision GEMM kernel \(C\mathrel{+}=AB^T\) with M=8, N=4, K=64. The matrices are \(A \in \mathbb{R}^{8 \times 64}\), \(B \in \mathbb{R}^{64 \times 4}\), and \(C \in \mathbb{R}^{8 \times 4}\). Matrix \(A\) is stored in column-major format, \(B\) in row-major format, and \(C\) in column-major format. The kernels use ASIMD (Neon) vector instructions.

Table 2: Sustained FP32 GFLOPS on a Raspberry Pi 5 core (Cortex-A76).

Team              GFLOPS
Steife Hörnchen   36.2
SIMD-Symphonie    32.9
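The GFLOPS figures in the tables relate to kernel timings through the usual GEMM operation count of 2·M·N·K floating-point operations per call (one multiply and one add per inner-product term). The sketch below shows that conversion. The helper names `gemm_flops` and `gemm_gflops` are hypothetical, and the source does not describe the actual benchmarking harness, so treat this as an illustration of the arithmetic only.

```c
#include <stddef.h>

/* Floating-point operations in one C += A*B update:
 * one multiply and one add for every (m, n, k) triple. */
double gemm_flops(size_t m, size_t n, size_t k)
{
    return 2.0 * (double)m * (double)n * (double)k;
}

/* Sustained GFLOPS, given how many times the kernel was called
 * and the elapsed wall-clock time in seconds (both measured by
 * the caller; how they are obtained is outside this sketch). */
double gemm_gflops(size_t m, size_t n, size_t k,
                   double calls, double seconds)
{
    return gemm_flops(m, n, k) * calls / seconds * 1.0e-9;
}
```

Under this count, one call performs 256 operations for M=8, N=4, K=4 and 4096 operations for M=8, N=4, K=64. A rate of 25.4 GFLOPS therefore corresponds to roughly 10 ns per 8×4×4 kernel call.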