Running C+=AB^T benchmark
num_threads: 2
QoS: User Interactive
num_reps: 20000000
M: 32
N: 32
K: 32
Max absolute error: 0
Max relative error: 0
Accelerate Duration: 2.17408 s
Accelerate Performance: 1205.77 GFLOPS
Kernel Duration: 2.07495 s
Kernel Performance: 1263.37 GFLOPS