Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm
Updated Jun 5, 2024 - C++
Floating-point matrix multiplication implementation (arbitrary precision)
ForMatmul - A Fortran library that overloads the matmul function to enable efficient matrix multiplication with/without coarray.
In this project, the instructions executed by a C program are counted with Pin and C++.
OpenMP Matrix Multiplication Offloading Playground
Raspberry Pi Pico (RP2040) and Adafruit Metro M7 (NXP IMXRT10XX) benchmark
📰 This repository contains time measurements of various algorithms on the CPU and GPU using PyCUDA: matrix multiplication, Pi computation, and bilateral filtering.
Benchmarking of matrix-matrix multiplication implementations