Assignments from the course Fundamentals of Parallelism on Intel Architecture. Wrote parallel and distributed C++ code with OpenMP and MPI, optimized for Intel high-performance processors (the Xeon and Xeon Phi families) to exploit their on-chip parallelism (up to 72 cores).
- Wrote vectorized code for a Monte Carlo diffusion simulation
- Multithreaded the code for filtering
- Optimized the fast Fourier transform (FFT) code to use specialized memory (high-bandwidth memory and cache)
- Distributed the physics simulation of a vibrating string using MPI. Ran and tested the code on a remote cluster, with access provided by Coursera.
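
As an illustration of the vectorization and multithreading items above, here is a minimal sketch of a 1D Monte Carlo random walk annotated with OpenMP directives. It is not the course's assignment code: the function names `lcg_next` and `count_escaped`, the seeding scheme, and the simple linear congruential generator are all assumptions chosen to keep the example short and reproducible, and the generator is not meant to be statistically rigorous.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: one step of a 32-bit linear congruential generator.
// Illustrative only; a real simulation would use a vetted parallel RNG.
static inline std::uint32_t lcg_next(std::uint32_t s) {
    return s * 1664525u + 1013904223u;
}

// Hypothetical example: each particle performs a 1D random walk with steps
// drawn from [-1, 1). Returns how many particles end farther than
// `threshold` from the origin.
long count_escaped(int n_particles, int n_steps, float threshold) {
    assert(n_particles >= 0 && n_steps >= 0);

    std::vector<float> pos(n_particles, 0.0f);
    std::vector<std::uint32_t> state(n_particles);
    for (int i = 0; i < n_particles; ++i) {
        // Arbitrary per-particle seed so results do not depend on thread count.
        state[i] = 12345u + 7919u * static_cast<std::uint32_t>(i);
    }

    for (int s = 0; s < n_steps; ++s) {
        // Iterations over particles are independent, so the loop can be
        // split across threads and vectorized within each thread.
        #pragma omp parallel for simd
        for (int i = 0; i < n_particles; ++i) {
            state[i] = lcg_next(state[i]);
            // Map the top 24 bits to a float in [-1, 1).
            float u = (state[i] >> 8) * (1.0f / 8388608.0f) - 1.0f;
            pos[i] += u;
        }
    }

    long escaped = 0;
    // Each thread keeps a private counter; OpenMP sums them at the end.
    #pragma omp parallel for reduction(+:escaped)
    for (int i = 0; i < n_particles; ++i) {
        if (pos[i] > threshold || pos[i] < -threshold) {
            ++escaped;
        }
    }
    return escaped;
}
```

The directives take effect only when OpenMP is enabled at compile time (for example `-fopenmp` with GCC or Clang, `-qopenmp` with the Intel compiler); without it the pragmas are ignored and the code runs serially with the same result. Placing the particle loop innermost keeps memory accesses contiguous, which is what lets the compiler vectorize it.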