This is a hardware implementation that accelerates matrix and vector operations for AI computation. The project includes two parts:
- GEMM (Verilog): accelerates matrix operations
- VPU (HLS): accelerates vector operations
Initial functionality tests have been completed on FPGA for both blocks. Further validation is in progress.