lkrnmyo / butterfly_acc

This project forked from samsunglabs/butterfly_acc

The codes and artifacts associated with our MICRO'22 paper titled: "Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design"

NPU Transformer

Hardware Architecture/Accelerator for Transformer. The latest code and updates are maintained in this GitHub repo.

1. Structure

.
├── README.md
├── figs                 # All the scripts to generate the figures in the paper
├── hardware             # All code related to the hardware implementation on an FPGA
│   ├── cpu_gpu_perf     # Code for evaluating hardware performance on CPU and GPU
│   └── npu_design       # Code for evaluating our butterfly accelerator
│       ├── simulator    # Our custom simulator/performance model to get latency
│       └── verilog      # Verilog code of our hardware implementation
│           ├── functionality   # Design and testbench to evaluate the functionality of the hardware
│           ├── VCU128          # Design and constraints for synthesis, place & route on VCU128
│           └── Zynq7045        # Design and constraints for synthesis, place & route on Zynq7045
├── requirements.txt
└── software          # All code related to software experiments
    ├── speed         # Code for speed testing
    └── accuracy      # Code for training/inference to get accuracy

2. Environment Setup

2.1 Install Dependencies

We use conda to manage the required environment:

conda create -n npu_transformer python=3.8 scipy
source activate npu_transformer
pip3 install -r requirements.txt

Then, install the butterfly operation (a hand-written CUDA optimization is included in their repo):

git clone https://github.com/HazyResearch/butterfly.git
cd butterfly/
python setup.py install
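
After the installation steps above, a quick sanity check can confirm that the main packages are importable. The snippet below is a minimal sketch: the module names `torch` and `torch_butterfly` are assumptions about what requirements.txt and the butterfly repo install (only `scipy` appears in the commands above), so adjust the list to your setup.

```python
import importlib.util

# Module names are assumptions: "scipy" comes from the conda command above,
# while "torch" and "torch_butterfly" are guesses at what requirements.txt
# and the butterfly repo provide. Edit this list to match your environment.
ASSUMED_MODULES = ["scipy", "torch", "torch_butterfly"]


def check_modules(names):
    """Return {module_name: bool} telling whether each module can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules(ASSUMED_MODULES).items():
        print(f"{name:<16} {'ok' if found else 'MISSING'}")
```

Run it inside the activated npu_transformer environment; any module reported as MISSING points to an installation step that did not complete.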

To compare the speed of dense linear and butterfly linear, run:

cd NPU_Transformer/software
python -m src.bflylr
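
For intuition on why a butterfly linear layer can be cheaper than a dense one, the sketch below applies a butterfly matrix as a product of log2(n) sparse factors, each made of 2x2 blocks. It is a plain-Python illustration, not the code in src.bflylr or the HazyResearch CUDA kernel: the function names, the stride ordering, and the twiddle layout are all assumptions chosen for readability.

```python
import math


def butterfly_apply(x, twiddles):
    """Multiply vector x by a butterfly matrix given as log2(n) factors.

    twiddles[level][pair] is a 2x2 block (a, b, c, d) that mixes the two
    entries of one index pair (i, i + stride) at that level. The layout is
    an illustrative assumption, not the format used by any library.
    """
    n = len(x)
    levels = int(math.log2(n))
    if 1 << levels != n:
        raise ValueError("length must be a power of two")
    y = list(x)
    for level in range(levels):
        stride = 1 << level
        out = [0.0] * n
        pair = 0
        for block in range(0, n, 2 * stride):
            for j in range(stride):
                i0, i1 = block + j, block + j + stride
                a, b, c, d = twiddles[level][pair]
                out[i0] = a * y[i0] + b * y[i1]
                out[i1] = c * y[i0] + d * y[i1]
                pair += 1
        y = out
    return y


def butterfly_param_count(n):
    """log2(n) factors, n/2 blocks per factor, 4 weights per block."""
    return int(math.log2(n)) * (n // 2) * 4


def dense_param_count(n):
    """A dense n x n weight matrix."""
    return n * n


if __name__ == "__main__":
    n = 1024
    print("butterfly weights:", butterfly_param_count(n))
    print("dense weights:    ", dense_param_count(n))
```

Each factor touches every element once, so the work grows as n log n rather than n squared. Setting every block to (1, 1, 1, -1) turns the same routine into a Walsh-Hadamard transform, which is a convenient correctness check.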

2.2 Prepare Dataset

Download lra_release.gz, released by the LRA repo, and place the unzipped folder in:

NPU_Transformer/software/src/LRA/datasets

Then, run sh create_datasets.sh in NPU_Transformer/software/src/LRA/datasets; it creates train, dev, and test dataset pickle files for each task.
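
To confirm that the pickle files were created and are readable, a small inspection helper can be useful. The sketch below makes no assumption about the contents of the files: it only reports the type and length of the top-level object. The `*.pickle` glob pattern and both function names are hypothetical, so change the pattern if the generated files use a different extension.

```python
import pickle
from pathlib import Path


def summarize_pickle(path):
    """Load one pickle file and describe its top-level object."""
    # Only unpickle files you generated yourself; pickle can execute code.
    with open(path, "rb") as f:
        obj = pickle.load(f)
    size = len(obj) if hasattr(obj, "__len__") else None
    return {"file": Path(path).name, "type": type(obj).__name__, "len": size}


def summarize_folder(folder, pattern="*.pickle"):
    """Summarize every file in `folder` matching `pattern` (assumed name)."""
    return [summarize_pickle(p) for p in sorted(Path(folder).glob(pattern))]


if __name__ == "__main__":
    for info in summarize_folder("."):
        print(info)
```

Run it from the datasets folder after create_datasets.sh finishes; an empty result means no file matched the pattern.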

To run an experiment, go to NPU_Transformer/software/src/LRA/code. For instance, the command for running the listop task is in benchmarking_listop.sh.

3. Artifact Evaluation

3.1 Functionality

3.1.1. Verilog Design

All the testbenches (butterfly matrix multiplication, FFT, layer normalization, shortcut addition) are in hardware/npu_design/verilog/functionality/testbench/. Please read the instructions before running them; you need to generate test data first.
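
As an illustration of what a software golden model for the FFT testbench could look like, the sketch below generates random complex inputs and computes the expected outputs with a radix-2 FFT. It is not the repo's test-data generator: the function names are hypothetical, and the number format and file layout the testbenches expect are not covered here, so follow the instructions in the testbench folder for the real flow.

```python
import cmath
import random


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor for this butterfly pair.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def make_fft_case(n, seed=0):
    """Return (inputs, expected_outputs) for one reproducible random test case."""
    rng = random.Random(seed)
    x = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    return x, fft_radix2(x)


if __name__ == "__main__":
    inputs, expected = make_fft_case(8, seed=1)
    for value_in, value_out in zip(inputs, expected):
        print(value_in, "->", value_out)
```

A fixed seed keeps the inputs reproducible, which makes a mismatch between the simulated hardware output and the reference easier to track down.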

3.1.2. Training and Evaluation of FABNet on CPU/GPU

The code and scripts for evaluating speed are under software/speed/. The code and scripts for evaluating accuracy are under software/accuracy/.

3.2 Reproducibility

All the scripts are under the folder ./script_figs; we refer reviewers/users to the instructions there before running them. Some training runs may take hundreds of GPU hours to finish, so we also provide our log files at the Google Drive link.

Citation

Our paper is online now (link)! If you find it helpful, please cite us using:

@inproceedings{fan2022adaptable,
  title={Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design},
  author={Fan, Hongxiang and Chau, Thomas and Venieris, Stylianos I and Lee, Royson and Kouris, Alexandros and Luk, Wayne and Lane, Nicholas D and Abdelfattah, Mohamed S},
  booktitle={MICRO-55: 55th Annual IEEE/ACM International Symposium on Microarchitecture},
  year={2022}
}
