SUSTECH-AcceTech

This repository collects public references on accelerating AI SoCs with high energy efficiency. The main application areas are HPC (high-performance computing), NAS (neural architecture search), and low-power edge computing.

Papers

Computing

Mixed-Precision

  • 2018 | BISMO: A Scalable Bit-Serial Matrix Multiplication Overlay for Reconfigurable Computing | NTNU Group | FPL | PDF
  • 2018 | Bit Fusion: Bit-Level Dynamically Composable Architecture for Accelerating Deep Neural Networks | H. Sharma, et al. | ISCA | PDF
  • 2018 | Harnessing GPU Tensor Cores for Fast FP16 Arithmetic to Speed up Mixed-Precision Iterative Refinement Solvers | A. Haidar, et al. | Supercomputing | PDF
  • 2018 | Mixed-precision in-memory computing | Manuel Le Gallo, et al. | Nature Electronics | PDF
  • 2018 | DNPU: An Energy-Efficient Deep-Learning Processor with Heterogeneous Multi-Core Architecture | KAIST Group | IEEE Micro | PDF look-up Table
  • 2018 | DNPU: An 8.1TOPS/W Reconfigurable CNN-RNN Processor for General-Purpose Deep Neural Networks | KAIST Group | ISSCC | PDF look-up Table
  • 2018 | Efficient Fixed/Floating-Point Merged Mixed-Precision Multiply-Accumulate Unit for Deep Learning Processors | H. Zhang, et al. | ISCAS | PDF
  • 2018 | Mixed Precision Training | Baidu | ICLR | PDF
  • 2017 | Bit-Pragmatic Deep Neural Network Computing | UToronto Group | MICRO | PDF
  • 2016 | Stripes: Bit-Serial Deep Neural Network Computing | UToronto Group | MICRO | PDF

Multiplication

  • 2015 | Performance Analysis of Karatsuba Multiplication Algorithm for Different Bit Lengths | C. Eyupoglu | PDF

  • 2015 | An Efficient Floating Point Multiplier Design for High Speed Applications using Karatsuba Algorithm and Urdhva-Tiryagbhyam Algorithm | S. Arish, et al. | ICSC | PDF

  • 2014 | An Efficient Baugh-Wooley Multiplication Algorithm for 32-bit Synchronous Multiplication | PDF

  • 2013 | Implementation of Baugh-Wooley Multiplier Based on Soft-Core Processor | PDF

  • 2012 | An Efficient Baugh-Wooley Architecture for Both Signed & Unsigned Multiplication | PDF

  • 2012 | A High Speed Wallace Tree Multiplier Using Modified Booth Algorithm for Fast Arithmetic Circuits | PDF

NAS

  • 2019 | HAQ: Hardware-Aware Automated Quantization with Mixed Precision | Song Han Group | CVPR | PDF

Contributors

greatmao-ai
