matpalm / cached_dilated_causal_convolutions

1D dilated causal convolutions with extreme caching for 5µs inference on an FPGA

Home Page: https://matpalm.com/blog/wavenet_on_fpga/

License: MIT License

C++ 0.74% Jupyter Notebook 91.31% Python 5.27% Makefile 0.24% C 0.40% Shell 0.11% SystemVerilog 1.94%

cached_dilated_causal_convolutions's Issues

HP filter for discontinuity cases?

demo with ramp or square as core input shows how glitchy it is with a discontinuity
maybe a HP filter would help? can prototype by trying with other euro filter and if it looks worth it, can add in module from eurorack pmod for it.
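
One way to prototype this offline before touching the eurorack pmod is a first-order high-pass filter applied to a recorded or synthetic signal. The sketch below is only an assumption about what such a filter could look like: the function name, the 200 Hz cutoff and the 48 kHz sample rate are all hypothetical, not values from the project.

```python
import math

def one_pole_highpass(samples, cutoff_hz, sample_rate_hz):
    """First-order high-pass: y[n] = a * (y[n-1] + x[n] - x[n-1])."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    a = rc / (rc + dt)
    out = []
    prev_x = prev_y = 0.0
    for x in samples:
        y = a * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out

# hypothetical test signal: a square wave with an edge every 50 samples
square = [1.0 if (n // 50) % 2 == 0 else -1.0 for n in range(200)]
filtered = one_pole_highpass(square, cutoff_hz=200.0, sample_rate_hz=48_000.0)
```

Plotting `square` against `filtered` shows how much of each edge survives the filter, which is the thing to judge before deciding whether a hardware module is worth adding.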

generate synthetic training data

the current project builds on a previous one where i was collecting data for another task. but if the goal is just wave shaping, the initial training data can simply be generated from scratch synthetically ( with augmentation, noise, etc )
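
A minimal sketch of what synthetic generation could look like, using only the standard library. Everything specific here is an assumption for illustration: the sine input, the `tanh` target shape, the noise level and the 48 kHz sample rate are hypothetical and not taken from the project.

```python
import math
import random

def synth_pair(num_samples, freq_hz, sample_rate_hz, noise_std, rng):
    """Return (noisy_input, clean_target) lists for one synthetic example."""
    xs, ys = [], []
    for n in range(num_samples):
        phase = 2.0 * math.pi * freq_hz * n / sample_rate_hz
        clean = math.sin(phase)
        # hypothetical target shape: soft clipping of the clean wave
        ys.append(math.tanh(3.0 * clean))
        # augmentation: additive Gaussian noise on the input only
        xs.append(clean + rng.gauss(0.0, noise_std))
    return xs, ys

rng = random.Random(0)  # fixed seed so the dataset is reproducible
x, y = synth_pair(num_samples=1024, freq_hz=220.0,
                  sample_rate_hz=48_000.0, noise_std=0.01, rng=rng)
```

Varying `freq_hz` and `noise_std` per example would give a training set of arbitrary size without any recording step.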

optimise implementation

po2

  • i'm 90% sure the way i've done the po2 multiply, especially w.r.t using single bits in memory for is_negative etc, could be done in a much more efficient way. it feels wrong to optimise something to the level of shift operators, but then require a lookup for the value to shift by ( maybe i'm missing something re: compilation etc )
    • note: an extreme value of this could be writing specific modules per multiply with the shift and ( possible negation ) logic built in...
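
For reference, the idea behind a power-of-two multiply can be sketched in Python as below. This is an illustrative model only, not the project's SystemVerilog: the `(is_negative, shift)` encoding and both function names are assumptions, and zero weights are not handled.

```python
import math

def encode_po2(weight):
    """Encode a non-zero float weight as (is_negative, shift), where
    |weight| is approximated by 2 ** -shift."""
    is_negative = weight < 0
    shift = -round(math.log2(abs(weight)))
    return is_negative, shift

def po2_multiply(x, is_negative, shift):
    """Multiply integer fixed-point x by +/- 2 ** -shift using shifts only."""
    y = x >> shift if shift >= 0 else x << -shift
    return -y if is_negative else y

is_neg, shift = encode_po2(-0.25)          # (True, 2)
print(po2_multiply(4096, is_neg, shift))   # -1024
```

In this model the "lookup" is the read of `shift` and `is_negative`; the per-multiply module idea in the note would amount to hard-coding those two values into each call site.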

qb

  • in the qb_ network we have >50% free cycles, but not many free multiplier DSPs. so how best to restructure the modules so the same module can be clocked twice with different sets of weights?

reuse inner conv

the v1.0 release code is clumsy and even though we have cycles left we can't add another conv.

but we can reuse one using tied weights

so instead of training `input -> conv0 -> output` we could train `input -> conv0 -> conv0 -> output` by just reusing the qconv layer, which will train with tied weights. then at inference time we just need 1 additional activation cache and the ability to switch in/out for the inner cache ( which i guess is done with registering? )
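
To make the tied-weights idea concrete, here is a small pure-Python sketch in which one kernel is applied twice. It is a simplification under stated assumptions: single channel, no activation function between the two passes, and `conv0_kernel` holds hypothetical weights rather than anything from the trained network.

```python
def causal_conv1d(xs, kernel, dilation=1):
    """Single-channel causal convolution:
    y[n] = sum_k kernel[k] * x[n - k*dilation], with zeros before the start."""
    ys = []
    for n in range(len(xs)):
        acc = 0.0
        for k, w in enumerate(kernel):
            i = n - k * dilation
            if i >= 0:
                acc += w * xs[i]
        ys.append(acc)
    return ys

conv0_kernel = [0.5, 0.25]  # hypothetical weights, shared by both passes

def tied_forward(xs):
    hidden = causal_conv1d(xs, conv0_kernel)     # first pass through conv0
    return causal_conv1d(hidden, conv0_kernel)   # second pass, same weights

print(tied_forward([1.0, 0.0, 0.0, 0.0]))   # [0.25, 0.25, 0.0625, 0.0]
```

The parameter count is unchanged by the second pass; the extra cost is the `hidden` buffer, which corresponds to the additional activation cache mentioned above.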

skip next sample when network still running?

current v1.0 behaviour is to reset network each sample_clk
but if the network hasn't finished running, this means we just reset and never emit anything

should we set/check a network_running bit so that if sample_clk occurs we can decide to skip it?
or will this then just be the same as running at half the clock speed?
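
A toy simulation can help reason about the two policies. The model below is an assumption, not the v1.0 logic: `simulate`, the tick counts and the `skip_when_running` flag are all hypothetical, and the network is treated as a fixed-length run that emits one output when it completes.

```python
def simulate(total_ticks, sample_period, network_ticks, skip_when_running):
    """Count outputs emitted over total_ticks clock ticks."""
    remaining = 0   # ticks left until the current run finishes; 0 means idle
    outputs = 0
    for tick in range(total_ticks):
        if tick % sample_period == 0:          # sample_clk fires
            if remaining == 0 or not skip_when_running:
                remaining = network_ticks      # (re)start the network
        if remaining > 0:
            remaining -= 1
            if remaining == 0:
                outputs += 1                   # run completed, emit a sample
    return outputs

# network needs 15 ticks but sample_clk fires every 10
print(simulate(100, 10, 15, skip_when_running=False))  # 0
print(simulate(100, 10, 15, skip_when_running=True))   # 5
```

In this toy model, resetting on every sample_clk never emits anything, while skipping emits on every second sample_clk, which is indeed equivalent to running at half the sample rate whenever the run takes between one and two sample periods.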

do eurorack pmod calibration properly

couldn't work out the right way to do calibration so ended up with a hack on both the verilog and training side :/

from notes ( see 2023 10 17 )

sending +5V => 20,000    ( that's from the COUNT_PER_VOLT = 4000 ) 

...

just realised that +/-5V => +/- 20_000
could be mapped to +/-5V => +/- 5_000 with  >>> 2;
and then back with << 2 ?

this works. and 5000 is close to 4096...

so do we need to rescale on the way in?

i.e. we want 5V = 16_384    ( since 16384 >> 2 = 4096 )
so we want to scale by 0.8192    ( = 16384 / 20000 )

currently 16384 is 4.096V
so we want to divide everything in the data by 

in fact, even simpler....

if we map everything by >>> 2 coming in  ( and by << 2 on way out )
then we have 5V = 5000 = 0x1388 = 1.220703125

the data is based on 8V=1.0, so 5V = 0.625
so we just need to multiply all the data by 1.220703125/(5/8)=1.953125 
during training....

means the max the net will take, or output, is +/- 1.22 but this should be fine and is well within FP4.12
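
The arithmetic in these notes can be checked with a few lines of Python. `COUNT_PER_VOLT` and the 8V = 1.0 convention come from the notes above; the other variable names are just for illustration, and FP4.12 is assumed to mean 12 fractional bits, so 1.0 is 4096.

```python
COUNT_PER_VOLT = 4000          # from the notes: +5V => 20,000
FRACTIONAL_BITS = 12           # FP4.12: 1.0 == 4096

raw_5v = 5 * COUNT_PER_VOLT                            # 20000
shifted_5v = raw_5v >> 2                               # 5000 == 0x1388
as_fixed_point = shifted_5v / (1 << FRACTIONAL_BITS)   # 1.220703125

data_5v = 5 / 8                                # training data uses 8V == 1.0
training_scale = as_fixed_point / data_5v      # 1.953125

print(hex(shifted_5v), as_fixed_point, training_scale)
# 0x1388 1.220703125 1.953125
```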
