
Comments (7)

amirarjmand93 commented on June 30, 2024

Issues with Smaller Multipliers:

I've made some progress in handling multiplier input sizes smaller than the max hard block bit width (25) and greater than the min hard block bit width (18). The changes are within the pad_multiplier function.
It can now pass multipliers of any input size in the design file (diffeq2.v), and the result turns out OK, but I'm not sure about the pin mapping and node connection functionality.
Please take a look, @WhiteNinjaZ.
multiplier_v1.txt

from vtr-verilog-to-routing.

vaughnbetz commented on June 30, 2024

@amirarjmand93


WhiteNinjaZ commented on June 30, 2024

Here is a little more detail into what I have found:
Using Valgrind and several print statements, it looks like the issue stems from the split_multiplier() function in multiplier.cc. From what I can tell, the problem arises when this function tries to map a 32x32 multiply from diffeq2.v onto the 18x25 multiplier. In this scenario, node->num_output_pins = 32+32 = 64, b0 = 25, and addbig->num_output_pins = addsmall->num_output_pins = a1b0->num_output_pins + 1 = (32-18)+25+1 = 40. The final for loop, which remaps the pins to the new node, accesses elements b0-1 to b0-1+addbig->num_output_pins. Since b0-1+addbig->num_output_pins = 64 > node->num_output_pins - 1 = 63, we exceed our array bounds by one. If I did things right, I believe the root of the problem is the assumption (stated in the header for the split_multiplier function) that the multiplication can be balanced to remove an addition; this only holds if the hard multiplier being mapped to has inputs with equal port widths. After removing this assumption by inserting an if statement and an extra multiplier for the case where the multiplier ports are unequal, I was able to successfully compile diffeq2.v. However, I think I may have connected the nodes together incorrectly, as some of the larger benchmarks (e.g. mcml.v) fail the parmys pass. @amirarjmand93, if you could take a look that would be great.
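
To make the off-by-one easier to check, here is a minimal C++ sketch that only replays the pin-count arithmetic from this comment. The helper names (node_outputs, addbig_outputs, last_remapped_index, remap_overflows) are hypothetical and are not taken from multiplier.cc; the balanced 18/18 split is an assumed comparison case, not something tested in the thread.

```cpp
// Hypothetical helpers (not part of multiplier.cc) that replay the pin-count
// arithmetic described above for an A x B multiply split at a0 / b0.

// node->num_output_pins: full product width.
constexpr int node_outputs(int width_a, int width_b) { return width_a + width_b; }

// addbig->num_output_pins = a1b0->num_output_pins + 1, with a1 = width_a - a0.
constexpr int addbig_outputs(int width_a, int a0, int b0) {
    return (width_a - a0) + b0 + 1;
}

// Highest output index the final remap loop reaches, as described above.
constexpr int last_remapped_index(int width_a, int a0, int b0) {
    return b0 - 1 + addbig_outputs(width_a, a0, b0);
}

// True when that index lies past the last valid output pin of the node.
constexpr bool remap_overflows(int width_a, int width_b, int a0, int b0) {
    return last_remapped_index(width_a, a0, b0) > node_outputs(width_a, width_b) - 1;
}

// 32x32 multiply on the 18x25 hard block: index 64 is reached, last valid is 63.
static_assert(addbig_outputs(32, 18, 25) == 40, "addbig has 40 output pins");
static_assert(last_remapped_index(32, 18, 25) == 64, "loop reaches index 64");
static_assert(remap_overflows(32, 32, 18, 25), "unequal split overflows by one");

// Assumed balanced split (a0 == b0 == 18) for comparison: stays in bounds.
static_assert(!remap_overflows(32, 32, 18, 18), "balanced split stays in bounds");
```

The checks are static_assert statements, so the numbers are verified at compile time and nothing needs to be run.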

Here are my changes to the multiplier.cc file. Basically all my changes are in the split_multiplier function:
multiplier.txt


amirarjmand93 commented on June 30, 2024

I am approaching the split_multiplier function in the same way as you. I have also found that if the bit width of the desired multiplier (Verilog file) is lower than the minimum bit width of the hard block (architecture file), synthesis turns out correct. For example, mapping a 16x16 multiply (modified diffeq2.v) onto an 18x25 multiplier (architecture file) is OK, but the error appears when trying a larger size such as 20x20 or 32x32 (modified diffeq2.v) on the 18x25 multiplier (architecture file).

My other concern is the modified arch file. I tried various manipulations of test_multiplier_size.xml but didn't get any error related to the modified architecture. Also, is the modified version of 'k6_frac_N10_frac_chain_mem32K_40nm.xml' a valid one?


amirarjmand93 commented on June 30, 2024

As a quick update, I want to clarify some parts of the issue:

The core of the problem lies in iterate_multipliers(netlist_t *netlist) in the multiplier.cc file.

Arch file: test_multiplier_size.xml (fixed 25x18)

Design file: diffeq2.v (modified)


Success with Large Multipliers:

I checked multiplier.txt. I think we have been approaching this with the same idea of replacing the "concatenating" methodology ((a1 * b1) . (a0 * b0)) with "addsmall2" ((a1 * b1) + (a0 * b0)). Now, calling the split_multiplier(node, a0, b0, a1, b1, netlist) function can handle multiplicands wider than the max hard block bit width (25). Good job.
So 26x26, 27x27, ..., 35x35 multipliers pass fine. (35x35 is the maximum allowed; see the section on very large multipliers below.)
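
For reference, the general algebra behind splitting a multiply into four partial products can be sketched as below. This is not the code from multiplier.txt: split_product and low_bits are hypothetical names, and the sketch only checks the arithmetic identity when the two operands are split at different bit positions ka and kb (as with an 18x25 hard block).

```cpp
#include <cstdint>

// Hypothetical helper: keep the k low-order bits of x.
constexpr std::uint64_t low_bits(std::uint64_t x, int k) {
    return x & ((std::uint64_t{1} << k) - 1);
}

// Rebuild a * b from four partial products, splitting a at bit ka and b at
// bit kb:  a = a1 * 2^ka + a0,  b = b1 * 2^kb + b0.
// The cross terms are shifted by ka and kb respectively. Only when ka == kb
// do they share a shift, so only then can they be summed before shifting.
constexpr std::uint64_t split_product(std::uint64_t a, std::uint64_t b, int ka, int kb) {
    return (((a >> ka) * (b >> kb)) << (ka + kb))   // a1 * b1
         + (((a >> ka) * low_bits(b, kb)) << ka)    // a1 * b0
         + ((low_bits(a, ka) * (b >> kb)) << kb)    // a0 * b1
         + low_bits(a, ka) * low_bits(b, kb);       // a0 * b0
}

// 32-bit operands split for an 18x25 block (unequal split points).
static_assert(split_product(0xFFFFFFFFull, 0xFFFFFFFFull, 18, 25)
                  == 0xFFFFFFFFull * 0xFFFFFFFFull,
              "unequal split reproduces the full product");
// Same operands with a balanced 18/18 split.
static_assert(split_product(0xFFFFFFFFull, 0xFFFFFFFFull, 18, 18)
                  == 0xFFFFFFFFull * 0xFFFFFFFFull,
              "balanced split reproduces the full product");
```

The sketch says nothing about how the netlist nodes are wired; it only shows which shift each partial product needs for the sum to equal the full product.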

Issues with Smaller Multipliers:

The problem still persists for multiplier bit widths less than 25 (the max hard block bit width), so designs with multiplier bit widths below 25, like 24x24, 23x23, ..., 18x18, cannot be passed through the 25*18 hard block correctly. This error stems from calling pad_multiplier(node, netlist);. Inside this function, a variable named "diffb" goes negative, and the oassert function terminates the program with an error.

Issues with Handling Very Large Multipliers:

I think the mcml.v file contains a 64x64 multiplication. A bit width this large cannot be mapped correctly onto a 25*18 hard block. I think the reason is that the splitter splits the inputs only once, so the split multiplicands (inputs) must be less than or equal to the min hard block bit width (18), unless the splitter used a recursive method to break down the new multiplicands (inputs) again until they fit in the hard block (a divide-and-conquer algorithm).
So it should follow: min hard block bit width > multiplier bit width / 2.
(Here, for mcml.v and the 25x18 arch, the min hard block bit width is 18 and multiplier bit width / 2 is 32, so the inequality fails; 35x35 is the maximum allowed input multiplier size.)
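
The inequality above can be written as a small C++ check. The function name fits_after_one_split is hypothetical and does not exist in multiplier.cc; it only encodes the condition as stated, using 2 * min > width so that odd widths are not rounded down by integer division.

```cpp
// Hypothetical check (not part of multiplier.cc) of the condition stated above:
//   min hard block bit width > multiplier bit width / 2
// written as 2 * min > width to avoid integer-division rounding.
constexpr bool fits_after_one_split(int multiplier_width, int min_hard_block_width) {
    return 2 * min_hard_block_width > multiplier_width;
}

// With a min hard block width of 18 (the 25x18 arch):
static_assert(fits_after_one_split(35, 18), "35x35 satisfies the inequality");
static_assert(!fits_after_one_split(36, 18), "36x36 does not");
static_assert(!fits_after_one_split(64, 18), "64x64 (as in mcml.v) does not");
```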


WhiteNinjaZ commented on June 30, 2024

I have run the changes through several different hard multiplier sizes with unequal input widths, and all the ones I ran worked on diffeq2.v. Nice work! As you mentioned, mcml is still broken because of the 64x64 multiplier. Looking at parmys's generated netlist, it also looks like a few of the multipliers' input pins from the split multiplier a/b functions are completely unconnected. I am currently looking into this and will let you know what I find.


amirarjmand93 commented on June 30, 2024

Thank you Joshua,
I have some suggestions that may be helpful.
Please ignore the changes in the padding function and work on your code. Next, test on Verilog multiplier designs (<18) and (>25). Mind the max boundary (not more than 35). In other words, avoid middle-range (18~25) multiplier sizes. Keep the arch file intact (18*25). Check the netlist connection status.
(I think mcml works on the baseline arch (k6...) because of its single 36 * 36 and dual 18 * 18 multiplier hard blocks, which satisfy the inequality mentioned earlier. Maybe!)

