uwsampa / cse548-labs

A repository containing homework labs for CSE548

License: MIT License

Makefile 0.43% Tcl 68.89% C 1.99% C++ 10.79% Jupyter Notebook 12.53% Python 5.38%

cse548-labs's People

Contributors

tmoreau89

cse548-labs's Issues

[Part 1] Part `xc7k160tfbg484-2`, chapter 2 needed to complete linked tutorial

I think (and I could have been confused about this because it was the beginning of the assignment) that the installation instructions should have us install an additional part in order to follow the recommended tutorial (chapters 6 and 7 of the Xilinx guide).

When attempting to play along with chapters 6 or 7, the project won't run. The error messages that come up reference something about a "part" not being set, which is confusing when you're just starting out. Googling the error message turns up a bunch of forum posts telling people to run commands in a "Tcl console," which is also confusing (if you're in the GUI and don't know what "Tcl" is).

What I ended up realizing is that when you open up the tutorial files, you need to specify which part (the specific FPGA device, which I just think of as the "FPGA board") you're going to use as a target to build the code. (I guess it's analogous to picking your compiler target between x86 and ARM when compiling "normal" code.)

This is covered in the Xilinx guide in chapter 2 (specifically, page 15).

However, the part they end up picking (xc7k160tfbg484-2) isn't one that we have installed (I think). So, instead, I picked a part that seemed to be very similar to the one they recommend. However, it's different enough that when you run through chapters 6 and 7, all of the numbers and diagrams are off from what they show in the guide. This makes following through and doing the analysis a bit tricky (especially if everything is totally foreign and you're still trying to get your bearings on what the heck is going on).

So, if installing that part for free is possible, it might be good for future iterations of the lab to have that part installed and point students to chapter 2 to get it set up.

(Apologies if I'm crazy here / feel free to shame me in the comments.)

[Part 1 A] Initial (unmodified) report different than expected

After going into zynq/hls/mmult_float and running vivado_hls -f hls.tcl, the numbers in my report, without changing anything, are different from expected (what's given in the assignment). I ran multiple times. It does seem to be using the correct device (xc7z020clg484-1). Given we're doing optimizations, does this difference matter?

It looks (from my naive investigation) like the L3 inner loop has 11 instead of 10 iteration latency, which bumps up the overall latency by 10%.

Here's what I get:

Latency

expected:

+--------+--------+--------+--------+---------+
|     Latency     |     Interval    | Pipeline|
|   min  |   max  |   min  |   max  |   Type  |
+--------+--------+--------+--------+---------+
|  209851|  209851|  209852|  209852|   none  |
+--------+--------+--------+--------+---------+

mine (about 10% slower):

+--------+--------+--------+--------+---------+
|     Latency     |     Interval    | Pipeline|
|   min  |   max  |   min  |   max  |   Type  |
+--------+--------+--------+--------+---------+
|  230331|  230331|  230332|  230332|   none  |
+--------+--------+--------+--------+---------+

Utilization

expected:

+-----------------+---------+-------+--------+-------+
|       Name      | BRAM_18K| DSP48E|   FF   |  LUT  |
+-----------------+---------+-------+--------+-------+
|DSP              |        -|      -|       -|      -|
|Expression       |        -|      -|       0|    308|
|FIFO             |        -|      -|       -|      -|
|Instance         |        0|      5|     384|    751|
|Memory           |       16|      -|       0|      0|
|Multiplexer      |        -|      -|       -|    381|
|Register         |        -|      -|     714|      -|
+-----------------+---------+-------+--------+-------+
|Total            |       16|      5|    1098|   1440|
+-----------------+---------+-------+--------+-------+
|Available        |      280|    220|  106400|  53200|
+-----------------+---------+-------+--------+-------+
|Utilization (%)  |        5|      2|       1|      2|
+-----------------+---------+-------+--------+-------+

mine (FF/LUT higher):

+-----------------+---------+-------+--------+-------+
|       Name      | BRAM_18K| DSP48E|   FF   |  LUT  |
+-----------------+---------+-------+--------+-------+
|DSP              |        -|      -|       -|      -|
|Expression       |        -|      -|       0|    537|
|FIFO             |        -|      -|       -|      -|
|Instance         |        0|      5|     384|    751|
|Memory           |       16|      -|       0|      0|
|Multiplexer      |        -|      -|       -|    558|
|Register         |        -|      -|     779|      -|
+-----------------+---------+-------+--------+-------+
|Total            |       16|      5|    1163|   1846|
+-----------------+---------+-------+--------+-------+
|Available        |      280|    220|  106400|  53200|
+-----------------+---------+-------+--------+-------+
|Utilization (%)  |        5|      2|       1|      3|
+-----------------+---------+-------+--------+-------+
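
To see which rows account for the higher FF/LUT totals, here is a small sketch that diffs the two utilization tables above. The numbers are copied from the reports; treating "-" entries as 0 and the helper name `row_deltas` are my own assumptions for illustration, not anything produced by the tool.

```python
# (FF, LUT) per row, copied from the two utilization tables above.
# Assumption: "-" entries in the report are treated as 0.
expected = {
    "Expression": (0, 308),
    "Instance": (384, 751),
    "Multiplexer": (0, 381),
    "Register": (714, 0),
}
mine = {
    "Expression": (0, 537),
    "Instance": (384, 751),
    "Multiplexer": (0, 558),
    "Register": (779, 0),
}


def row_deltas(expected, mine):
    """Return {row name: (FF delta, LUT delta)} for rows that differ."""
    deltas = {}
    for name, (ff_e, lut_e) in expected.items():
        ff_m, lut_m = mine[name]
        if (ff_m, lut_m) != (ff_e, lut_e):
            deltas[name] = (ff_m - ff_e, lut_m - lut_e)
    return deltas


deltas = row_deltas(expected, mine)
for name, (d_ff, d_lut) in deltas.items():
    print(f"{name}: FF {d_ff:+d}, LUT {d_lut:+d}")

# Sum the per-row deltas to compare against the Total row.
total_ff = sum(d_ff for d_ff, _ in deltas.values())
total_lut = sum(d_lut for _, d_lut in deltas.values())
print(f"Total: FF {total_ff:+d}, LUT {total_lut:+d}")
```

The Instance row is identical in both reports, so the extra 65 FFs and 406 LUTs come entirely from the Expression, Multiplexer, and Register rows.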

Loop performance

expected:

+--------------+--------+--------+----------+-----------+-----------+------+----------+
|              |     Latency     | Iteration|  Initiation Interval  | Trip |          |
|   Loop Name  |   min  |   max  |  Latency |  achieved |   target  | Count| Pipelined|
+--------------+--------+--------+----------+-----------+-----------+------+----------+
|- LOAD_OFF_1  |      10|      10|         2|          -|          -|     5|    no    |
|- LOAD_W_1    |    2580|    2580|       258|          -|          -|    10|    no    |
| + LOAD_W_2   |     256|     256|         2|          -|          -|   128|    no    |
|- LOAD_I_1    |    2064|    2064|       258|          -|          -|     8|    no    |
| + LOAD_I_2   |     256|     256|         2|          -|          -|   128|    no    |
|- L1          |  205056|  205056|     25632|          -|          -|     8|    no    |
| + L2         |   25630|   25630|      2563|          -|          -|    10|    no    |
|  ++ L3       |    2560|    2560|        10|          -|          -|   256|    no    |
|- STORE_O_1   |     136|     136|        17|          -|          -|     8|    no    |
| + STORE_O_2  |      15|      15|         3|          -|          -|     5|    no    |
+--------------+--------+--------+----------+-----------+-----------+------+----------+

mine (L1/L2/L3 are slower; might this all stem from L3 having an iteration latency of 11 instead of 10?):

+--------------+--------+--------+----------+-----------+-----------+------+----------+
|              |     Latency     | Iteration|  Initiation Interval  | Trip |          |
|   Loop Name  |   min  |   max  |  Latency |  achieved |   target  | Count| Pipelined|
+--------------+--------+--------+----------+-----------+-----------+------+----------+
|- LOAD_OFF_1  |      10|      10|         2|          -|          -|     5|    no    |
|- LOAD_W_1    |    2580|    2580|       258|          -|          -|    10|    no    |
| + LOAD_W_2   |     256|     256|         2|          -|          -|   128|    no    |
|- LOAD_I_1    |    2064|    2064|       258|          -|          -|     8|    no    |
| + LOAD_I_2   |     256|     256|         2|          -|          -|   128|    no    |
|- L1          |  225536|  225536|     28192|          -|          -|     8|    no    |
| + L2         |   28190|   28190|      2819|          -|          -|    10|    no    |
|  ++ L3       |    2816|    2816|        11|          -|          -|   256|    no    |
|- STORE_O_1   |     136|     136|        17|          -|          -|     8|    no    |
| + STORE_O_2  |      15|      15|         3|          -|          -|     5|    no    |
+--------------+--------+--------+----------+-----------+-----------+------+----------+
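
As a sanity check on the L3 theory, here is a rough sketch that rebuilds the overall latency from the loop tables. It assumes an unpipelined loop's latency is trip count times iteration latency. The per-level overheads (3 cycles around L3, 2 around L2, 5 at the top level) are inferred by subtracting numbers in the reports, and the function name `total_latency` is hypothetical.

```python
# Trip counts for the nested compute loops, from the loop tables above.
L1_TRIPS, L2_TRIPS, L3_TRIPS = 8, 10, 256

# Latencies of the loops that are identical in both reports:
# LOAD_OFF_1, LOAD_W_1, LOAD_I_1, STORE_O_1.
OTHER_LOOPS = 10 + 2580 + 2064 + 136

# Overheads inferred from the reports (assumptions, not documented values):
L2_BODY_OVERHEAD = 3   # L2 iteration latency minus L3 latency (2563 - 2560)
L1_BODY_OVERHEAD = 2   # L1 iteration latency minus L2 latency (25632 - 25630)
TOP_OVERHEAD = 5       # total latency minus the sum of top-level loops


def total_latency(l3_iteration_latency):
    """Model the overall latency for a given L3 iteration latency."""
    l3 = L3_TRIPS * l3_iteration_latency
    l2 = L2_TRIPS * (l3 + L2_BODY_OVERHEAD)
    l1 = L1_TRIPS * (l2 + L1_BODY_OVERHEAD)
    return l1 + OTHER_LOOPS + TOP_OVERHEAD


expected_total = total_latency(10)
my_total = total_latency(11)
print(expected_total, my_total, my_total - expected_total)
```

Under these assumptions the model reproduces both reported totals (209851 and 230331), and the gap of 20480 cycles is exactly 8 x 10 x 256, one extra cycle per L3 iteration. So the single extra cycle in L3 does seem to account for the whole slowdown, which works out to about 9.8%.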
