ucb-bar / berkeley-hardfloat

License: Other

Makefile 0.39% Scala 87.72% C++ 4.32% C 4.02% SystemVerilog 1.99% Verilog 0.61% Nix 0.95%

berkeley-hardfloat's Introduction

Berkeley Hardware Floating-Point Units

This repository contains hardware floating-point units written in Chisel. The library provides parameterized floating-point units for fused multiply-add operations, conversions between integer and floating-point numbers, and conversions between floating-point numbers of different precisions.

WARNING: These units are works in progress. They may not yet be completely free of bugs, nor are they fully optimized.

Recoded Format

The floating-point units in this repository work on an internal recoded format (the exponent has one additional bit) to handle subnormal numbers more efficiently in a microprocessor. A more detailed explanation will come soon, but in the meantime here are some example mappings for single-precision numbers.

IEEE format                           Recoded format
----------------------------------    -----------------------------------
s 00000000 00000000000000000000000    s 000------ 00000000000000000000000
s 00000000 00000000000000000000001    s 001101011 00000000000000000000000
s 00000000 0000000000000000000001f    s 001101100 f0000000000000000000000
s 00000000 000000000000000000001ff    s 001101101 ff000000000000000000000
    ...              ...                   ...              ... 
s 00000000 001ffffffffffffffffffff    s 001111111 ffffffffffffffffffff000
s 00000000 01fffffffffffffffffffff    s 010000000 fffffffffffffffffffff00
s 00000000 1ffffffffffffffffffffff    s 010000001 ffffffffffffffffffffff0
s 00000001 fffffffffffffffffffffff    s 010000010 fffffffffffffffffffffff
s 00000010 fffffffffffffffffffffff    s 010000011 fffffffffffffffffffffff
    ...              ...                   ...              ... 
s 11111101 fffffffffffffffffffffff    s 101111110 fffffffffffffffffffffff
s 11111110 fffffffffffffffffffffff    s 101111111 fffffffffffffffffffffff
s 11111111 00000000000000000000000    s 110------ -----------------------
s 11111111 fffffffffffffffffffffff    s 111------ fffffffffffffffffffffff
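
The mappings above can be reproduced with a small software model. The following is a minimal Python sketch for single precision only, written from the table rather than from the Chisel sources. The function name recode_f32 is hypothetical, and the don't-care bits (shown as "-" in the table) are arbitrarily set to 0 here.

```python
def recode_f32(bits: int) -> int:
    """Map a 32-bit IEEE single-precision pattern to a 33-bit recoded pattern.

    Sketch derived from the example table; don't-care bits are emitted as 0.
    """
    sign = (bits >> 31) & 1
    exp = (bits >> 23) & 0xFF
    fract = bits & 0x7FFFFF

    if exp == 0 and fract == 0:
        # Zero: the top three exponent bits are 000.
        rec_exp, rec_fract = 0, 0
    elif exp == 0:
        # Subnormal: shift the leading 1 out so it becomes the hidden bit,
        # and lower the exponent by the number of leading zeros.
        norm_dist = 23 - fract.bit_length()
        rec_exp = 129 - norm_dist
        rec_fract = (fract << (norm_dist + 1)) & 0x7FFFFF
    elif exp == 0xFF:
        # Infinity maps to 110------, NaN maps to 111------.
        rec_exp = 0b110000000 if fract == 0 else 0b111000000
        rec_fract = fract
    else:
        # Normal numbers: the exponent is offset by a constant.
        rec_exp = exp + 129
        rec_fract = fract

    return (sign << 32) | (rec_exp << 23) | rec_fract


def show(bits: int) -> str:
    """Format a recoded value as 'sign exponent fraction' in binary."""
    r = recode_f32(bits)
    return f"{r >> 32:01b} {(r >> 23) & 0x1FF:09b} {r & 0x7FFFFF:023b}"
```

For example, show(0x00000001) gives the recoded form of the minimum subnormal, and show(0x00800000) gives the minimum normal.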

Unit-Testing

To unit-test these floating-point units, you need the berkeley-testfloat-3 package.

To test floating-point units with the C simulator:

$ make

berkeley-hardfloat's Issues

Question about arbitrary precision support

Hello,

I am studying the FPU module of Rocket Chip SoC Generator, which uses this library. I also see that all the modules here are configurable (using expWidth: Int, sigWidth: Int). Those values are then set in Rocket Chip as (8, 24) for single precision and (11, 53) for double precision.

My first question: is this library also valid for arbitrary bit-widths that are not IEEE-standard, e.g. expWidth = 6, sigWidth = 18?
And what about the IEEE half-precision format?

Thank you!
Best regards,
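
For reference, the field widths implied by a given (expWidth, sigWidth) pair can be worked out with simple arithmetic. The sketch below assumes, based on the (8, 24) and (11, 53) values quoted above, that sigWidth counts the hidden leading bit; the helper name fn_layout is hypothetical. It only illustrates the bit layout and says nothing about whether the library has been validated for non-standard widths.

```python
def fn_layout(exp_width: int, sig_width: int) -> dict:
    """Field widths implied by (expWidth, sigWidth).

    Assumption: sigWidth includes the hidden bit, so the stored fraction
    has sigWidth - 1 bits.
    """
    return {
        "ieee_bits": exp_width + sig_width,         # sign + exponent + fraction
        "recoded_bits": exp_width + sig_width + 1,  # one extra exponent bit
        "fraction_bits": sig_width - 1,
        "exponent_bias": (1 << (exp_width - 1)) - 1,
    }


for name, (e, s) in {
    "half": (5, 11),
    "single": (8, 24),
    "double": (11, 53),
    "custom": (6, 18),
}.items():
    print(name, fn_layout(e, s))
```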

Mult for Mantissa

I looked into MulAddRecFN.scala, which seems to be the body of the floating-point multiply-add (out = A*B + C).
The flow is as follows:
in -> preMul -> postMul -> out

preMul

Extract sign, exponent, and significand with rawFloatFromRecFN().
Align the significand of C.
Output.

postMul

Addition with C (line #206)
Normalize the result
Round the result

I could not find the multiplication of the mantissas of operands A and B. Where does it actually happen?
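
To make the role of the significand multiplication concrete, here is a toy Python model of an exact multiply-add on (sign, exponent, significand) triples. It is a sketch, not the library's datapath: the names value and fma_exact are hypothetical, rounding and normalization are omitted, and it makes no claim about where MulAddRecFN.scala performs the multiplication.

```python
from fractions import Fraction


def value(x):
    """Exact value of a (sign, exp, sig) triple: (-1)**sign * sig * 2**exp."""
    sign, exp, sig = x
    return (-1) ** sign * sig * Fraction(2) ** exp


def fma_exact(a, b, c):
    """Exact A*B + C on (sign, exp, sig) triples, without rounding."""
    (sa, ea, ma), (sb, eb, mb), (sc, ec, mc) = a, b, c

    # Product: signs XOR, exponents add, significands multiply.
    prod_sign = sa ^ sb
    prod_exp = ea + eb
    prod_sig = ma * mb

    # Align the product and C to the smaller exponent.
    exp = min(prod_exp, ec)
    p = prod_sig << (prod_exp - exp)
    q = mc << (ec - exp)

    # Signed addition of the aligned significands.
    total = (-p if prod_sign else p) + (-q if sc else q)
    return (1 if total < 0 else 0, exp, abs(total))


a = (0, -1, 3)   # 1.5
b = (1, 0, 5)    # -5
c = (0, -3, 1)   # 0.125
r = fma_exact(a, b, c)
print(r, value(r))
```

A hardware unit would follow the addition with normalization and a single rounding step, which is what the postMul stage is described as doing above.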

How to compile it to verilog!

Hi,
I have installed sbt on my Windows machine, but when I run make it produces these errors:

git submodule update --init berkeley-testfloat-3
fatal: not a git repository (or any of the parent directories): .git
make: *** [Makefile:9: berkeley-testfloat-3/.git] Error 128

So how should I generate Verilog for an FPGA implementation?

Can you provide a detailed explanation of floating point division? The code is too hard to read.

dependencies issue

sbt.librarymanagement.ResolveException: Error downloading edu.berkeley.cs:hardfloat_2.13:1.5-SNAPSHOT.

but in https://repo1.maven.org/maven2/edu/berkeley/cs/ only found hardfloat_2.12

The algorithm used in DivSqrtRecF64_mulAddZ31 to calculate square-root

What algorithm is used to calculate the square root in DivSqrtRecF64_mulAddZ31.scala? Does it use the Goldschmidt or the Newton-Raphson algorithm, and can you provide some references for the algorithm it uses?
Thank you.
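
For readers unfamiliar with the names in this question, the following Python sketch shows what a Newton-Raphson iteration for the reciprocal square root looks like in general. It does not indicate which algorithm DivSqrtRecF64_mulAddZ31.scala uses; the function names, the fixed initial guess, and the iteration count are all illustrative assumptions.

```python
import math


def rsqrt_newton(x: float, iterations: int = 8) -> float:
    """Approximate 1/sqrt(x) for 1 <= x < 4 by Newton-Raphson iteration."""
    y = 0.5  # crude initial guess; hardware would typically use a lookup table
    for _ in range(iterations):
        # Newton step for f(y) = 1/y**2 - x: uses only multiplies and a subtract.
        y = y * (3.0 - x * y * y) / 2.0
    return y


def sqrt_newton(x: float) -> float:
    """sqrt(x) = x * (1/sqrt(x))."""
    return x * rsqrt_newton(x)


for x in (1.0, 2.0, 3.5):
    print(x, sqrt_newton(x), math.sqrt(x))
```

Each step roughly doubles the number of correct bits, which is why such iterations map well onto an existing multiplier or multiply-add unit.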

How to include in my own projects?

Hello,

I wanted to ask what the best way to include these modules in my own projects would be? Given that the last release on Maven is a few years old, I guess you need to include the whole repository somehow? Thank you!

make error

After I cloned this project and ran make, the build failed. Could anyone help me fix this issue?
The log shows:
[info] running hardfloat.FMATest f16FromRecF16 -td test-f16FromRecF16
[error] (run-main-0) java.lang.NoClassDefFoundError: firrtl/options/StageError
[error] java.lang.NoClassDefFoundError: firrtl/options/StageError
[error] at hardfloat.FMATest$.main(tests.scala:62)
[error] at hardfloat.FMATest.main(tests.scala)
[error] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[error] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[error] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[error] at java.lang.reflect.Method.invoke(Method.java:483)
[error] Caused by: java.lang.ClassNotFoundException: firrtl.options.StageError
[error] at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
[error] at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
[error] at java.security.AccessController.doPrivileged(Native Method)
[error] at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
[error] at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
[error] at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
[error] at hardfloat.FMATest$.main(tests.scala:62)
[error] at hardfloat.FMATest.main(tests.scala)
[error] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[error] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[error] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[error] at java.lang.reflect.Method.invoke(Method.java:483)
[error] stack trace is suppressed; run last Compile / bgRun for the full output
[error] Nonzero exit code: 1
[error] (Compile / run) Nonzero exit code: 1
[error] Total time: 1 s, completed Dec 17, 2019 2:05:42 PM
Makefile:324: recipe for target 'test-f16FromRecF16/ValExec_f16FromRecF16.v' failed
make: *** [test-f16FromRecF16/ValExec_f16FromRecF16.v] Error 1

chisel3.Driver.execute is deprecated

As of Chisel 3.2.2, Driver.execute is deprecated and will be removed in 3.4. There appear to be ~100 usages in src/main/scala/tests.scala as shown below:

[warn] /hardfloat/src/main/scala/tests.scala:62:32: method execute in object Driver is deprecated (since 3.2.2): Use chisel3.stage.ChiselStage.execute. This will be removed in 3.4.
[warn]                 chisel3.Driver.execute(testArgs, () => new ValExec_f16FromRecF16)
[warn]                                ^
[warn] /hardfloat/src/main/scala/tests.scala:62:25: object Driver in package chisel3 is deprecated (since 3.2.4): Please switch to chisel3.stage.ChiselStage. Driver will be removed in 3.4.
[warn]                 chisel3.Driver.execute(testArgs, () => new ValExec_f16FromRecF16)
[warn]                         ^
...

To try it out, an FPGA toolkit is needed, but where can one buy it?

Printing value

Hello,
I'd first like to take the opportunity to express my gratitude and respect to the author and everyone else working on the RISC-V ecosystem.
I was investigating the DivSqrtRecFN_small module and was interested in the number of cycles it uses, specifically the variable cycleNum. To take a closer look, I added a printf line there. It would not compile when I wrote it Scala-style:

printf(p"cycleNum= $cycleNum")

but that part went fine when written in C-style:

printf("cycleNum = %d", cycleNum)

However, the output is nowhere to be seen, even though this method was tried multiple times in other Chisel projects... Can you please help me?
Where did it go?
Or how to print to standard output?
Thanks,
Aleksandar

Is there any 32-bit version of the DIV/SQRT operation?

I want to implement the 'F' extension of the RISC-V ISA based on berkeley-hardfloat. Everything is great, except that I have not found a 32-bit implementation of DIV/SQRT yet. I'm wondering how to implement 32-bit DIV/SQRT based on this repo.

Potentially incorrect execution with negative zeros

Regarding:

berkeley-hardfloat/src/main/scala/MulAddRecFN.scala

Line 285 in 9deaf1d

(notNaN_addZeros && ! roundingMode_min &&

Problem:
In cases where the FMA unit is used for pure multiplication (by setting C = +0 in A*B + C), the sign of the output sometimes depends on the sign of C. It should not, because the intention is only to multiply, not to perform an FMA.

Proposed solution:
Add an input wire to MulAddRecFN that indicates whether an operation is a pure multiplication. In that case, the line linked above needs to change so that only the sign of (A*B) is propagated, without any influence from the sign of C.

If you think this is a decent fix, I can do a PR, otherwise, if you expect this condition needs to be handled externally (by a pipeline), then, I can put in a comment or update the doc somewhere.
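
The sign dependence described here follows from the IEEE 754 rule for adding two exact zeros. The Python sketch below states that rule and shows one way the condition could be handled externally, by giving the zero addend the sign of the product. The function names are hypothetical and this models only the sign rule, not the MulAddRecFN logic.

```python
import math


def zero_sum_sign(sign_prod: int, sign_c: int, round_min: bool) -> int:
    """Sign bit of (product + C) when both terms are exact zeros.

    IEEE 754: equal signs are kept; opposite signs give +0, except in
    round-toward-negative mode, where the result is -0.
    """
    if sign_prod == sign_c:
        return sign_prod
    return 1 if round_min else 0


def neutral_c_sign(sign_a: int, sign_b: int) -> int:
    """Sign for a zero C such that A*B + C keeps the sign of A*B in every mode."""
    return sign_a ^ sign_b


# With C = +0, a product of -0 loses its sign in round-to-nearest:
print(zero_sum_sign(1, 0, round_min=False))
# Python floats (round-to-nearest) show the same behaviour:
print(math.copysign(1.0, (-0.0 * 1.0) + 0.0))
# Choosing the sign of C from the operands preserves the product sign:
print(zero_sum_sign(1, neutral_c_sign(1, 0), round_min=False))
```

When the product is nonzero, adding a zero of either sign leaves the result's sign unchanged, so only the zero-product case needs this care.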

Do you plan to upload a document of the implementation details of this FPU?

Sorry to ask, but the Chisel3 source code is very hard to read. I want to rewrite your FPU implementation in Verilog/SystemVerilog for my project, but I find it really hard to understand your FPU algorithm.

Many thanks!

Recoded format

How does the recoded floating-point format map to the IEEE format? Is there any document that explains it?

sbt test fail

    at scala.sys.process.ProcessImpl$PipeThread.runloop(ProcessImpl.scala:170)
    at scala.sys.process.ProcessImpl$PipeSource.run(ProcessImpl.scala:188)

I/O error Pipe closed for process: [testfloat_gen, -rnear_maxMag, -tininessbefore, f16_add]
java.io.IOException: Pipe closed
at java.base/java.io.PipedInputStream.checkStateForReceive(PipedInputStream.java:260)
at java.base/java.io.PipedInputStream.awaitSpace(PipedInputStream.java:268)
at java.base/java.io.PipedInputStream.receive(PipedInputStream.java:231)
at java.base/java.io.PipedOutputStream.write(PipedOutputStream.java:149)
at scala.sys.process.BasicIO$.loop$1(BasicIO.scala:240)
at scala.sys.process.BasicIO$.transferFullyImpl(BasicIO.scala:246)
at scala.sys.process.BasicIO$.transferFully(BasicIO.scala:227)
at scala.sys.process.ProcessImpl$PipeThread.runloop(ProcessImpl.scala:170)
at scala.sys.process.ProcessImpl$PipeSource.run(ProcessImpl.scala:188)
I/O error Pipe closed for process: [testfloat_gen, -rminMag, -tininessbefore, -level2, f16_to_f32]
java.io.IOException: Pipe closed
at java.base/java.io.PipedInputStream.checkStateForReceive(PipedInputStream.java:260)
at java.base/java.io.PipedInputStream.awaitSpace(PipedInputStream.java:268)
at java.base/java.io.PipedInputStream.receive(PipedInputStream.java:231)

Recode format

How exactly is a floating-point number recoded? Is there any document or reference that describes the recoding, or explains why it is needed?

Division/sqrt algorithm

Which algorithm does hardfloat use for division and square root?

compilation error

Hi,
Does anyone know how to solve the error below? Thanks a lot!

[info] Reapplying settings...
[info] Set current project to chaos (in build file:/C:/Workspace/.../)
[IJ]sbt:project> compile
[info] Compiling 40 Scala sources to C:\Workspace...\scala-2.12\classes ...
[error] C:\Workspace...\src\main\scala\hardfload\primitives.scala:92:52: value asBools is not a member of Chisel.UInt
[error] def apply(in: UInt): UInt = PriorityEncoder(in.asBools.reverse)
[error] ^
[error] one error found
[error] (Compile / compileIncremental) Compilation failed
[error] Total time: 10 s, completed Mar 30, 2020, 6:00:00 PM
[IJ]sbt:project>

FSqrt returns two valid responses for one input.

TL;DR: I am sending one fsqrt request to hardfloat.DivSqrtRecF64. I get back one response the cycle after and then a second response back ~20 cycles later.

Waveform below.

I am using commit (f38b8be) which is what rocket-chip master uses (as well as the boom-devel branch of rocket-chip).

Questions:

Am I using DivSqrtRecF64 incorrectly?
Has this been fixed by more recent refactorings of DivSqrtRecF64?

Where is hardfloat code

I could not find the hardfloat code in there.
Could you please point out where the code is located?

how to use DivSqrtRecFN_small.scala

I want to perform division and calculate the square root for floating point numbers using the above class. Please can someone help me with the documentation? I am using chisel3. Thank you.

MulAddRecFN infinite loops in Chisel elaboration if called with the wrong inputs (expWidth, sigWidth)

There needs to be some enforcement/requirements on the inputs into the hardfloat units (e.g., MulAddRecFN) to catch infinite loops during Chisel elaboration before they occur.

In my particular use case, the order of the input parameters expWidth and sigWidth had been swapped, so I was unknowingly setting expWidth = 52, sigWidth = 12. I am directly instantiating rocket.FPUFMAPipe, which also swapped these parameters, so I never noticed the changes made to hardfloat (the changes to the recoding units were less problematic, since the function name also changed).

I believe the particular infinite loop occurs within MulAddRecFN_postMul, but it's not immediately obvious how.

Is Recoded Format more hardware-friendly for denormal numbers than IEEE754 format?

I'm trying to design my own FPU, but I find that supporting denormal numbers adds a huge delay to the hardware logic. So I'm wondering whether the recoded format can reduce some of that delay, since the biggest difference between the recoded format and IEEE 754 appears to be in the handling of denormal numbers. Could you please share some information, such as synthesis results? Thanks!

verilog simulator

It would be good to make Verilog testing independent of commercial tools such as Synopsys VCS. A potential solution would be to use Verilator instead.

"Recoded Format" table in README.md is incorrect

The Recoded Format table shows the mantissas as hex digits, with "f", but should be binary with only 0 or 1 (should be 23 bits, not 23 hex digits).

Also, it would be nice to show "maximum subnormal" (IEEE s 00000000 11111111111111111111111) and "minimum normal" (IEEE s 00000001 00000000000000000000000) in the table. The other interesting cases (zero, minimum subnormal, maximum normal, infinity, NaN) are shown.

Why RawModule instead of Module?

When I compile riscv-boom with the latest hardfloat I have the following problem:

[error] (run-main-0) java.lang.reflect.InvocationTargetException
[error] java.lang.reflect.InvocationTargetException
[error] at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
[error] at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
[error] at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
[error] at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
[error] at freechips.rocketchip.util.HasGeneratorUtilities.$anonfun$elaborate$1(GeneratorUtils.scala:55)
[error] at chisel3.Module$.do_apply(Module.scala:52)
[error] at chisel3.Driver$.$anonfun$elaborate$1(Driver.scala:93)
[error] at chisel3.internal.Builder$.$anonfun$build$2(Builder.scala:406)
[error] at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
[error] at chisel3.internal.Builder$.$anonfun$build$1(Builder.scala:404)
[error] at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
[error] at chisel3.internal.Builder$.build(Builder.scala:404)
[error] at chisel3.Driver$.elaborate(Driver.scala:93)
[error] at freechips.rocketchip.util.HasGeneratorUtilities.elaborate(GeneratorUtils.scala:60)
[error] Caused by: chisel3.internal.ChiselException: Error: No implicit clock.
[error] at chisel3.internal.throwException$.apply(Error.scala:85)
[error] at chisel3.internal.Builder$.$anonfun$forcedClock$1(Builder.scala:319)
[error] at scala.Option.getOrElse(Option.scala:189)
[error] at chisel3.internal.Builder$.forcedClock(Builder.scala:319)
[error] at chisel3.RegInit$.apply(Reg.scala:174)
[error] at chisel3.RegInit$.apply(Reg.scala:192)
[error] at Chisel.package$Reg$.apply(compatibility.scala:378)
[error] at hardfloat.DivSqrtRecF64ToRaw_mulAddZ31.<init>(DivSqrtRecF64_mulAddZ31.scala:85)
[error] at hardfloat.DivSqrtRecF64_mulAddZ31.$anonfun$divSqrtRecF64ToRaw$1(DivSqrtRecF64_mulAddZ31.scala:750)
[error] at chisel3.Module$.do_apply(Module.scala:52)
[error] at hardfloat.DivSqrtRecF64_mulAddZ31.<init>(DivSqrtRecF64_mulAddZ31.scala:750)
[error] at hardfloat.DivSqrtRecF64.$anonfun$ds$1(DivSqrtRecF64.scala:60)
[error] at chisel3.Module$.do_apply(Module.scala:52)
[error] at hardfloat.DivSqrtRecF64.<init>(DivSqrtRecF64.scala:60)
[error] Nonzero exit code: 1

Please explain why the last correction is needed and give advice on how and where it is better to fix it.