Comments (6)
Hi,
Did you try the vanilla GenFullNoMmuMaxPerf config?
In which test environment did you run your version?
from vexriscv.
Hi,
I removed the TCM and restored the cache size to 8 KB for each of the I and D buses, then rebuilt:
/home/datakey/tools/riscv64-unknown-elf-gcc-2018/bin/riscv64-unknown-elf-gcc -fno-inline -fno-common -O3 -DPREALLOCATE=1 -DHOST_DEBUG=0 -DMSC_CLOCK -march=rv32im -mabi=ilp32 -g -O3 -fno-inline -MD -fstrict-volatile-bitfields -o build/dhrystone.elf build/src/main.o build/src/dhry_1.o build/src/dhry_2.o build/src/crt.o build/src/stdlib.o -lc -lc -march=rv32im -mabi=ilp32 -nostdlib -lgcc -mcmodel=medany -nostartfiles -ffreestanding -Wl,-Bstatic,-T,../libs/linkerAllInSramForSim.ld,-Map,build/dhrystone.map,--print-memory-usage
Memory region         Used Size  Region Size  %age Used
       onChipRam:       26992 B        32 KB     82.37%
           sdram:          0 GB        64 MB      0.00%
After downloading the bitstream to the FPGA and running the program in release mode,
the result is shown below:
Dhrystone Benchmark, Version 2.1 (Language: C)
Program compiled without 'register' attribute
Please give the number of runs through the benchmark:
Execution starts, 500 runs through Dhrystone
Execution ends
Final values of the variables used in the benchmark:
Int_Glob: 5
should be: 5
Bool_Glob: 1
should be: 1
Ch_1_Glob: A
should be: A
Ch_2_Glob: B
should be: B
Arr_1_Glob[8]: 7
should be: 7
Arr_2_Glob[8][7]: 510
should be: Number_Of_Runs + 10
Ptr_Glob->
Ptr_Comp: -2147459732
should be: (implementation-dependent)
Discr: 0
should be: 0
Enum_Comp: 2
should be: 2
Int_Comp: 17
should be: 17
Str_Comp: DHRYSTONE PROGRAM, SOME STRING
should be: DHRYSTONE PROGRAM, SOME STRING
Next_Ptr_Glob->
Ptr_Comp: -2147459732
should be: (implementation-dependent), same as above
Discr: 0
should be: 0
Enum_Comp: 1
should be: 1
Int_Comp: 18
should be: 18
Str_Comp: DHRYSTONE PROGRAM, SOME STRING
should be: DHRYSTONE PROGRAM, SOME STRING
Int_1_Loc: 5
should be: 5
Int_2_Loc: 13
should be: 13
Int_3_Loc: 7
should be: 7
Enum_Loc: 1
should be: 1
Str_1_Loc: DHRYSTONE PROGRAM, 1'ST STRING
should be: DHRYSTONE PROGRAM, 1'ST STRING
Str_2_Loc: DHRYSTONE PROGRAM, 2'ND STRING
should be: DHRYSTONE PROGRAM, 2'ND STRING
Clock cycles=213512
DMIPS per Mhz: 1.33
The benchmark result is 1.33 DMIPS/MHz.
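One way to cross-check the reported figure is to recompute it from the run count and the cycle count in the log. The sketch below assumes the usual Dhrystone convention that 1 DMIPS corresponds to 1757 Dhrystones per second (the VAX 11/780 reference); the function name `dmips_per_mhz` is only illustrative and is not taken from the benchmark sources.

```python
# Dhrystones per second achieved by the VAX 11/780 reference machine (1 DMIPS).
VAX_DHRYSTONES_PER_SECOND = 1757


def dmips_per_mhz(runs, clock_cycles):
    """Convert a Dhrystone run count and a cycle count into DMIPS/MHz."""
    # At 1 MHz the core executes 1_000_000 cycles per second, so this is the
    # number of Dhrystone iterations per second per MHz of clock.
    dhrystones_per_second_per_mhz = runs * 1_000_000 / clock_cycles
    return dhrystones_per_second_per_mhz / VAX_DHRYSTONES_PER_SECOND


# Values taken from the log above: 500 runs in 213512 clock cycles.
result = dmips_per_mhz(runs=500, clock_cycles=213512)
print(f"{result:.2f} DMIPS/MHz")
```

With the numbers from the log this reproduces 1.33, so the printed figure is at least consistent with the measured cycle count. Whether the cycle counter itself is correct has to be checked separately.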
This is better than the result with TCM, which does not make sense to me.
Do you have any idea how I can verify it?
Thanks.
from vexriscv.
Hi,
I looked at the code, and I think I found the reason:
Basically, the data cache has the advantage that writes are delayed until the writeback stage, while the tightly coupled dbus has the penalty that writes are scheduled early (execute stage), so it has to ensure that there is no risk of them being unscheduled by a branch, an exception, or anything else.
So the tightly coupled dbus will sometimes have to wait for the pipeline to empty itself (when doing a store).
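The effect described above can be illustrated with a deliberately crude cycle count. This is a toy model, not the VexRiscv pipeline: it assumes an ideal one-instruction-per-cycle core and charges a fixed, hypothetical number of extra cycles for every store on the tightly coupled bus. The trace, the function name `estimate_cycles`, and the penalty of 2 cycles are all assumptions chosen for illustration.

```python
def estimate_cycles(instructions, store_stall_cycles):
    """Count cycles for an ideal 1-instruction-per-cycle pipeline, adding
    `store_stall_cycles` extra cycles for every store in the trace."""
    cycles = 0
    for op in instructions:
        cycles += 1  # one issue slot per instruction
        if op == "store":
            # Toy assumption: a store issued early must wait for the
            # later pipeline stages to drain before it can proceed.
            cycles += store_stall_cycles
    return cycles


# Hypothetical instruction mix, not taken from Dhrystone.
trace = ["alu", "load", "store", "alu", "branch", "store", "alu", "alu"]

cached = estimate_cycles(trace, store_stall_cycles=0)  # write done at writeback
tcm = estimate_cycles(trace, store_stall_cycles=2)     # write done at execute

print(cached, tcm)
```

Even with a small per-store penalty, a store-heavy benchmark can end up slower on the tightly coupled bus than on the cached one in this model, which matches the direction of the measured difference.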
from vexriscv.
Hi,
Thanks for the reply.
I got it.
from vexriscv.
Hi, @Dolu1990
May I ask one more question?
First, I changed the configuration for DivPlugin:
//new DivPlugin,
new MulDivIterativePlugin(genMul = false, genDiv = true, mulUnrollFactor = 1, divUnrollFactor = 2, dhrystoneOpt=true),
The benchmark result improves as follows:
1.33 DMIPS/MHz (8 KB cache IBUS, 8 KB cache DBUS) ->
1.38 DMIPS/MHz (8 KB cache IBUS, 8 KB cache DBUS, divUnrollFactor = 2) ->
1.44 DMIPS/MHz (8 KB cache IBUS, 8 KB cache DBUS, divUnrollFactor = 2, dhrystoneOpt=true)
Is setting dhrystoneOpt=true really helpful in real operation?
Second, when I set genMul = true and mulUnrollFactor=2 to replace MulPlugin,
//new MulPlugin,
//new DivPlugin,
new MulDivIterativePlugin(genMul = true, genDiv = true, mulUnrollFactor = 2, divUnrollFactor = 2, dhrystoneOpt=true),
the benchmark result decreases to 1.33 DMIPS/MHz.
Although genMul = true in MulDivIterativePlugin can replace MulPlugin,
the performance is lower than with MulPlugin.
Is that right?
Thanks
from vexriscv.
> Is setting dhrystoneOpt=true really helpful in real operation?
I would say it is not really useful, as it only works for very small division operands.
> But performance is lower than MulPlugin.
Yes, at least in practice on FPGAs.
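To see why the unroll factor and a small-operand shortcut change the cycle count, here is a rough software model of an iterative divider. It is a sketch under stated assumptions, not the MulDivIterativePlugin implementation: it uses plain unsigned restoring division, ignores any setup or sign-handling cycles, and models the shortcut with a hypothetical `small_operand_bits` parameter whose value of 4 is chosen only for illustration.

```python
def iterative_divide(dividend, divisor, unroll=1, width=32, small_operand_bits=None):
    """Unsigned restoring division producing `unroll` quotient bits per cycle.

    Returns (quotient, remainder, cycles). `small_operand_bits` is a
    hypothetical shortcut: if both operands fit in that many bits, only
    that many quotient bits are computed.
    """
    if divisor == 0:
        # RISC-V defines a result for division by zero; not modeled here.
        raise ZeroDivisionError("division by zero is not modeled")

    bits = width
    if small_operand_bits is not None and max(dividend, divisor) < (1 << small_operand_bits):
        bits = small_operand_bits  # upper dividend bits are zero, skip them

    quotient, remainder = 0, 0
    for i in reversed(range(bits)):
        # Shift in the next dividend bit, then try to subtract the divisor.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1

    cycles = -(-bits // unroll)  # ceiling division: `unroll` bits per cycle
    return quotient, remainder, cycles


print(iterative_divide(100, 7))                                  # full width
print(iterative_divide(100, 7, unroll=2))                        # half the cycles
print(iterative_divide(13, 3, unroll=2, small_operand_bits=4))   # shortcut applies
print(iterative_divide(100, 7, unroll=2, small_operand_bits=4))  # shortcut does not apply
```

In this model the shortcut only helps when both operands are tiny, which is consistent with the remark that it mostly benefits the benchmark rather than general code.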
from vexriscv.