Comments (345)
I have the userspace segfault issue seemingly fixed!
The problem was that the mapping code in the kernel was always mapping pages as RWX. But the kernel relies on pages being mapped read-only and triggering a fault on writes (e.g. for copy-on-write optimisations). Fixing that, and hacking the DBusCached plugin so that all write faults trigger a store page fault exception (the store fault exception was going to M-mode and causing problems; I need to look into the correct behaviour here), seems to result in a reliable userspace.
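As a rough illustration of what the kernel expects here, the copy-on-write path taken on a store fault can be modeled in C. This is a toy sketch: the PTE layout, flag names, and `handle_store_fault` are hypothetical, not the kernel's actual structures.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PTE_V   (1u << 0)  /* valid */
#define PTE_W   (1u << 2)  /* hardware-writable */
#define PTE_COW (1u << 8)  /* software bit: page is copy-on-write */

/* Toy PTE: a page pointer plus flag bits. */
typedef struct { uint8_t *page; uint32_t flags; } pte_t;

/* On a store fault: if the page is a read-only COW page, duplicate it
 * and remap it writable; otherwise it is a genuine protection fault. */
int handle_store_fault(pte_t *pte, size_t page_size) {
    if (!(pte->flags & PTE_V)) return -1;          /* not mapped */
    if (!(pte->flags & PTE_COW)) return -1;        /* real fault */
    uint8_t *copy = malloc(page_size);
    memcpy(copy, pte->page, page_size);
    pte->page = copy;                               /* private copy */
    pte->flags = (pte->flags & ~PTE_COW) | PTE_W;   /* now writable */
    return 0;                                       /* retry the store */
}
```

The bug described above is visible in this model: if pages are unconditionally mapped RWX, the store never faults, this handler never runs, and two processes keep silently sharing the same page.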
from vexriscv.
liteeth is working too! Although the combination of lack of caching and expensive context switches means this takes the best part of a minute...
from vexriscv.
Hardware-refilled MMU + cacheless iBus/dBus plugin design is done; I will test it and keep you updated as soon as there is something stable enough.
from vexriscv.
Tested PLICs for 8, 12, 16 and 32 interrupts with close to minimal features.
It takes about 10 LC per interrupt. I estimate the cost of the ExternalInterruptArrayPlugin at about 4 LC per interrupt.
PlicBench8  -> iCE40 -> 110 MHz, 81 LC
PlicBench12 -> iCE40 -> 100 MHz, 118 LC
PlicBench16 -> iCE40 -> 92 MHz, 156 LC
PlicBench32 -> iCE40 -> 70 MHz, 272 LC
But with all features enabled, it is about 16 LC per interrupt XD
I have to say that the implementation I used was made to be fast on multi-bit priority configs, not to be small on single-bit priority, so it can probably be made smaller.
https://github.com/SpinalHDL/VexRiscv/blob/linux/src/test/scala/vexriscv/experimental/PlicCost.scala
from vexriscv.
@roman3017 Yes, it looks right to me; the cpu0.yaml should be the one generated by the SpinalHDL generation.
Then, it will run this simulation workspace :
https://github.com/SpinalHDL/VexRiscv/blob/master/src/test/cpp/regression/main.cpp#L2697
But by default the golden model is disabled, and I don't think you can use it at the same time as the JTAG; it will think the CPU is doing crazy things.
So, what is required :
- calling w.withRiscvRef() before the w.run
- loading the required binaries, a bit like https://github.com/SpinalHDL/VexRiscv/blob/master/src/test/cpp/regression/main.cpp#L2700 does
- Completing the golden model to support the implemented features (MMU, CSR, supervisor timer)
@enjoy-digital @roman3017 @kgugala @daveshah1
I think we need to consolidate things, trying to have a platform-independent minimal port / a robust foundation :D
So these are the things I'm considering for the hardware side:
- Implementing another MMU as the RISC-V spec requires, as a self-refilled one, and solving the memory coherency it would have against the data cache by sharing the data cache between the CPU load/store and the MMU. I have to investigate a bit, but it finally looks like it could be done without noticeable synthesis area/speed compromises.
- Reworking the cache to be a write-through one, removing the write buffer, and also removing the write-to-read bypass muxes to reduce area for a minimal IPC impact and probably a better FMax, while supporting multi-way data caches
Refilling the MMU from the data cache would greatly reduce the need for internal TLB caches, which are quite heavy on FPGA, and also improve performance quite a lot.
So quite a bit of redesign, but I think it's worth it. Does it sound sensible? Would it compromise your progress much?
from vexriscv.
Yes, I think an M-mode trap handler is the proper solution. We can probably use it to deal with any missing atomic instructions too.
from vexriscv.
The Linux repo is: https://github.com/daveshah1/litex-linux-riscv
To build:
cp litex_default_configuration .config
ARCH=riscv CROSS_COMPILE=riscv32-unknown-linux-gnu- make -j`nproc`
riscv32-unknown-linux-gnu-objcopy -O binary vmlinux vmlinux.bin
Defconfig: https://github.com/daveshah1/litex-linux-riscv/blob/master/litex_default_configuration
Device tree: https://github.com/daveshah1/litex-linux-riscv/blob/master/rv32.dts
The arch code is in https://github.com/daveshah1/litex-linux-riscv/tree/master/arch/riscv
In particular:
- TLB reloads: https://github.com/daveshah1/litex-linux-riscv/blob/master/arch/riscv/include/asm/pgtable.h#L276
- TLB flush: https://github.com/daveshah1/litex-linux-riscv/blob/master/arch/riscv/include/asm/tlbflush.h
- Fault handler: https://github.com/daveshah1/litex-linux-riscv/blob/master/arch/riscv/mm/fault.c
- Startup & vectors: https://github.com/daveshah1/litex-linux-riscv/blob/master/arch/riscv/kernel/entry.S
from vexriscv.
@roman3017 So, I have to say, I merged and adapted the changes made by daveshah1 and kgugala by hand, as the head of the repo was quite different. Things will not work out of the box; anyway, they removed many bugs/spec mismatches :)
@daveshah1 and @kgugala About the Linux requirements: which part of the RISC-V Atomic extension is used? Only LR and SC? Is that right?
from vexriscv.
I would personally prefer a RISC-V compatible interrupt controller rather than the one we have already, if it isn't too large or complex.
A small stub will be needed in M-mode to deal with SBI (e.g. setting timers), forwarding timer interrupts, atomic emulation, etc. Perhaps this could be part of the LiteX BIOS.
from vexriscv.
About atomics, there is some support in VexRiscv to provide LR/SC in a local way; it only works for single-CPU systems.
from vexriscv.
Yeah, "dummy" implementations that work on single CPU systems should be perfectly fine.
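A minimal C model of such a single-hart "dummy" LR/SC: one reservation, killed by any intervening store. This is a hedged sketch; the names and the single-reservation policy are assumptions, not VexRiscv's actual implementation.

```c
#include <stdint.h>

/* At most one outstanding reservation on a single-hart system. */
static volatile uint32_t *reserved_addr = 0;

/* lr.w: load and take the reservation. */
uint32_t lr_w(volatile uint32_t *addr) {
    reserved_addr = addr;
    return *addr;
}

/* sc.w: returns 0 on success, nonzero on failure (RISC-V convention).
 * Succeeds only if the reservation is still held on this address. */
uint32_t sc_w(volatile uint32_t *addr, uint32_t value) {
    if (reserved_addr != addr) return 1;
    reserved_addr = 0;
    *addr = value;
    return 0;
}

/* Any ordinary store conservatively kills the reservation, which the
 * spec permits and which keeps the hardware cheap. */
void plain_store(volatile uint32_t *addr, uint32_t value) {
    *addr = value;
    reserved_addr = 0;
}
```

On a single hart this is enough for Linux's atomics, since nothing can interleave between `lr_w` and `sc_w` except another store, which correctly fails the `sc`.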
from vexriscv.
As discussed at Free Silicon Conference together with @Dolu1990 , we are also working on it here:
enjoy-digital/litex#134.
We can continue the discussion here for the CPU aspect. @daveshah1: I saw you made some progress;
just for info, @Dolu1990 is OK to help getting things working. So if you see strange things or need help on things related to Spinal/VexRiscv, you can discuss your findings here.
from vexriscv.
My current status is that I have made quite a few hacks to the kernel, vexriscv and LiteX, but I'm still only just getting into userspace and not anywhere useful yet.
VexRiscv: https://github.com/daveshah1/VexRiscv/tree/Supervisor
Build config: https://github.com/daveshah1/VexRiscv-verilog/tree/linux
LiteX: https://github.com/daveshah1/litex/tree/vexriscv-linux
kernel: https://github.com/daveshah1/litex-linux-riscv
@Dolu1990 I would be interested if you could look at 818f1f6 - loads were always reading 0xffffffff from virtual memory addresses when bit 10 of the offset (0x400) was set. This seems to fix it, but I'm not sure if a better fix is possible
As it stands, the current issue is a kernel panic "Oops - environment call from S-mode" shortly after init
starts. It seems after a few syscalls it either isn't returning properly to userspace, or a spurious ECALL is accidentally triggered while in S-mode (it might be the ECALL getting "stuck" somewhere and lurking, so what should be an IRQ triggers the ECALL instead).
from vexriscv.
Hi @daveshah1 @enjoy-digital :D
So, for sure we will hit bugs in VexRiscv, as only the machine mode was properly tested.
Things not tested enough in VexRiscv which could have bugs:
- Supervisor / User mode
- MMU
I think the best would be to set up a minimal test environment to run Linux on. It would save us a lot of time and sanity, especially for a Linux port project :D
So, to distinguish hardware bugs from software bugs, my proposal is that I set up a minimalistic environment where only the VexRiscv CPU is simulated and compared against an instruction-synchronised software model of the CPU (I already have one which does that, but CSRs are missing from it).
This would point out exactly where the hardware diverges from what it should do, and bring serenity to the development ^.^
Does that sound good to you?
from vexriscv.
That sounds very sensible! The minimal peripheral requirement is low: just a timer (right now I have the LiteX timer connected to the timerInterruptS pin, and hacked the kernel to talk directly to that rather than via the proper SBI route for setting up a timer) and a UART of some kind.
My only concern with this is speed; right now it takes about 30s on hardware at 75MHz to get to the point of failure. So I definitely want to use Verilator and not iverilog...
from vexriscv.
I can easily set up a Verilator simulation. But 30s on hardware at 75MHz will still be a bit slow: we can expect about 1MHz execution speed, so that's still around 40 minutes...
from vexriscv.
I did just manage to make a bit of progress on hardware (perhaps this talk of simulators is scaring it into behaving :P)
It does reach userspace successfully, so we can almost say Linux is working. If I set /bin/sh as init, then I can even use shell builtins - being able to run echo hello world counts as Linux, right? (but calls to other programs don't seem to work). init itself is segfaulting deep within libc, so there's still something fishy, but it could just be a dodgy rootfs.
from vexriscv.
@daveshah1 this is great. The libc segfault happened also in our Renode (https://github.com/renode/renode) emulation. Can you share the rootfs you're using?
from vexriscv.
This is the initramdisk from antmicro/litex-linux-readme with a small change to inittab to remove some references to files that don't exist
In terms of other outstanding issues, I also had to patch VexRiscv so that interrupts are routed to S-mode rather than M-mode. This broke the LiteX BIOS which expects M-mode interrupts, so I had to patch that to not expect interrupts at all, but that means there is now no useful UART output from the BIOS. I think a proper solution would be to select interrupt privilege dynamically somehow.
from vexriscv.
We had to fix/workaround irq delegates. I think this code should be in our repo, but I'll check that again.
from vexriscv.
The segfault I see is:
[ 53.060000] getty[45]: unhandled signal 11 code 0x1 at 0x00000004 in libc-2.26.so[5016f000+148000]
[ 53.070000] CPU: 0 PID: 45 Comm: getty Not tainted 4.19.0-rc4-gb367bd23-dirty #105
[ 53.080000] sepc: 501e2730 ra : 501e2e1c sp : 9f9b2c60
[ 53.080000] gp : 00120800 tp : 500223a0 t0 : 5001e960
[ 53.090000] t1 : 00000000 t2 : ffffffff s0 : 00000000
[ 53.090000] s1 : 00000000 a0 : 00000000 a1 : 502ba624
[ 53.100000] a2 : 00000000 a3 : 00000000 a4 : 000003ef
[ 53.100000] a5 : 00000160 a6 : 00000000 a7 : 0000270f
[ 53.110000] s2 : 502ba5f4 s3 : 00000000 s4 : 00000150
[ 53.110000] s5 : 00000014 s6 : 502ba628 s7 : 502bb714
[ 53.120000] s8 : 00000020 s9 : 00000000 s10: 000003ef
[ 53.120000] s11: 00000000 t3 : 00000008 t4 : 00000000
[ 53.130000] t5 : 00000000 t6 : 502ba090
[ 53.130000] sstatus: 00000020 sbadaddr: 00000004 scause: 0000000d
The bad address (0x73730 in libc-2.26.so) seems to be in _IO_str_seekoff. Given sbadaddr = 0x00000004 and s1 = 0 in the dump above, the faulting lw a5,4(s1) is a load from NULL+4, i.e. a NULL pointer dereference. The disassembly around it is:
73700: 00080c93 mv s9,a6
73704: 00048a13 mv s4,s1
73708: 000e0c13 mv s8,t3
7370c: 000d8993 mv s3,s11
73710: 010a0793 addi a5,s4,16
73714: 00000d93 li s11,0
73718: 00000e93 li t4,0
7371c: 00800e13 li t3,8
73720: 3ef00d13 li s10,1007
73724: 02f12223 sw a5,36(sp)
73728: 04092483 lw s1,64(s2)
7372c: 71648463 beq s1,s6,73e34 <_IO_str_seekoff@@GLIBC_2.26+0x41bc>
73730: 0044a783 lw a5,4(s1)
from vexriscv.
I checked the code, and it looks like all has been pushed to github.
As for the segfault: note that we had to re-implement the mapping code in Linux, plus there are some hacks in the Vex MMU itself. This could be the reason for the segfault, as user space starts using virtual memory very extensively.
For example, the whole kernel memory space is mapped directly and we bypass the MMU translation maps, see:
https://github.com/antmicro/VexRiscv/blob/97d04a5243bbfee9d1dfe56857f3490da9fe1091/src/main/scala/vexriscv/plugin/MemoryTranslatorPlugin.scala#L116
the kernel range is defined in MMU plugin instance: https://github.com/antmicro/VexRiscv/blob/97d04a5243bbfee9d1dfe56857f3490da9fe1091/src/main/scala/vexriscv/TestsWorkspace.scala#L98
I'm pretty sure there are many bugs hidden there :)
from vexriscv.
Ok, I will think about the best way to set up that test environment with the synchronised software golden model (to get max speed).
About the golden model, I will complete it (the MMU part). I can do the CSRs too, but probably the best would be for somebody other than me to cross-check my interpretation of the privileged spec, because if both the hardware and the software golden model implement the same wrong interpretation, that's not so helpful ^^.
from vexriscv.
@enjoy-digital
Maybe we can keep the actual regression test environment of VexRiscv, and just complete it with the required stuff.
It's a bit dirty, but it should be fine.
https://github.com/SpinalHDL/VexRiscv/blob/master/src/test/cpp/regression/main.cpp
The golden model is currently there
https://github.com/SpinalHDL/VexRiscv/blob/master/src/test/cpp/regression/main.cpp#L193
from vexriscv.
@Dolu1990: in fact I already have the Verilator simulation working fine; I just need to improve it a little bit to load the vmlinux.bin/vmlinux.dtb and initramdisk into RAM more easily. But yes, we'll use whatever is more convenient for you. I'll look at your regression env and your golden model.
from vexriscv.
@enjoy-digital Can you show me the Verilator testbench sources? :D
from vexriscv.
@kgugala Which CPU configuration are you using? Can you show me? (The test workspace you pointed to isn't using caches or the MMU.)
from vexriscv.
The config I am using is at https://github.com/daveshah1/VexRiscv-verilog/blob/linux/src/main/scala/vexriscv/GenCoreDefault.scala (which has a few small tweaks compared to @kgugala's, to skip over FENCEs for example).
from vexriscv.
@enjoy-digital The checks between the golden model and the RTL are:
- Register file writes
- Peripheral accesses
- Some liveness checks
It should be enough to find divergences fast.
@daveshah1 Jumping over FENCE instructions is probably fine for the moment, but jumping over FENCE.I isn't: there is no cache coherency between the instruction cache and the data cache.
You need to use the cache flushes :) Is that used in some way?
from vexriscv.
(Memory coherency issues are something which is automatically caught by the golden model / RTL cross-checks.)
from vexriscv.
As it stands it looks like all the memory has been set up as IO, which I suspect means the L1 caches won't be used at all - I think LiteX provides a single L2 cache.
Indeed, to get useful performance proper use of caches and cache flushes will be needed.
from vexriscv.
yes, we disabled the caches as they were causing a lot of trouble. It didn't make sense to fight both the MMU and the caches at the same time
from vexriscv.
@daveshah1 Ok ^^ One thing to know is that the instruction cache does not support IO instruction fetch; instead it caches those accesses. (Supporting IO instruction fetch costs area, and isn't really a useful thing, as far as I know?)
So you still need to flush the instruction cache in FENCE.I. It could be done easily.
@kgugala The cacheless plugins aren't aware of the MMU.
I perfectly understand your point about avoiding the trouble of both at once. So my proposal is:
- I port MMU support to the cacheless instruction and data plugins
- We test things on that cacheless configuration
- Later, when things are stable enough, we can introduce the cache stuff via a proper machine mode FENCE.I emulation
So the roadmap would be:
- Port MMU support into the cacheless plugins
- Implement the cross-checked test environment
- Test and fix stuff until it is stable enough
- Introduce the caches in the loop with proper machine mode emulation
from vexriscv.
TBH the real long term solution will be to reimplement the MMU so it is fully compliant with the spec. Then we can get rid of the custom mapping code in Linux and restore the original mainline memory mapping code used for RV64.
I'm aware this will require a quite significant amount of work in Vex itself.
from vexriscv.
I don't think it would require that much work; an MMU is a relatively easy piece of hardware.
I have to think about the heaviness, in terms of FPGA area, of a fully compliant MMU.
But what is the issue with a software-refilled MMU? If it uses machine mode to do it, it becomes transparent to the Linux kernel, right? So no Linux kernel modification is required, just a piece of machine mode code in addition to the raw Linux port :) ?
from vexriscv.
(troll on)
We should not forget the ultimate goal: RISC-V Linux on an iCE40 1K, I'm sure #28 would agree ^.^
(troll off)
from vexriscv.
It just may be difficult to push the custom mapping code into the Linux mainline.
from vexriscv.
The trap handler need not sit in Linux at all, it can be part of the bootloader.
from vexriscv.
@kgugala By mapping, do you mean the different flags of each MMU TLB entry of VexRiscv (https://github.com/SpinalHDL/VexRiscv/blob/master/src/main/scala/vexriscv/plugin/MemoryTranslatorPlugin.scala#L51)? If the given features aren't enough, I'm happy to fix that in the first place
from vexriscv.
@daveshah1 yes, it can. But that makes things even more complicated, as two pieces of software will have to be maintained.
@Dolu1990 the flags were sufficient. One of the missing parts is variable map size. AFAIK right now you can only map 4k pages. This made mapping the whole kernel space impossible - the MMU's map table is too small to fit so many 4k entries. This is the reason we added this constant kernel space mapping hack. Also, in user space, there are many mappings for different contexts. Those mappings are switched very often, so rewriting them every time, with 2 custom instructions for every 4k page, is very slow.
We haven't properly tested whether the reloading is done properly, and whether the mappings are refreshed correctly in the MMU itself. This, IMO, is the reason for the segfault we're seeing in user space.
from vexriscv.
@kgugala the initial idea to handle pages bigger than 4KB was to just translate them on demand into 4KB ones in the TLB.
For example:
An access at virtual address 0x12345678, via a 16 MB page which maps 0x12xxxxxx to 0xABxxxxxx, results in
the software emulation adding to the TLB cache a 4KB entry which maps 0x12345xxx to 0xAB345xxx.
But now that I think about it, maybe support for 16MB pages can be added with very little hardware on top of the existing solution.
The software model should also be able to indirectly pick up MMU translation errors :)
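The address arithmetic of that on-demand split is just masking: keep the large page's base bits, carry over the offset, then truncate to a 4KB page. A small C sketch (masks, the struct, and `split_to_4k` are illustrative names only):

```c
#include <stdint.h>

#define PAGE_4K_MASK  0xFFFFF000u  /* keep bits [31:12] */
#define PAGE_16M_MASK 0xFF000000u  /* keep bits [31:24] */

typedef struct { uint32_t vpage, ppage; } tlb4k_t;

/* Given a faulting virtual address and a 16 MB mapping
 * (vbase_16m -> pbase_16m), synthesize the 4 KB TLB entry that
 * covers just the accessed page. */
tlb4k_t split_to_4k(uint32_t vaddr, uint32_t vbase_16m, uint32_t pbase_16m) {
    uint32_t offset_in_big = vaddr & ~PAGE_16M_MASK;  /* low 24 bits */
    tlb4k_t e;
    e.vpage = (vbase_16m | offset_in_big) & PAGE_4K_MASK;
    e.ppage = (pbase_16m | offset_in_big) & PAGE_4K_MASK;
    return e;
}
```

With the comment's example (access 0x12345678 through a 16 MB page mapping 0x12xxxxxx to 0xABxxxxxx), this yields the 4KB entry 0x12345xxx -> 0xAB345xxx.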
from vexriscv.
@Dolu1990: the simulation source is here:
https://github.com/enjoy-digital/litex/blob/master/litex/utils/litex_sim.py
and
https://github.com/enjoy-digital/litex/tree/master/litex/build/sim
With a vmlinux.bin that has the .dtb appended, we can run Linux on mor1kx with:
litex_sim --cpu-type=or1k --ram-init=vmlinux.bin
For now, for VexRiscv, I was hacking the RAM initialization function to aggregate the vmlinux.bin, vmlinux.dtb and initramdisk.gz, but I'm thinking about using a .json file to describe how the RAM needs to be initialized:
{
"vmlinux.bin": 0x00000000,
"vmlinux.dtb": 0x01000000,
"initramdisk.gz": 0x01002000,
}
and then just do:
litex_sim --cpu-type=vexriscv --ram-init=ram_init_linux.json
from vexriscv.
The software right now maps the pages on demand.
@Dolu1990 The problem is that kernel space has to be mapped the whole time. The whole kernel runs in S-mode in virtual memory. This space cannot be unmapped, because any interrupt/exception (including a TLB miss) may happen at any time. We cannot end up in a situation where a TLB miss causes a jump to a handler which is not mapped at the moment, causing another TLB miss. This would end up in a terrible miss->handler->miss loop
from vexriscv.
@enjoy-digital Ahh, ok, so it is a SoC-level simulation. I think the best would really be to stick to a raw CPU simulation in Verilator, to keep full control over the CPU, keep its raw nature, and keep simulation performance as high as possible to reduce sim time.
@kgugala This is the purpose of machine mode emulation. Basically, in machine mode, the MMU translation is off, and the CPU can do all sorts of things without the supervisor mode even being able to notice.
Here is the schedule of a user space TLB miss:
- User space TLB miss
- It triggers a machine mode exception
- The machine mode MMU software refiller checks the page table in main memory
- If a matching entry exists, it refills the hardware MMU and returns to user mode without the supervisor even knowing
- If there was no entry mapping the required access, it emulates a supervisor exception and returns execution to the supervisor.
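Under those assumptions, the M-mode refiller amounts to a two-level Sv32 page-table walk. A toy C sketch of that schedule: a flat fake memory and a captured-TLB stub stand in for the real physical bus and TLB port, and all names here are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Sv32 PTE bits from the RISC-V privileged spec. */
#define PTE_V 0x001u
#define PTE_R 0x002u
#define PTE_X 0x008u

/* Toy physical memory (64 KB) standing in for the real bus. */
static uint32_t mem[16384];
static uint32_t phys_read32(uint32_t paddr) { return mem[paddr / 4]; }

/* Captured result of the refill, standing in for the hardware TLB. */
static uint32_t tlb_vpage, tlb_ppage, tlb_flags;
static void tlb_insert(uint32_t v, uint32_t p, uint32_t f) {
    tlb_vpage = v; tlb_ppage = p; tlb_flags = f;
}

/* Walk the two-level Sv32 table rooted at root_ppn for vaddr.
 * Returns 0 after refilling the TLB on a hit; -1 when a supervisor
 * page fault must be emulated instead (the last step of the schedule). */
int mmode_tlb_refill(uint32_t root_ppn, uint32_t vaddr) {
    uint32_t vpn1 = (vaddr >> 22) & 0x3FF;
    uint32_t vpn0 = (vaddr >> 12) & 0x3FF;
    uint32_t pte = phys_read32((root_ppn << 12) + vpn1 * 4);
    if (!(pte & PTE_V)) return -1;
    if (!(pte & (PTE_R | PTE_X))) {          /* pointer to next level */
        pte = phys_read32(((pte >> 10) << 12) + vpn0 * 4);
        if (!(pte & PTE_V)) return -1;
        if (!(pte & (PTE_R | PTE_X))) return -1;   /* malformed */
        tlb_insert(vaddr & 0xFFFFF000u, (pte >> 10) << 12, pte & 0x3FFu);
        return 0;
    }
    /* 4 MB megapage: synthesize the covering 4 KB entry on the fly. */
    uint32_t ppage = (((pte >> 10) << 12) & 0xFFC00000u) | (vaddr & 0x003FF000u);
    tlb_insert(vaddr & 0xFFFFF000u, ppage, pte & 0x3FFu);
    return 0;
}
```

Since all of this runs with translation off in M-mode, the supervisor only ever sees either a transparently refilled TLB or a clean page-fault exception.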
from vexriscv.
@daveshah1 this is awesome
from vexriscv.
@daveshah1 Great :D
from vexriscv.
What do you think about no-MMU support for Linux on RISC-V? Would it be possible? That would require hacking the kernel, instead of VexRiscv, of course.
from vexriscv.
Awesome @daveshah1!
from vexriscv.
@wm4: https://en.wikipedia.org/wiki/MClinux
from vexriscv.
@daveshah1 on what platform do you run it? Do you run it with the ramdisk you shared before? I tried to run it and it seems to be stuck at:
[ 0.000000] RAMDISK: gzip image found at block 0
I boot linux commit d27b7d5cb658ccb9ade4bea6a12feb08ebdcc541
from vexriscv.
Reuploading the ramdisk just in case, but I don't think there have been any changes.
The kernel requires the LiteX timer to be connected to the VexRiscv timerInterruptS, and the cycle/cycleh CSRs to work. IME, 'stuck during boot' has generally been a timer-related problem.
My platform:
- Lattice ECP5 Versa-5G development board
- https://github.com/daveshah1/versa_ecp5_dram/tree/ethsoc - ethernet target, built with trellis
- https://github.com/daveshah1/VexRiscv/tree/Supervisor
- https://github.com/daveshah1/VexRiscv-verilog/tree/linux
- https://github.com/daveshah1/litex/tree/vexriscv-linux
- https://github.com/daveshah1/litex-linux-riscv
- https://github.com/daveshah1/yosys/tree/ecp5_transp_dram
- https://github.com/daveshah1/nextpnr/tree/placer_heap
from vexriscv.
This must be the timer interrupt then. I'll add this to my test platform
from vexriscv.
Oh, I see you run it with the latest Litex. I tried it on the system we used for the initial work (from December 2018). I have to rebase our changes
from vexriscv.
I bumped all the parts and have it running on Arty :)
from vexriscv.
Awesome! I just pushed some very basic kernel-mode emulation of atomic instructions, which has improved software compatibility a bit (the current implementation I've done isn't actually atomic yet, as it ignores acquire/release for now...)
from vexriscv.
@Dolu1990 If I were to use RiscvGolden as you have suggested, would I run it with
VexRiscv/src/test/cpp/regression$ make DEBUG_PLUGIN_EXTERNAL=yes
Then connect openocd with
openocd$ openocd -c "set VEXRISCV_YAML cpu0.yaml" -f tcl/target/vexriscv_sim.cfg
Then load vmlinux, dtb and initrd over gdb. I just want to make sure to use it as expected.
from vexriscv.
@Dolu1990: yes, that seems sensible and the way to go for the long term. Getting to the current situation was, I think, the hard work (thanks to all), and we now know what needs to be improved.
I'm not sure we'll run the Linux SoC on small FPGAs (the ice40 boards will be lacking external memory or resources), so even if the new MMU uses a bit more resources, it will be ok. The current situation still allows us to improve things that are not directly related to VexRiscv and the MMU: test it on various targets to verify LiteDRAM is working correctly everywhere (testing it on the ULX3S will also be interesting), load code from SPI flash or SD card, integrate other cores, start working on drivers, etc... So plenty to do :)
from vexriscv.
@roman3017
Ok, so I would say, for the moment don't bother about the golden model; we will come to it after the hardware rework :)
from vexriscv.
With a spec-compliant MMU it will be much easier to push the 32-bit Linux code upstream, and make Vex the first 32-bit CPU supported by the Linux mainline.
I agree that a simpler MMU in the spec may be beneficial for FPGA implementations. I'll raise this topic at the next RISC-V Soft Cores work group meeting.
from vexriscv.
@kgugala I'm not sure a simpler MMU spec is required :) It should be fine with the actual one. I will try it.
from vexriscv.
Other remaining issues for Linux mainline support as well as the MMU:
- time/timeh CSRs (shouldn't be hard)
- SBI stub for setting the timer - perhaps the timer should be built into VexRiscv (and thus tied to the time CSR) rather than provided by LiteX
- Interrupt control, either getting the VexRiscv interrupt driver upstream, working on a PLIC implementation, or emulating a proper PLIC in M-mode
- Atomics, re-adding the proper atomic instructions in Linux that upstream uses and removing the userspace atomic emulation that I added to the kernel. Probably doing this in M-mode is easier than adding all the amo* instructions
- LiteX UART and liteeth Ethernet, either replacing these with modules with an upstream driver or getting the drivers for these upstream
- Fences/cache flush instructions
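For the atomics item, the M-mode route means the illegal-instruction trap handler decodes the faulting amo*.w and performs it non-atomically, which is fine on a single hart. A hedged C sketch of that decode (`emulate_amo_w`, the flat register/memory arguments, and the alignment assumption are all illustrative, not the actual kernel or BIOS code):

```c
#include <stdint.h>

/* Emulate an amo*.w instruction from its 32-bit encoding.
 * regs is the trapped hart's integer register file; memory is a flat
 * word array with regs[rs1] as a byte offset into it (word-aligned
 * accesses assumed). Returns 0 on success, -1 if not handled. */
int emulate_amo_w(uint32_t insn, uint32_t regs[32], uint32_t *memory) {
    if ((insn & 0x7F) != 0x2F || ((insn >> 12) & 7) != 2)
        return -1;                             /* not an amo*.w */
    uint32_t rd  = (insn >> 7)  & 0x1F;
    uint32_t rs1 = (insn >> 15) & 0x1F;
    uint32_t rs2 = (insn >> 20) & 0x1F;
    uint32_t funct5 = insn >> 27;              /* bits [31:27] */
    uint32_t *addr = &memory[regs[rs1] / 4];
    uint32_t old = *addr, src = regs[rs2], new_val;
    switch (funct5) {
    case 0x00: new_val = old + src; break;     /* amoadd.w */
    case 0x01: new_val = src;       break;     /* amoswap.w */
    case 0x04: new_val = old ^ src; break;     /* amoxor.w */
    case 0x08: new_val = old | src; break;     /* amoor.w */
    case 0x0C: new_val = old & src; break;     /* amoand.w */
    default:   return -1;                      /* unhandled funct5 */
    }
    *addr = new_val;
    if (rd) regs[rd] = old;                    /* rd receives the old value */
    return 0;
}
```

A real handler would additionally translate the virtual address, skip over the emulated instruction (mepc += 4), and raise a fault for misaligned addresses; those steps are omitted here.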
from vexriscv.
@Dolu1990 Thank you very much for your explanation. I like the long term plan and will wait for HW changes for now.
from vexriscv.
I created a new branch :
https://github.com/SpinalHDL/VexRiscv/tree/linux
The goal there would be to develop the hardware-refilled MMU + the new data cache design, and have the "raw" test environment.
- time/timeh => I agree; maybe later, when everything is ok, we can think about emulation (would save ~128 LUTs)
- Timer => let's be as close as possible to the reference implementation ^.^ => +1
- PLIC => there is one already implemented in SpinalHDL; it can be used for test purposes, but probably a migen one will be required to have flexibility in the LiteX/MiSoC flow.
from vexriscv.
Got the VexRiscv with the new MMU design + cacheless bus plugins to pass all the standard regressions (which aren't using the MMU).
On Artix-7, the (untested) MMU costs about 250 LUTs + 400 registers for 4 iTLB + 4 dTLB entries.
from vexriscv.
Very good! 250 LUTs is certainly not a problem, the existing design only uses about 50% of the Versa's ECP5 45k.
from vexriscv.
Todo:
- Implement all the CSRs missing in the golden model: https://github.com/SpinalHDL/VexRiscv/blob/master/src/test/cpp/regression/main.cpp#L330
- Implement RISC-V compliant MMU tests: https://github.com/SpinalHDL/VexRiscv/blob/linux/src/test/cpp/raw/mmu/src/crt.S
- Implement supervisor regressions
Just let me know if you want to pick one ^^
from vexriscv.
We should most certainly look at getting liteeth and litex uart drivers upstream. @shenki was looking at this a while back I believe.
from vexriscv.
We should also be able to share the liteeth and litex UART between or1k and riscv support, so @stffrdhrn might also be interested.
from vexriscv.
@mgielda is probably interested in this issue too.
from vexriscv.
@mithro I'm interested. Last I checked (~3 months ago) these drivers both worked on openrisc in qemu and on Arty.
I didn't read the whole conversation. Want me to clean them up and submit upstream? Or is someone else working on it?
from vexriscv.
I would suggest that @daveshah1, @shenki and @stffrdhrn coordinate on cleaning up the LiteX UART and LiteEth drivers for the upstream kernel. I don't know the best way to do that, however...
BTW, there is a linux-litex Google Group / mailing list.
from vexriscv.
daveshah1/litex-linux-riscv@5f4338e makes the driver little-endian for RISC-V; this would need to be made generic.
I also made a few fixes along the way:
daveshah1/litex-linux-riscv@5c46c48 fixes a serious memory leak
daveshah1/litex-linux-riscv@ac29e8f fixes a panic
daveshah1/litex-linux-riscv@116b2d2 may or may not actually fix anything
The fact that I found at least two serious issues in a short period of time makes me think some more testing is warranted.
We should also look at performance; right now this is peaking at about 150kB/s for me, and I am hoping to use this for a rootfs on NFS (the Versa has no other mass storage options by default). I don't know how much of the performance problem is just the CPU/MMU stuff and how much is the driver/core.
from vexriscv.
@daveshah1: to give you an idea, with mor1kx wget was 380KB/s with a 50MHz CPU. We were discussing adding DMA to LiteEth to improve that.
from vexriscv.
@mithro, @daveshah1, @shenki @stffrdhrn: to try to coordinate the work on the drivers and avoid polluting this issue too much, I just created some issues for the drivers in https://github.com/antmicro/litex-linux-riscv:
antmicro/litex-linux-riscv#1
antmicro/litex-linux-riscv#2
from vexriscv.
@daveshah1 I'm sure you will find a lot more bugs when cleaning up and testing.
FYI We collected some discussion about adding LiteEth DMA in this Google Doc.
from vexriscv.
Currently testing the self refilled MMU
from vexriscv.
The state now is:
- The self-refilled MMU is passing all the tests of: https://github.com/SpinalHDL/VexRiscv/blob/linux/src/test/cpp/raw/mmu/src/crt.S
- The C++ golden model is enabled, and checks the hardware CPU cycle by cycle
- Cacheless design for the moment, to increase stability.
I have to say, I'm really not very experienced with low-level Linux stuff.
So, I need to know where all the platform-related configs are, and how to build the Linux image.
Do you have some documentation about it?
Thanks :)
from vexriscv.
@Dolu1990: do you want to also test on hardware? If so I can prepare a design for you where you'll be able to insert the generated VexRiscv. I just need to know which FPGA board you have with DRAM and Ethernet.
from vexriscv.
@daveshah1 Thanks :D
@enjoy-digital For now, I will try to stay in simulation, because of the software model which checks the VexRiscv behaviour. But then, if the simulation is reaaaaly too slow, sure, I will ask you :)
About the VexRiscv repo, to run the tests :
git clone https://github.com/SpinalHDL/SpinalHDL.git -b dev
git clone https://github.com/SpinalHDL/VexRiscv.git -b linux
cd VexRiscv
sbt "runMain vexriscv.demo.LinuxGen"
cd src/test/cpp/regression
make run IBUS=SIMPLE DBUS=SIMPLE REDO=10 DHRYSTONE=yes COMPRESSED=yes TRACE=no
It will take some time to generate the core the first time you run it, as it uses an unreleased version of SpinalHDL.
from vexriscv.
Where can I find documentation on getting this to run on renode?
from vexriscv.
@Dolu1990 The MMU tests work great. I have also tried to run the following command instead of the MMU tests:
Workspace("run").withRiscvRef()->noInstructionReadCheck()->run(0xFFFFFFFFFFFF);
Then connecting openocd still works, but connecting gdb afterwards crashes it for some reason:
BOOT
CONNECTED
makefile:181: recipe for target 'run' failed
make: *** [run] Segmentation fault (core dumped)
I have compiled vmlinux and was hoping to load it over gdb. I have also converted the ELF to hex and tried loadHex("vmlinux.hex")
and bootAt(0xc0000000),
but cannot connect gdb. Likely I am not using it as expected.
from vexriscv.
Hmmm, I have to check that.
Also, I will have some fixes to do in openocd to manage the MMU stuff properly.
Anyway, I think the best is to create a dedicated class which extends Workspace. In it we can redefine the memory mapping that we need, and also load the binaries directly without using the JTAG stuff :)
So, I just have a few things to fix myself on Vex, then I'll set up a minimal workspace that we can extend to emulate the peripherals we need.
Things I'm currently fixing:
- VexRiscv wasn't implementing the interrupt flags exactly as the spec says. I'm fixing that now.
- I'm also adding more regression tests around the privilege modes and the delegation stuff.
I will tell you as soon as it's done :)
from vexriscv.
@roman3017 Ahhh, now I get it: if you want to use the debug interface in the sim, you should not use the withRiscvRef() stuff, as the VexRiscv software model does not include the debug interface yet.
from vexriscv.
@Dolu1990 Thank you very much for the explanation. I will use the SoC on FPGA for now.
from vexriscv.
Simulation command update:
make run DBUS=SIMPLE IBUS=SIMPLE SUPERVISOR=yes CSR=yes COMPRESSED=yes TRACE=no
The JTAG is broken, but that's fine for the moment; 10 tests will fail because of that.
from vexriscv.
@kgugala @daveshah1 @enjoy-digital @roman3017
I pushed everything required to run the simulation and to load Linux into it.
There are some notes on how to use the thing, just in case :
https://github.com/SpinalHDL/VexRiscv/blob/linux/src/main/scala/vexriscv/demo/Linux.scala#L30
Now, we have to make some choices together ^.^
It crashes on the second instruction, which is related to the interrupt controller :
csrw VEXRISCV_CSR_IRQ_MASK, zero
c0000004: bc001073 csrw 0xbc0,zero
This is for the non-RISC-V interrupt controller added inside VexRiscv for Misoc/Litex compatibility.
Do we want to keep it for the Linux stuff ? Or do we move to a regular RISC-V design ?
Which means :
input timerInterrupt,
input externalInterrupt,
input softwareInterrupt,
input externalInterruptS, //(Supervisor)
So from the spec, there is no input to set the supervisor timer interrupt pending (STIP); it is done via machine mode, which has to set the STIP flag during a machine timer interrupt.
from vexriscv.
Just for the info :
LitexSoC peripheral emulation isn't written :
VexRiscv/src/test/cpp/regression/main.cpp
Line 2878 in 6c0608f
LitexSoC workspace usage :
VexRiscv/src/test/cpp/regression/main.cpp
Line 3154 in 6c0608f
from vexriscv.
About the PLIC, its parametrization can greatly reduce the footprint: basically, "removing" the priority stuff by setting its width to one bit, and hard-wiring all the gateway priorities to 1 and all the target thresholds to zero.
I will do some ice40 benches to get a better idea of the final footprint.
from vexriscv.
Looks good. I'm happy with the PLIC solution, so long as it doesn't cause too much trouble with the LiteX integration cc @enjoy-digital
from vexriscv.
I'm also fine with the PLIC solution, i need to look at that but don't think it will be too complex to integrate in LiteX.
from vexriscv.
About the Linux requirements: which part of the RISC-V Atomic extension is used? Only LR and SC? Is that right?
32-bit:
https://github.com/daveshah1/litex-linux-riscv/blob/master/arch/riscv/include/asm/atomic.h
Looks like only LR and SC are used.
from vexriscv.
So, i made some experiments with the head of the main riscv-linux repo.
The objective is to go as far as possible without any change to the kernel, emulating all missing features in machine mode via an emulator.
The sources are here :
https://github.com/SpinalHDL/VexRiscv/tree/linux/src/main/c/emulator/src
It already works a bit :
[ 0.000000] Linux version 4.20.0-g8fe28cb (spinalvm@spinalvm-VirtualBox) (gcc version 7.2.0 (GCC)) #1 Sun Mar 24 20:18:48 CET 2019
[ 0.000000] printk: bootconsole [early0] enabled
Then it triggers
BUG_ON(mem_size == 0);
c000419c: 00079463 bnez a5,c00041a4 <setup_arch+0x140>
c00041a0: 00100073 ebreak
I will look further tomorrow
Anyway thanks all for the tips/help/commands/codes, it really helped :)
@futaris In fact, there were atomic instructions (amoxxx) veeeery early and at multiple places in the binary. I had to emulate them in machine mode.
from vexriscv.
Looks like arch/riscv/include/asm/futex.h
is where they come from in mainline linux.
And it looks like userspace linux, needs the "A" (atomic) extension:
ivmai/libatomic_ops#31 (comment)
from vexriscv.
Oh, and I'm not sure how to support !CONFIG_GENERIC_ATOMIC64 with __riscv_xlen < 64 ... I think that we'd have to do something similar to what is done in arch/arm/include/asm/atomic.h
.
from vexriscv.
Looks like your code for the AMOxxx opcodes is similar to daveshah1's :
daveshah1/litex-linux-riscv@a9819e6#diff-48943b18b315b64e8efabc4035b9ed19R114
from vexriscv.
https://groups.google.com/a/groups.riscv.org/forum/#!topic/sw-dev/XVha867D0y0
from vexriscv.
If you want to try building a rootfs without atomics, try buildroot.
If you disable BR2_RISCV_ISA_RVA, then you'd need to enable support in uClibc or musl.
glibc needs atomics though.
from vexriscv.
openembedded / yocto is another alternative.
https://github.com/riscv/meta-riscv
glibc 32-bit support is still not upstream.
riscv/meta-riscv@ab1ebdc
@alistair23 seems to be working on 32-bit linux support in meta-riscv.
from vexriscv.
I don't think an atomic free RISC-V userspace is possible. glibc and musl won't compile without atomics, although I haven't looked at uclibc. In practice I was finding simple C stuff such as busybox was tending not to call any atomic instructions, but C++ stuff was calling them at startup. This is why I ended up implementing the kernel mode emulation of them.
from vexriscv.
It is possible to have an atomics-free RISC-V userspace and kernel by making sure your gcc doesn't emit atomic instructions, i.e. by selecting the correct arch.
e.g. -march=rv32i
or -march=rv32ima (for something that supports multiply and atomic)
https://gcc.gnu.org/wiki/Atomic
https://github.com/gcc-mirror/gcc/blob/master/libgcc/config/arm/linux-atomic.c
You would need to do software atomic, in the kernel, like the above.
As of this writing, there are no A routine emulations because they were rejected as part of the Linux upstreaming process -- this might change in the future, but - for now - we plan to mandate that Linux-capable machines subsume the A extension as part of the RISC-V platform specification.
https://www.sifive.com/blog/all-aboard-part-1-compiler-args
It's only possible to emulate the A extension on single processor machines, where it happens to be very cheap to implement the A extension. Thus, it seemed simpler to reduce the number of ABIs supported (4 instead of 6). If someone decides to build non-A, Linux-capable machines then we'll re-evaluate the situation.
https://forums.sifive.com/t/questions-about-all-aboard-series-part-1/781
from vexriscv.