corev-gcc's People

Contributors

aldyh, andrewwmacleod, arnaudcharlet, davidmalcolm, edschonberg, hjl-tools, hpataxisdotcom, iains, ibuclaw, jakubjelinek, jamborm, jicama, jsm28, jwakely, marxin, mpolacek, nickclifton, ptroja, rguenth, rorth, rsandifo, rsandifo-arm, segher, sprintersb, tob2, tschwinge, ubizjak, urnathan, vnmakarov, zhongjuzhe

corev-gcc's Issues

GCC requires version number to be specified with architecture extensions

With the latest released binaries (see https://www.embecosm.com/resources/tool-chain-downloads/#corev), it seems that the compiler requires version numbers to be specified with the CORE-V extension name. For example:

    /home/schiavon/tools/corev-riscv/bin/riscv32-corev-elf-gcc   -g -march=rv32imac_ziscr_xcvmem_xcvhwlp_xcvalu_xcvbi_xcvmac_xcvelw_xcvbitmanip_xcvsimd -g -march=rv32imac_ziscr_xcvmem_xcvhwlp_xcvalu_xcvbi_xcvmac_xcvelw_xcvbitmanip_xcvsimd    -o CMakeFiles/cmTC_83683.dir/testCCompiler.c.obj   -c /home/schiavon/gitdir/davide_obi_fifo/x-heep/sw/build/CMakeFiles/CMakeTmp/testCCompiler.c
    Assembler messages:
    Error: rv32imac_ziscr_xcvalu_xcvbi_xcvbitmanip_xcvelw_xcvhwlp_xcvmac_xcvmem_xcvsimd: unknown prefixed ISA extension `ziscr'

Instead you have to use

rv32imc_zicsr_zifencei_xcvhwlp1p0_xcvmem1p0_xcvmac1p0_xcvbi1p0_xcvalu1p0_xcvsimd1p0_xcvbitmanip1p0

Version numbers should default when omitted, and in any case there is only one version!
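
Until the version defaults, a build script could add the suffix itself. The following is only a sketch of such a workaround: `add_default_xcv_versions` is a hypothetical helper, not part of any toolchain, and it assumes that every `xcv*` extension takes the version `1p0` used in the working string above and that a name ending in a digit is already versioned.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of any toolchain): copy `march` into `out`,
   appending "1p0" to every "xcv..." extension that carries no version yet.
   Returns 0 on success, -1 if `out` is too small. */
int add_default_xcv_versions(const char *march, char *out, size_t out_size)
{
    size_t used = 0;
    const char *p = march;

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    while (*p != '\0') {
        /* Each token runs up to the next '_' or the end of the string. */
        size_t len = strcspn(p, "_");
        int is_xcv = len >= 3 && strncmp(p, "xcv", 3) == 0;
        /* Assumption: a trailing digit means a version such as "1p0". */
        int has_version = len > 0 && p[len - 1] >= '0' && p[len - 1] <= '9';
        const char *suffix = (is_xcv && !has_version) ? "1p0" : "";
        const char *sep = (p[len] == '_') ? "_" : "";

        int n = snprintf(out + used, out_size - used, "%.*s%s%s",
                         (int)len, p, suffix, sep);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;              /* output buffer too small */
        used += (size_t)n;
        p += len + (p[len] == '_' ? 1 : 0);
    }
    return 0;
}
```

Standard extensions such as zicsr pass through unchanged, so the result can be handed directly to -march.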

Hardware Loops march option not recognized

When compiling Embench tests, the following error is reported:

riscv32-corev-elf-gcc: error: '-march=rv32imc_zicsr_zifencei_xcvhwlp_xcvmem_xcvmac_xcvbi_xcvalu_xcvsimd_xcvbitmanip': extension 'xcvhwlp' starts with 'x' but is unsupported non-standard extension

Incorrect code generation with PULP march with 20231017 release

When compiling Embench tests, the following errors are reported:

march=rv32imc_zicsr_zifencei_xcvmem_xcvmac_xcvbi_xcvalu_xcvsimd_xcvbitmanip

  • sglib-combined
    Assembler messages:
    Error: illegal operands `sb zero,a2(a5)'
  • st
    error: unrecognizable insn:
    133 | }
    | ^
    (insn 13 12 14 4 (set (reg:DF 12 a2)
    (mem:DF (plus:SI (reg/v/f:SI 140 [ Array ])
    (reg:SI 143)) [2 MEM[(double *)Array_16(D) + _9 * 1]+0 S8 A64])) "embench/src/st/libst.c":131:10 -1
    (nil))
    during RTL pass: vregs
    embench/src/st/libst.c:133:1: internal compiler error: in extract_insn, at recog.cc:2791

cv-simd-shufflei* incorrect builtin

Definition:

cv.shuffleI0.sci.b  rD,rs1,Is2   ;; flgs[7:6] = 0

Current implementation:

cv.shuffleI0.sci.b  rD,rs1,Is2   ;; Is2 = 0

Fix required:

  • Print Is2 as a 6-bit unsigned immediate, but match an 8-bit unsigned immediate.
  • Change the predicates to check for 8-bit constant.
  • Change the constraint CF* to check bits [7:6]?
  • Change testcases.
    • cv-xcvsimd-march-compile-1.c
    • cv-simd-shufflei*-sci-b-compile-1.c
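
The intended split between matching and printing can be sketched in C. This is an illustration under assumptions, not the corev-gcc implementation: the function names are hypothetical, and the mapping of bits [7:6] values 1 to 3 onto the I1 to I3 variants is assumed by analogy with the I0 definition quoted above.

```c
#include <stdint.h>

/* Bits [7:6] of the 8-bit immediate select the variant. The issue only
   states that 0 means cv.shuffleI0.sci.b; mapping 1..3 to I1..I3 is an
   assumption made for this sketch. */
unsigned shufflei_variant(uint8_t imm8)
{
    return (imm8 >> 6) & 0x3u;
}

/* The value printed in the assembly is the low 6 bits only. */
unsigned shufflei_printed_imm(uint8_t imm8)
{
    return imm8 & 0x3Fu;
}

/* Predicate sketch: accept any 8-bit unsigned constant whose bits [7:6]
   equal the variant number of the pattern being matched. */
int matches_shufflei(unsigned value, unsigned variant)
{
    return value <= 0xFFu && ((value >> 6) & 0x3u) == variant;
}
```

With this split, the constant 0x45 would match the I1 pattern and be printed as 5, whereas a predicate that only accepts Is2 = 0 rejects it.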

How to check COREV_CLUSTER for Event Load Instruction

This is a question from llvm openhwgroup/corev-llvm-project#76

I'd like to learn how gcc implements this feature

Event Load Instruction
The event load instruction cv.elw is only supported if the COREV_CLUSTER parameter is set to 1.

How does the compiler detect the COREV_CLUSTER parameter?
Or does the compiler not require any special check, so that the user controls whether this extension is enabled?

ref:
https://github.com/openhwgroup/cv32e40p/blob/master/docs/source/instruction_set_extensions.rst#event-load-instruction

thanks

DWARF error: mangled line number section

I got an error message when using the toolchain on an internal code base when doing riscv32-corev-elf-objdump -S <path/to/ELF>:

corev-openhw-gcc-ubuntu2004-20221031/bin/riscv32-corev-elf-objdump: DWARF error: mangled line number section

The code was compiled with the following flags: -mabi=ilp32e -march=rv32emzicsr_zba_zbb_zbc_zbs_zca_zcb_zcmb_zcmt

GCC build info (downloaded from embecosm website):

Using built-in specs.
COLLECT_GCC=/home/tone/riscv-gcc/corev-openhw-gcc-ubuntu2004-20221031/bin/riscv32-corev-elf-gcc
COLLECT_LTO_WRAPPER=/home/tone/riscv-gcc/corev-openhw-gcc-ubuntu2004-20221031/bin/../libexec/gcc/riscv32-corev-elf/12.0.1/lto-wrapper
Target: riscv32-corev-elf
Configured with: ../../gcc/configure --target=riscv32-corev-elf --prefix=/build/workspace/corev-gcc-ubuntu2004/install --with-sysroot=/build/workspace/corev-gcc-ubuntu2004/install/riscv32-corev-elf --with-native-system-header-dir=/include --with-newlib --disable-shared --enable-languages=c,c++ --enable-tls --disable-werror --disable-libmudflap --disable-libssp --disable-quadmath --disable-libgomp --disable-nls --enable-multilib --with-multilib-generator='rv32e-ilp32e--c rv32ea-ilp32e--m rv32em-ilp32e--c rv32eac-ilp32e-- rv32emac-ilp32e-- rv32i-ilp32--c rv32ia-ilp32--m rv32im-ilp32--c rv32if-ilp32f-rv32ifd-c rv32iaf-ilp32f-rv32imaf,rv32iafc-d rv32imf-ilp32f-rv32imfd-c rv32iac-ilp32-- rv32imac-ilp32-- rv32imafc-ilp32f-rv32imafdc- rv32ifd-ilp32d--c rv32imfd-ilp32d--c rv32iafd-ilp32d-rv32imafd,rv32iafdc- rv32imafdc-ilp32d-- rv64i-lp64--c rv64ia-lp64--m rv64im-lp64--c rv64if-lp64f-rv64ifd-c rv64iaf-lp64f-rv64imaf,rv64iafc-d rv64imf-lp64f-rv64imfd-c rv64iac-lp64-- rv64imac-lp64-- rv64imafc-lp64f-rv64imafdc- rv64ifd-lp64d--m,c rv64iafd-lp64d-rv64imafd,rv64iafdc- rv64imafdc-lp64d--' --with-arch=rv32imac --with-abi=ilp32 --with-bugurl=''\''https://www.embecosm.com'\''' --with-pkgversion=''\''corev-openhw-gcc-ubuntu2004-20221031'\'''
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 12.0.1 20220422 (experimental) ('corev-openhw-gcc-ubuntu2004-20221031')

I don't yet know how much code I can share so I'll have to get back to you on how to make a minimal viable example to reproduce if needed.

Hardware loops causes relocation truncated to fit: R_RISCV_CVPCREL_UI12

Reproducer

File test.c.tar.gz:

int *c;
int d[12];
int e;
_start ()
{
  volatile a = b ();
}

f ()
{
  short g, h;
  for (; e;)
    {
      g = 0;
      for (; g < 8; g++)
	{
	  h = 0;
	  for (; h < 4; h++)
	    d[h] = c[h] = d[0];
	}
    }
}

b ()
{
  return f;
}

Run with

riscv32-corev-elf-gcc -march=rv32imac_zicsr_xcvhwlp -mabi=ilp32 -Os \
    -Wno-implicit-function-declaration -Wno-implicit-int -Wno-int-conversion \
    -nostdlib -nostartfiles -o test.exe test.c

Output is

/tmp/ccJN19DD.o: in function `.L9':
test.c:(.text+0x3c): relocation truncated to fit: R_RISCV_CVPCREL_UI12 against `.L3'
collect2: error: ld returned 1 exit status

System information

Using built-in specs.
COLLECT_GCC=riscv32-corev-elf-gcc
COLLECT_LTO_WRAPPER=/home/jeremy/gittrees/dolphin/install/libexec/gcc/riscv32-corev-elf/14.0.0/lto-wrapper
Target: riscv32-corev-elf
Configured with: ../../gcc/configure --target=riscv32-corev-elf --prefix=/home/jeremy/gittrees/dolphin/install --with-sysroot=/home/jeremy/gittrees/dolphin/install/riscv32-corev-elf --with-native-system-header-dir=/include --with-newlib --disable-shared --enable-languages=c,c++ --enable-tls --disable-werror --disable-libmudflap --disable-libssp --disable-quadmath --disable-libgomp --disable-nls --enable-multilib --with-multilib-generator='rv32i-ilp32--c                                    rv32ia-ilp32--m                      	     rv32im-ilp32--c                      	     rv32if-ilp32f-rv32ifd-c              	     rv32iaf-ilp32f-rv32imaf,rv32iafc-d   	     rv32imf-ilp32f-rv32imfd-c            	     rv32iac-ilp32--                      	     rv32imac-ilp32--                     	     rv32imafc-ilp32f-rv32imafdc-         	     rv32ifd-ilp32d--c                    	     rv32imfd-ilp32d--c                   	     rv32iafd-ilp32d-rv32imafd,rv32iafdc- 	     rv32imafdc-ilp32d--' --with-arch=rv32imac --with-abi=ilp32 : (reconfigured) ../../gcc/configure --target=riscv32-corev-elf --prefix=/home/jeremy/gittrees/dolphin/install --with-sysroot=/home/jeremy/gittrees/dolphin/install/riscv32-corev-elf --with-native-system-header-dir=/include --with-newlib --disable-shared --enable-languages=c,c++ --enable-tls --disable-werror --disable-libmudflap --disable-libssp --disable-quadmath --disable-libgomp --disable-nls --enable-multilib --with-multilib-generator='rv32i-ilp32--c                                    rv32ia-ilp32--m                      	     rv32im-ilp32--c                      	     rv32if-ilp32f-rv32ifd-c              	     rv32iaf-ilp32f-rv32imaf,rv32iafc-d   	     rv32imf-ilp32f-rv32imfd-c            	     rv32iac-ilp32--                      	     rv32imac-ilp32--                     	     rv32imafc-ilp32f-rv32imafdc-         	     rv32ifd-ilp32d--c                    	     rv32imfd-ilp32d--c                   	     rv32iafd-ilp32d-rv32imafd,rv32iafdc- 	     
rv32imafdc-ilp32d--' --with-arch=rv32imac --with-abi=ilp32
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20230928 (experimental) (GCC)

Getting CORE-V ready for Upstream

Status: Links to Work in Progress Branches:

Status: Links to Patch Reviews:

Status: Upstream

  • [RISC-V] Add support for XCVmac extension in CV32E40P
  • [RISC-V] Add support for XCValu extension in CV32E40P
  • [RISC-V] Add support for XCVbitmanip extension in CV32E40P
  • [RISC-V] Add support for XCVbi extension in CV32E40P
  • [RISC-V] Add support for XCVelw extension in CV32E40P
  • [RISC-V] Add support for XCVmem extension in CV32E40P
  • [RISC-V] Add support for XCVsimd extension in CV32E40P
  • [RISC-V] Add support for XCVhwlp extension in CV32E40P

Attribute interrupt causes register corruption when Zcmp enabled

Repro: int.c:

void __attribute__((noinline)) foobar() {
	asm volatile ("" : : : "memory");
}

void __attribute__((interrupt)) handle_exception() {
	foobar();
}

Run:

$ riscv32-corev-elf-gcc -march=rv32i_zca_zcmp -c int.c
$ riscv32-corev-elf-objdump -D -j .text int.o

Disassembly:

0000000e <handle_exception>:
   e:	b85e                	cm.push	{ra,s0},-64
  10:	1141                	addi	sp,sp,-16
  12:	c416                	sw	t0,8(sp)
  14:	c21a                	sw	t1,4(sp)
  16:	c01e                	sw	t2,0(sp)
  18:	fea12e23          	sw	a0,-4(sp)
  1c:	feb12c23          	sw	a1,-8(sp)
  20:	fec12a23          	sw	a2,-12(sp)
  24:	fed12823          	sw	a3,-16(sp)
  28:	fee12623          	sw	a4,-20(sp)
  2c:	fef12423          	sw	a5,-24(sp)
  30:	ff012223          	sw	a6,-28(sp)
  34:	ff112023          	sw	a7,-32(sp)
  38:	fdc12e23          	sw	t3,-36(sp)
  3c:	fdd12c23          	sw	t4,-40(sp)
  40:	fde12a23          	sw	t5,-44(sp)
  44:	fdf12823          	sw	t6,-48(sp)
  48:	0880                	addi	s0,sp,80
  4a:	00000097          	auipc	ra,0x0
  4e:	000080e7          	jalr	ra # 4a <handle_exception+0x3c>
  52:	0001                	nop
  54:	0141                	addi	sp,sp,16
  56:	ba5e                	cm.pop	{ra,s0},64
  58:	fb012283          	lw	t0,-80(sp)
  5c:	fac12303          	lw	t1,-84(sp)
  60:	fa812383          	lw	t2,-88(sp)
  64:	fa412503          	lw	a0,-92(sp)
  68:	fa012583          	lw	a1,-96(sp)
  6c:	f9c12603          	lw	a2,-100(sp)
  70:	f9812683          	lw	a3,-104(sp)
  74:	f9412703          	lw	a4,-108(sp)
  78:	f9012783          	lw	a5,-112(sp)
  7c:	f8c12803          	lw	a6,-116(sp)
  80:	f8812883          	lw	a7,-120(sp)
  84:	f8412e03          	lw	t3,-124(sp)
  88:	f8012e83          	lw	t4,-128(sp)
  8c:	f7c12f03          	lw	t5,-132(sp)
  90:	f7812f83          	lw	t6,-136(sp)
  94:	30200073          	mret

The address to which t0 is saved is different from the address from which it is restored. Compare to the correct disassembly with -march=rv32i_zca:

0000000e <handle_exception>:
   e:	715d                	addi	sp,sp,-80
  10:	c686                	sw	ra,76(sp)
  12:	c496                	sw	t0,72(sp)
  14:	c29a                	sw	t1,68(sp)
  16:	c09e                	sw	t2,64(sp)
  18:	de22                	sw	s0,60(sp)
  1a:	dc2a                	sw	a0,56(sp)
  1c:	da2e                	sw	a1,52(sp)
  1e:	d832                	sw	a2,48(sp)
  20:	d636                	sw	a3,44(sp)
  22:	d43a                	sw	a4,40(sp)
  24:	d23e                	sw	a5,36(sp)
  26:	d042                	sw	a6,32(sp)
  28:	ce46                	sw	a7,28(sp)
  2a:	cc72                	sw	t3,24(sp)
  2c:	ca76                	sw	t4,20(sp)
  2e:	c87a                	sw	t5,16(sp)
  30:	c67e                	sw	t6,12(sp)
  32:	0880                	addi	s0,sp,80
  34:	00000097          	auipc	ra,0x0
  38:	000080e7          	jalr	ra # 34 <handle_exception+0x26>
  3c:	0001                	nop
  3e:	40b6                	lw	ra,76(sp)
  40:	42a6                	lw	t0,72(sp)
  42:	4316                	lw	t1,68(sp)
  44:	4386                	lw	t2,64(sp)
  46:	5472                	lw	s0,60(sp)
  48:	5562                	lw	a0,56(sp)
  4a:	55d2                	lw	a1,52(sp)
  4c:	5642                	lw	a2,48(sp)
  4e:	56b2                	lw	a3,44(sp)
  50:	5722                	lw	a4,40(sp)
  52:	5792                	lw	a5,36(sp)
  54:	5802                	lw	a6,32(sp)
  56:	48f2                	lw	a7,28(sp)
  58:	4e62                	lw	t3,24(sp)
  5a:	4ed2                	lw	t4,20(sp)
  5c:	4f42                	lw	t5,16(sp)
  5e:	4fb2                	lw	t6,12(sp)
  60:	6161                	addi	sp,sp,80
  62:	30200073          	mret
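
Tracking sp relative to its value on entry makes the mismatch in the Zcmp listing concrete. The sketch below only replays the stack adjustments and offsets read off that disassembly; the function names are illustrative and it makes no claim about the root cause in the compiler.

```c
/* All results are byte offsets relative to sp on entry to handle_exception.
   The two adjustments are taken from the Zcmp listing:
   cm.push {ra,s0},-64 followed by addi sp,sp,-16. */
enum { PUSH_ADJ = -64, EXTRA_ADJ = -16 };

/* Stores happen after both adjustments, so sp = entry - 80 at that point. */
int zcmp_save_addr(int store_offset)
{
    return PUSH_ADJ + EXTRA_ADJ + store_offset;
}

/* Loads happen after addi sp,sp,16 and cm.pop {ra,s0},64 have undone both
   adjustments, so sp is back at its entry value. */
int zcmp_restore_addr(int load_offset)
{
    return load_offset;
}
```

For t0 this gives a save at entry-72 (sw t0,8(sp)) but a restore from entry-80 (lw t0,-80(sp)). The same 8-byte gap appears for t1 (entry-76 versus entry-84) and a0 (entry-84 versus entry-92).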

I'm using the latest prebuilt corev-gcc from here: https://www.embecosm.com/resources/tool-chain-downloads/#corev

$ riscv32-corev-elf-gcc --version
riscv32-corev-elf-gcc ('corev-openhw-gcc-ubuntu2004-20230310') 12.0.1 20220422 (experimental)

corev-gcc pulp/xcorev extensions support

Hi there,
I'm currently moving our RTL+software setup from RI5CY to CV32E40P.

With PULP GCC I got pulp extensions to work (riscv32-unknown-elf-gcc -march=IMXpulpv2),
and I could see these custom instructions in the disassembly (riscv32-unknown-elf-objdump -Mmarch=IMXpulpv2).

But with the COREV flow I'm having a hard time verifying that "cv." instructions are actually used/included
(if I use riscv32-corev-elf-gcc -march=rv32im_xcorev).
It also seems there is no suitable march option for riscv32-corev-elf-objdump anymore.

It would be interesting to know the actual support status of these xcorev/pulp extensions in corev-gcc.
The examples I found never used the march option *_xcorev. Is there a reason?

Thanks for your help!

.option norvc is ignored with Zca

Repro

Input file: tmp.S, which contains a compressible instruction whose compression should be inhibited by .option norvc:

.option push
.option norvc
	addi a0, a0, -1
.option pop

When assembled for RV32IC:

$ /opt/riscv/corev/bin/riscv32-corev-elf-gcc -c -march=rv32ic tmp.S && /opt/riscv/corev/bin/riscv32-corev-elf-objdump -d tmp.o

tmp.o:     file format elf32-littleriscv


Disassembly of section .text:

00000000 <.text>:
   0:	fff50513          	add	a0,a0,-1

This produces a 32-bit instruction, as requested. However when assembled with RV32IZca (an identical set of instructions):

$ /opt/riscv/corev/bin/riscv32-corev-elf-gcc -c -march=rv32i_zca tmp.S && /opt/riscv/corev/bin/riscv32-corev-elf-objdump -d tmp.o

tmp.o:     file format elf32-littleriscv


Disassembly of section .text:

00000000 <.text>:
   0:	157d                	add	a0,a0,-1

This results in a 16-bit opcode, even though compressed instructions were disabled via .option norvc.
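
The two encodings can also be told apart mechanically: in the standard RISC-V encoding, an instruction whose two lowest bits are not 0b11 is 16 bits wide. A minimal C check (the function name is illustrative) that could be used in a regression script over the objdump output:

```c
#include <stdint.h>

/* Returns 1 if the instruction starting with this parcel is a 16-bit
   (compressed) encoding, 0 if it is 32 bits or wider. Standard RISC-V
   length encoding: low two bits != 0b11 means a 16-bit instruction. */
int rv_insn_is_16bit(uint32_t first_parcel)
{
    return (first_parcel & 0x3u) != 0x3u;
}
```

Applied to the listings above, 0xfff50513 is classified as wide and 0x157d as compressed.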

Version

I tested with the latest prebuilt toolchain, dated 8th November 2023 on https://www.embecosm.com/resources/tool-chain-downloads/#corev:

$ /opt/riscv/corev/bin/riscv32-corev-elf-gcc --version
riscv32-corev-elf-gcc ('corev-openhw-gcc-ubuntu2204-20230905') 13.0.1 20230313 (experimental)
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Why this is a bug

Since we don't have an explicit wide-instruction marker like Arm's .w suffix, .option norvc is the only way of inhibiting instruction compression in the assembler. Examples of cases where you may want to do this:

  • In contexts like interrupt vector tables, jump instructions may be expanded to word size so that they have the correct architecturally-specified table offset
  • When a function head falls through into another function, and both functions have their entry points word-aligned for performance, deliberately widening instructions can achieve alignment without inserting nops
  • Computed branches may rely on forced-wide instructions to hand-calculate jr offsets in auipc; add; jr + imm sequences, which are faster than la (auipc + addi); add; jr
  • The entry NOP for semihosting sequences is required to be uncompressed

I thought I'd raise an issue because this seems like it might have been an oversight caused by C vs Zca confusion, rather than a deliberate change in behaviour. Please let me know if this was the wrong place to raise this. RV32IC and RV32IZca are identical instruction sets, so it is at least surprising that assembling the same file will give two different results.

HW Loop produces an ICE during building

Since the roll-forward a few weeks ago, building the latest commit is producing an ICE when XCVhwlp is enabled.

/home/user/git/openhw/install/riscv32-corev-elf/bin/ranlib libgcov.a
../../../../../../gcc/libgcc/unwind-dw2-fde.c: In function 'fde_radixsort':
../../../../../../gcc/libgcc/unwind-dw2-fde.c:674:1: error: insn does not satisfy its constraints:
  674 | }
      | ^
(insn 430 429 431 19 (set (reg:SI 70 lpstart0)
        (reg:SI 14 a4 [266])) 29242 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 14 a4 [266])
        (nil)))
during RTL pass: cprop_hardreg
../../../../../../gcc/libgcc/unwind-dw2-fde.c:674:1: internal compiler error: in extract_constrain_insn, at recog.cc:2713
0xa92ed9 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
        ../../../gcc/gcc/rtl-error.cc:108

movsi_internal is redefined in gcc/config/riscv/corev.md. The original commit specified move_dest_operand rather than the current nonimmediate_nonpostinc, but changing this to how it was in the original patch doesn't fully fix the issue.

To reproduce:

Compile the current CORE-V GCC development branch (currently based on commit a6c793a) with xcvhwlp enabled. (e.g. rv32im_zicsr_xcvhwlp_xcvmem_xcvmac_xcvbi_xcvalu_xcvsimd_xcvbitmanip-ilp32)

ra register is modified when Zcmp enabled

test.c:

#include <stdio.h>
#include <stdlib.h>

void test_uint64_shl_easy(void)
{
  unsigned long long p1 =  0x1111111122222222ULL;
  unsigned long long r =   0x3333333344444444ULL;
  unsigned int res = 0;

  printf("test shl\n");

  if (p1 << 1 == r)
  {
    res = 1;
  }
  return;
}

int main(void)
{
        test_uint64_shl_easy();
        printf("test main\n");
        return 0;
}

Build:

riscv32-unknown-elf-gcc -mabi=ilp32 -march=rv32ima_zca_zcb_zcmp_zcmt -mcmodel=medany -nostartfiles -fno-common -O0  test.c -o test

riscv32-unknown-elf-objdump -d  test > test.asm

Disassembly:

[image: screenshot of the disassembly, not preserved]

The ra register is modified when unsigned long long p1 is written to the stack.

CORE-V bitmanip extract failures with -O1

This is a stripped-down version of a new test program. Attached are the pre-processed source and the generated assembly code. Compiled with

 riscv32-corev-elf-gcc -B/home/jeremy/gittrees/dolphin/build/gcc-stage2/gcc/ /home/jeremy/gittrees/dolphin/gcc/gcc/testsuite/gcc.target/riscv/cv-bitmanip-exec.c -fdiagnostics-plain-output -O1 -march=rv32i_xcvbitmanip -mabi=ilp32 -save-temps -ffat-lto-objects -fno-ident -ffunction-sections -fdata-sections -static -Wl,-gc-sections -specs=nano.specs -lm -Wl,-T/home/jeremy/gittrees/dolphin/toolchain/embecosm-link.ld -save-temps -ggdb -o ./cv-bitmanip-exec.exe

cv-bitmanip-exec.zip

When executed on FPGA, it fails through abort ()

(gdb) bt
#0  0x0000045c in abort ()
#1  0x00000448 in validate (v=<optimised out>, good=<optimised out>)
    at /home/jeremy/gittrees/dolphin/gcc/gcc/testsuite/gcc.target/riscv/cv-helpers.h:12
#2  main ()
    at /home/jeremy/gittrees/dolphin/gcc/gcc/testsuite/gcc.target/riscv/cv-bitmanip-exec.c:95
(gdb) 

We note that while there should be 2 instances each of extract, extractr, extractu and extractur in the generated output, only 2 instances of extractr appear.

This only happens with -O1. At higher or lower optimization levels it all behaves correctly, with the correct number of generated extract instructions.

Toolchain flags errors for xpulp SIMD ALU instructions cv.or.sci, cv.xor.sci, cv.and.sci with negative (signed) Imm6 operand

Toolchain release:

corev-openhw-gcc-centos7-20230623

Issue Description

SIMD ALU operations
cv32e40p-user-manual-en-cv32e40p_v1.3.2.pdf , Table 7.31 SIMD ALU operations

cv.or.sci.h/b
cv.xor.sci.h/b
cv.and.sci.h/b

User Manual description:

cv.or[.sc,.sci]{.h,.b} rD, rs1, [rs2, Imm6] rD[i] = rs1[i] | op2[i]
cv.xor[.sc,.sci]{.h,.b} rD, rs1, [rs2, Imm6] rD[i] = rs1[i] ^ op2[i]
cv.and[.sc,.sci]{.h,.b} rD, rs1, [rs2, Imm6] rD[i] = rs1[i] & op2[i]

For the [.sci] (immediate scalar replication) mode of these instructions, where Imm6 is used as scalar operand 2, the User Manual does not explicitly state that the Imm6 operand is "Zero-extended". Per the manual's description of [.sci] mode, these instructions should therefore treat the immediate as sign-extended, and signed decimal notation should be allowed in assembly for them (in line with the other sign-extended instructions in this SIMD ALU category).

But with this toolchain release, a negative decimal value in the Imm6 field is flagged as an error.
Error example:

cv.or.sci.b t2, t2, -3
Error: immediate value must be 6-bit unsigned, -3 is out of range

cv.and.sci.h a6, s11, -24
Error: immediate value must be 6-bit unsigned, -24 is out of range

cv.xor.sci.h t1, s11, -13
Error: immediate value must be 6-bit unsigned, -13 is out of range
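
The two readings of a 6-bit immediate differ only in the accepted range and in how the field is extended. The C sketch below spells that out; the function names are illustrative and are not taken from the assembler sources.

```c
/* Range accepted when the 6-bit field is treated as signed. */
int fits_simm6(long v) { return v >= -32 && v <= 31; }

/* Range accepted when the 6-bit field is treated as unsigned. */
int fits_uimm6(long v) { return v >= 0 && v <= 63; }

/* Both readings store the same low 6 bits in the instruction. */
unsigned encode_imm6(long v) { return (unsigned)v & 0x3Fu; }

/* Recover the signed value from the 6-bit field (bit 5 is the sign). */
long sign_extend6(unsigned field)
{
    field &= 0x3Fu;
    return (field & 0x20u) ? (long)field - 64 : (long)field;
}
```

The operand -3 from the first error fits the signed range, encodes as 0x3D, and sign-extends back to -3, but it fails an unsigned range check, which matches the reported message.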

cannot find suitable multilib during the build, though it is shown in the toolchain lib folder.

While trying to build another test with the new toolchain, I encountered the issue below.

[image: screenshot of the build error, not preserved]

This is the prebuilt library I saw in the toolchain folder.

[image: screenshot of the toolchain lib folder, not preserved]

Here are the compiler options used:

LAST_CC = /share/libraries/LIBRARY/0TECHNO/EMBECOSM/TOOLCHAIN/corev-openhw-gcc-centos7-20231128/bin/riscv32-corev-elf-gcc
LAST_CFLAGS = -D__RISCV_CV32__ -D__RISCV_GENERIC__ -DPANTHER -DPANTHER -DWITH_ALLOC -DRTOS_PMSIS -DCLUSTER_COMPILATION -DIS_NOT_VEP -I/scorpion/home/lch/project/panther_release/last/libs/dsp/include -I/drivers/include -fdata-sections -ffunction-sections -DARCHI_CLUSTER_NB_PE=16 -DCLUSTER_COMPILATION -MMD -march=rv32imc_zicsr_zfinx_xcvalu_xcvbi_xcvbitmanip_xcvhwlp_xcvmac_xcvmem_xcvsimd_xcvelw -Wall -Wextra -Wno-unused-variable -Wno-unused-parameter -fdata-sections -fno-common -msmall-data-limit=0 -ffunction-sections -DWITH_ALLOC -Wno-missing-field-initializers -g -O3 -DPRINTF_FLOAT -DMEASUREMENT -I.
LAST_LDFLAGS = -march=rv32imc_zicsr_zfinx_xcvalu_xcvbi_xcvbitmanip_xcvhwlp_xcvmac_xcvmem_xcvsimd_xcvelw -Wl,--gc-sections -Wl,-Map=/net/eagle/volume1/homes/lch/project/examples/dsp/mfcc_f32/build.riscv_vp/panther/build/output.map -static -nostdlib -lm -lgcc

Toolchain flags errors for xpulp SIMD ALU instructions cv.avgu.sci.{.h/.b} with 6-bit unsigned decimal notation for Imm6 operand

Toolchain release:

corev-openhw-gcc-centos7-20230623

Issue Description:

XPULP SIMD ALU Instruction:

avgu.sci.h/b

SIMD ALU operations
cv32e40p-user-manual-en-cv32e40p_v1.3.2.pdf , Table 7.31 SIMD ALU operations

User Manual description:

cv.avgu[.sc,.sci]{.h,.b} rD, rs1, [rs2, Imm6] rD[i] = ((rs1[i] + op2[i]) & {0xFFFF, 0xFF}) >> 1
Note: Immediate is zero-extended, shift is logical.

This issue relates to a recent issue from the cv32e40p core's github repo Issue:
[https://github.com/openhwgroup/cv32e40p/issues/814]

Following the fix for the referenced issue (#814), User Manual v1.3.2 explicitly specifies the Imm6 field of the cv.avgu.sci.h/b instructions as "Zero-extended", i.e. an unsigned 6-bit immediate.
This toolchain release does not yet reflect the User Manual update. As a result, if the Imm6 field in assembly code is an unsigned 6-bit decimal value above 31, the toolchain flags it as an out-of-range error.

Error Example:

cv.avgu.sci.b a0, t5, 43
Error: immediate value must be 6-bit signed, 43 is out of range
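
For reference, the byte variant of the User Manual formula quoted above can be modelled in a few lines of C. This is a sketch that follows that formula with a zero-extended immediate; the function name is illustrative and the model has not been checked against the RTL.

```c
#include <stdint.h>

/* Model of cv.avgu.sci.b following the quoted formula:
   rD[i] = ((rs1[i] + op2[i]) & 0xFF) >> 1, where op2 is the zero-extended
   Imm6 replicated into every byte lane. */
uint32_t cv_avgu_sci_b_model(uint32_t rs1, unsigned imm6)
{
    uint32_t op2 = imm6 & 0x3Fu;      /* zero-extended: range 0..63 */
    uint32_t result = 0;

    for (int lane = 0; lane < 4; lane++) {
        uint32_t a = (rs1 >> (8 * lane)) & 0xFFu;
        uint32_t avg = ((a + op2) & 0xFFu) >> 1;   /* logical shift */
        result |= avg << (8 * lane);
    }
    return result;
}
```

Under this model the immediate 43 from the error example is a valid operand: with rs1 = 0x01020304 the four lanes give 0x16, 0x16, 0x17 and 0x17.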

The team may already be aware of this User Manual update, but I am raising the issue here to track it and notify.

CORE-V: All instructions in builtins are currently using lowercase

Instructions like the MAC instructions and SIMD Shuffle and Pack instructions use mixed case e.g. cv.muluRN rD, rs1, rs2, Is3 or cv.shuffleI1.sci.b rD, rs1, Imm6.

The current implementation uses lowercase.

Although instructions can be implemented in mixed case, the disassembler will output them in lowercase.

I am raising this as an issue so that we can discuss a way to use mixed case instructions in the assembler and disassembler.

zcee needs zba and zbb

Hi, @jessicamills

In zcee, the zext.* and sext.* instructions need the B-extension subsets Zba and Zbb. I have not found them in corev-gcc.

I want to know how to solve this problem.

Thanks
Liao Shihua

Hardware loops causes ICE in final_scan_insn_1

Reproducer

File test.c (test.c.tar.gz)

char *a;
int b, c;
int
d ()
{
  char e = 0;
  for (; e < 6; e++)
    a[b] |= a[e] |= a[b] |= a[c + e >> 4 * b] |= 80 >> e + 1;
}

Run with

riscv32-corev-elf-gcc -march=rv32imac_zicsr_xcvhwlp -mabi=ilp32 -Os -c -o test.o test.c

Output is

test.c: In function 'd':
test.c:9:1: error: could not split insn
    9 | }
      | ^
(insn 70 77 54 (parallel [
            (set (reg:SI 70 lpstart0)
                (unspec:SI [
                        (label_ref:SI 54)
                    ] UNSPEC_CV_FOLLOWS))
            (set (reg:SI 71 lpend0)
                (unspec:SI [
                        (label_ref:SI 0)
                    ] UNSPEC_CV_LP_END_12))
            (set (reg:SI 72 lpcount0)
                (const_int 6 [0x6]))
            (clobber (reg:SI 15 a5 [204]))
        ]) "test.c":7:12 discrim 1 244 {doloop_begin_i}
     (expr_list:REG_UNUSED (reg:SI 15 a5 [204])
        (insn_list:REG_LABEL_OPERAND 54 (insn_list:REG_LABEL_OPERAND 0 (nil)))))
during RTL pass: final
test.c:9:1: internal compiler error: in final_scan_insn_1, at final.cc:2808
0x18b2248 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
	../../../gcc/gcc/rtl-error.cc:108
0x139f43e final_scan_insn_1
	../../../gcc/gcc/final.cc:2808
0x139f79d final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
	../../../gcc/gcc/final.cc:2887
0x139d4e5 final_1
	../../../gcc/gcc/final.cc:1979
0x13a266d rest_of_handle_final
	../../../gcc/gcc/final.cc:4240
0x13a29e4 execute
	../../../gcc/gcc/final.cc:4318
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

System information

Using built-in specs.
COLLECT_GCC=riscv32-corev-elf-gcc
COLLECT_LTO_WRAPPER=/home/jeremy/gittrees/dolphin/install/libexec/gcc/riscv32-corev-elf/14.0.0/lto-wrapper
Target: riscv32-corev-elf
Configured with: ../../gcc/configure --target=riscv32-corev-elf --prefix=/home/jeremy/gittrees/dolphin/install --with-sysroot=/home/jeremy/gittrees/dolphin/install/riscv32-corev-elf --with-native-system-header-dir=/include --with-newlib --disable-shared --enable-languages=c,c++ --enable-tls --disable-werror --disable-libmudflap --disable-libssp --disable-quadmath --disable-libgomp --disable-nls --enable-multilib --with-multilib-generator='rv32i-ilp32--c                                    rv32ia-ilp32--m                      	     rv32im-ilp32--c                      	     rv32if-ilp32f-rv32ifd-c              	     rv32iaf-ilp32f-rv32imaf,rv32iafc-d   	     rv32imf-ilp32f-rv32imfd-c            	     rv32iac-ilp32--                      	     rv32imac-ilp32--                     	     rv32imafc-ilp32f-rv32imafdc-         	     rv32ifd-ilp32d--c                    	     rv32imfd-ilp32d--c                   	     rv32iafd-ilp32d-rv32imafd,rv32iafdc- 	     rv32imafdc-ilp32d--' --with-arch=rv32imac --with-abi=ilp32 : (reconfigured) ../../gcc/configure --target=riscv32-corev-elf --prefix=/home/jeremy/gittrees/dolphin/install --with-sysroot=/home/jeremy/gittrees/dolphin/install/riscv32-corev-elf --with-native-system-header-dir=/include --with-newlib --disable-shared --enable-languages=c,c++ --enable-tls --disable-werror --disable-libmudflap --disable-libssp --disable-quadmath --disable-libgomp --disable-nls --enable-multilib --with-multilib-generator='rv32i-ilp32--c                                    rv32ia-ilp32--m                      	     rv32im-ilp32--c                      	     rv32if-ilp32f-rv32ifd-c              	     rv32iaf-ilp32f-rv32imaf,rv32iafc-d   	     rv32imf-ilp32f-rv32imfd-c            	     rv32iac-ilp32--                      	     rv32imac-ilp32--                     	     rv32imafc-ilp32f-rv32imafdc-         	     rv32ifd-ilp32d--c                    	     rv32imfd-ilp32d--c                   	     rv32iafd-ilp32d-rv32imafd,rv32iafdc- 	     
rv32imafdc-ilp32d--' --with-arch=rv32imac --with-abi=ilp32
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20230928 (experimental) (GCC)

How to submit zcee to openhw ?

Hello,
I implemented the zcee extension on top of riscv-gcc-10.2.0 last month. How can I submit it to the designated OpenHW branch?

__builtin_riscv_cv_bitmanip_insert generate incorrect operand

Toolchain version: corev-openhw-gcc-centos7-20230504

Problem Statement:

In the code generated for __builtin_riscv_cv_bitmanip_insert, the values of Is2 and Is3 are swapped. Initially I thought this was an implementation issue; later I realized it looks more like a spec inconsistency, because the two documents give different cv.insert templates, as shown below.

  1. In the CORE-V instruction manual:

cv.insert rD, rs1, Is3, Is2
rD[min(Is3+Is2,31):Is2] = rs1[Is3-(max(Is3+Is2,31)-31):0]
The rest of the bits of rD are untouched and keep their previous value. Is3 + Is2 must be < 32.

  2. In the builtin spec:

uint32_t __builtin_riscv_cv_bitmanip_insert (uint32_t i, uint16_t range, uint32_t k)
Case a) range is a constant and (range[9:5] + range [4:0]) <= 32

result, k: rD
i: rs1
range[4:0]: Is2 (5-bit unsigned value)
range[9:5]: Is3 (5-bit unsigned value)

or case b)

result, k: rD
i: rs1
range: rs2

Generated assembler:

Case a)

    cv.insert  rD,rs1,Is2,Is3
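
To see why the operand order matters, the instruction manual formula and the builtin spec's range encoding can be modelled in C. This is a sketch derived only from the two descriptions quoted above; the function names are illustrative and it is not the toolchain's implementation.

```c
#include <stdint.h>

/* Split the builtin's `range` argument as the builtin spec describes:
   range[4:0] is Is2 and range[9:5] is Is3. */
void decode_range(uint16_t range, unsigned *is3, unsigned *is2)
{
    *is2 = range & 0x1Fu;
    *is3 = (range >> 5) & 0x1Fu;
}

/* Model of cv.insert rD, rs1, Is3, Is2 per the instruction manual:
   the low Is3+1 bits of rs1 replace rD starting at bit Is2, and all
   other bits of rD keep their previous value. Bits that would land
   above bit 31 are dropped by the 32-bit shift. */
uint32_t cv_insert_model(uint32_t rd, uint32_t rs1, unsigned is3, unsigned is2)
{
    is3 &= 0x1Fu;
    is2 &= 0x1Fu;
    uint32_t field = (is3 == 31u) ? 0xFFFFFFFFu : ((1u << (is3 + 1u)) - 1u);
    uint32_t mask = field << is2;
    return (rd & ~mask) | ((rs1 << is2) & mask);
}
```

With Is3 = 7 and Is2 = 4, inserting 0xAB into a zero register gives 0xAB0. Swapping the two immediates gives 0x580 instead, so the order in the emitted assembly changes the result.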

Extensions state save and restore

When using corev-openhw-gcc-centos7-20230331 toolchain with F extensions, all interrupt handlers blindly save and restore f registers without looking at Floating-Point state in MSTATUS register.
Moreover it does not save/restore full state as FCSR is missing.

In the RISC-V Privileged specification, the Machine Status Register (mstatus) section defines fields for extension state:

  • FS for the Floating-Point extension
  • XS for user-mode or custom extensions
  • VS for the Vector extension

and SD, which summarizes the dirtiness of all three.

When SD equals 1, at least one of these three extensions has been used since it was enabled or since the last context restore.
FS/XS/VS should then be tested to find out which extension is "Dirty" and needs its whole state saved and restored when the execution context changes (interrupt routine, context switch, etc.).

When SD is 0, no state needs to be saved or restored, and interrupt latency and context-switch time improve because the 33 register writes to memory (generally the stack) are skipped. The same applies to the state restore at the end of the interrupt routine or context switch.

A full description of the FS/XS/VS state transitions is given in Table 3.4.
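
As an illustration of the check being asked for, here is a minimal sketch that decodes the extension-state fields from an mstatus value on RV32. The bit positions (VS at 10:9, FS at 14:13, XS at 16:15, SD at bit 31) follow the RISC-V Privileged specification; the function names are hypothetical, and reading the CSR itself (for example with csrr) is left out so the code runs on any host.

```c
#include <stdbool.h>
#include <stdint.h>

/* Encoding of the 2-bit FS and VS fields. */
enum ext_state { EXT_OFF = 0, EXT_INITIAL = 1, EXT_CLEAN = 2, EXT_DIRTY = 3 };

/* Field extractors for an RV32 mstatus value (names are hypothetical). */
enum ext_state mstatus_vs(uint32_t m) { return (enum ext_state)((m >> 9) & 3u); }
enum ext_state mstatus_fs(uint32_t m) { return (enum ext_state)((m >> 13) & 3u); }
enum ext_state mstatus_xs(uint32_t m) { return (enum ext_state)((m >> 15) & 3u); }
bool mstatus_sd(uint32_t m) { return ((m >> 31) & 1u) != 0; }

/* True only when the floating-point state (f registers and FCSR) has been
 * modified and therefore has to be saved before the context changes. */
bool fp_state_needs_save(uint32_t m)
{
    return mstatus_sd(m) && mstatus_fs(m) == EXT_DIRTY;
}
```

An interrupt prologue built on this idea would skip the f-register and FCSR stores whenever fp_state_needs_save returns false.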

CORE-V: SIMD Constraints "CV6" and "CS6" need renaming

The naming of the constraints is confusing.

Currently implemented as
CV6: Signed 6 bit constant
CS6: Unsigned 6 bit constant
The naming of constraints is tricky: a name is 1 or 2 characters and 1 digit, so there isn't much choice. But using S for "Unsigned" seems to be asking for confusion.

All CORE-V predicates and constraints may need renaming for consistency. It would be nice to come up with a naming rule so that constraint names are universally understandable. This may also apply to renaming CORE-V operands to correspond to the instruction rather than to their order; for example, the currently implemented b5 and b8 correspond to the signed and unsigned 6-bit immediates.
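
For reference, the two value ranges behind these constraints can be written as plain C predicates. This is only a sketch of the arithmetic (a signed 6-bit constant spans -32 to 31, an unsigned one 0 to 63); the function names are hypothetical and do not come from the GCC machine description.

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed 6-bit immediate: two's complement, -32 .. 31. */
bool fits_signed_imm6(int32_t v)
{
    return v >= -32 && v <= 31;
}

/* Unsigned 6-bit immediate: 0 .. 63. */
bool fits_unsigned_imm6(int32_t v)
{
    return v >= 0 && v <= 63;
}
```

A naming rule that encodes signedness and width explicitly would map one-to-one onto predicates like these.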

illegal operand error for "sw" operation

Affects the pre-release version dated 20230814.

/libs/dsp/src/TransformFunctions/dd_dct2_f32.c' -- 'dd_dct2_f32.o'
/tmp/ccAhyHfb.s: Assembler messages:
/tmp/ccAhyHfb.s:75: Error: illegal operands `sw a5,(a6),4'
/tmp/ccAhyHfb.s:90: Error: illegal operands `sw a5,(a6),4'
/tmp/ccAhyHfb.s:106: Error: illegal operands `sw a5,(a3),4'
/tmp/ccAhyHfb.s:174: Error: illegal operands `sw a0,(a3),4'

Latest release generates push/pop (from the Zcmp extension) instructions that the assembler finds illegal

When trying to simulate the interrupt_test for cv32e40s, the latest toolchain (corev-openhw-gcc-centos7-20221031) generates push/pop/popret instructions, which trigger an assembler error:

[Image: screenshot of the assembler error]

Steps to reproduce:

  1. Check out this branch of the core-v-verif repository
  2. Download the latest CORE-V toolchain from Embecosm (31/10/2022)
  3. Set the following environment variables:
    CV_SW_TOOLCHAIN=/[PATH]/corev-openhw-gcc-centos7-20221031
    CV_SW_PREFIX=riscv32-corev-elf-
  4. From /core-v-verif/cv32e40s/sim/uvmt/, run command
    make test TEST=interrupt_test SIMULATOR=[your sim] USE_ISS=NO

How should an out-of-range index be treated?

The testsuite for the CORE-V SIMD instructions uniformly tests out-of-range bit indices (-32 and 31) for cv.srl/sra/sll/extract/extractu/insert and treats them as valid code. However, the spec restricts these indices to a very narrow range. Should we treat out-of-range values as invalid code or as undefined behavior?

Builtin optimisation enhancement

The CORE-V builtins can be enhanced by expanding the RTL for each instruction. This would allow GCC to pattern-match to these builtins.
More testing with a simulator would be required.

Added, untested with simulator:

  • XCVmac
  • XCValu
  • XCVelw
  • XCVbi
  • XCVmem
  • XCVbitmanip
  • XCVsimd
  • XCVhwlp

CORE-V bitmanip builtin failure with bclr/bset

Test program (cv-bitmanip-bug.c):

#include <stdint.h>
#include <stdlib.h>

volatile uint32_t src;

int main ()
{
  src = 0xc64a5933;

  if (__builtin_riscv_cv_bitmanip_bclr (src, 0x165) != src & 0xfffe001f)
    abort ();
}

Compile with:

riscv32-corev-elf-gcc cv-bitmanip-bug.c -O2 -march=rv32i_xcvbitmanip -mabi=ilp32 -specs=nano.specs -lm -Wl,-Tembecosm-link.ld -save-temps -ggdb3 -o ./cv-bitmanip-bug.exe

(where the linker script is in this case suitable for an FPGA X-HEEP CV32E40Pv2 image).

The code hits abort. Looking at the generated code, we see no use of the cv.bclr instruction:

Dump of assembler code for function main:
   0x00000228 <+0>:     lui     a4,0xc64a6
   0x0000022c <+4>:     addi    a4,a4,-1741 # 0xc64a5933
   0x00000230 <+8>:     sw      a4,-1936(gp)
   0x00000234 <+12>:    lw      a4,-1936(gp)
   0x00000238 <+16>:    lw      a5,-1936(gp)
   0x0000023c <+20>:    bnez    a5,0x248 <main+32>
   0x00000240 <+24>:    li      a0,0
   0x00000244 <+28>:    ret
   0x00000248 <+32>:    addi    sp,sp,-16
   0x0000024c <+36>:    sw      ra,12(sp)
   0x00000250 <+40>:    jal     0x3b0 <abort>

We observe that there is no problem if we do not set src, so this appears to be a misoptimization based on knowing the constant value in src and (wrongly) working out what the result of cv.bclr must be from its machine description.

The analogous problem occurs with cv.bset.
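
One thing worth checking in the test program above: in C, != binds more tightly than &, so the condition parses as (bclr (...) != src) & 0xfffe001f rather than bclr (...) != (src & 0xfffe001f). The sketch below shows the difference on the host using a hypothetical software model of cv.bclr with range[4:0] as Is2 and range[9:5] as Is3. Whether this accounts for the observed abort is an assumption; the report does not confirm it.

```c
#include <stdint.h>

/* Hypothetical host-side model of cv.bclr with a constant range:
 * clears (Is3 + 1) bits of x starting at bit Is2. */
uint32_t model_cv_bclr(uint32_t x, uint16_t range)
{
    unsigned is2 = range & 0x1fu;          /* range[4:0] */
    unsigned is3 = (range >> 5) & 0x1fu;   /* range[9:5] */
    uint32_t mask = (uint32_t)(((UINT64_C(1) << (is3 + 1)) - 1) << is2);
    return x & ~mask;
}

/* The condition as the C grammar parses it: the comparison happens first,
 * then its 0/1 result is ANDed with the constant. */
uint32_t cond_as_parsed(uint32_t res, uint32_t src)
{
    return (uint32_t)(res != src) & 0xfffe001fu;
}

/* The condition with explicit parentheses around the mask operation. */
uint32_t cond_parenthesized(uint32_t res, uint32_t src)
{
    return res != (src & 0xfffe001fu);
}
```

For src = 0xc64a5933 and range = 0x165 (Is3 = 11, Is2 = 5), the model clears bits 16:5 and returns 0xc64a0013. cond_as_parsed is then non-zero, which would reach abort even with a correct cv.bclr, while cond_parenthesized is zero.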

pseudo instruction "LA" not translating to correct instructions

Toolchain release:

corev-openhw-gcc-centos7-20231128

Core:

CV32E40P v2 with PULP instructions

Issue Description:

With this toolchain release, in some cases the "LA" pseudo instruction is not translated into the proper set of instructions:

Result with this toolchain version :
(NOT Expected)
test.S :

kernel_sp:
la x6, kernel_stack_end

Resulting elf/ test.objdump:

0000010c <kernel_sp>:
la x6, kernel_stack_end
10c: 7fc18313 addi x6,x3,2044 # 21704 <kernel_stack_end>

Only one instruction, addi, is generated.

With the previous toolchain release I was using, corev-openhw-gcc-centos7-20230905:
(Expected)
The output for the same test was generated correctly, as shown:

0000010c <kernel_sp>:
la x6, kernel_stack_end
10c: 00021317 auipc x6,0x21
110: 5f830313 addi x6,x6,1528 # 21704 <kernel_stack_end>

Two instructions, auipc and addi, are generated and build the correct label address.
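
The arithmetic behind both listings can be checked on the host with a short sketch. The function names are hypothetical. The single addi relative to x3 (gp) resembles the linker's gp-relative relaxation, but that is an assumption the report does not confirm, and the gp value used below (0x20f08) is inferred from the disassembly as 0x21704 - 2044, not read from the ELF.

```c
#include <stdint.h>

/* The two immediates of a pc-relative auipc + addi pair. */
typedef struct {
    uint32_t hi20;  /* immediate for auipc */
    int32_t lo12;   /* sign-extended immediate for addi */
} pcrel_split;

/* Split (target - pc) into a 20-bit upper part and a signed 12-bit lower
 * part. The upper part is rounded so the lower part fits -2048 .. 2047. */
pcrel_split split_pcrel(uint32_t pc, uint32_t target)
{
    uint32_t offset = target - pc;
    pcrel_split s;
    s.hi20 = ((offset + 0x800u) >> 12) & 0xfffffu;
    s.lo12 = (int32_t)(offset & 0xfffu);
    if (s.lo12 >= 0x800)
        s.lo12 -= 0x1000;
    return s;
}

/* Address produced by: auipc rd, hi20 ; addi rd, rd, lo12 */
uint32_t addr_auipc_addi(uint32_t pc, uint32_t hi20, int32_t lo12)
{
    return pc + (hi20 << 12) + (uint32_t)lo12;
}

/* Address produced by: addi rd, gp, imm (only valid if gp is initialized) */
uint32_t addr_gp_addi(uint32_t gp, int32_t imm)
{
    return gp + (uint32_t)imm;
}
```

With pc = 0x10c and target = 0x21704 the split gives hi20 = 0x21 and lo12 = 1528, matching the expected listing. The one-instruction form reaches the same address only if x3 actually holds 0x20f08 when the la executes.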

Compilation command options used:

riscv32-corev-elf-gcc -DPULP
-Os -g -static -mabi=ilp32
-march=rv32imc_zicsr_zifencei_xcvhwlp1p0_xcvmem1p0_xcvmac1p0_xcvbi1p0_xcvalu1p0_xcvsimd1p0_xcvbitmanip1p0
-Wall -pedantic -nostartfiles -lcv-verif
/TOOLCHAIN/corev-public-gcc-centos7-20231128/bin/../lib/gcc/riscv32-corev-elf/14.0.0/../../../../riscv32-corev-elf/bin/ld

Steps to Reproduce :

Simulator questasim (QUESTA_2023.2_1)

Embecosm toolchain: corev-openhw-gcc-centos7-20231128

setenv SIMULATOR vsim

git clone https://github.com/XavierAubert/core-v-verif.git
cd core-v-verif
git checkout cv32e40p/dev_dd
(hash if required -> d6403a919f41c85185a86c3f75876727280f1842 )

cd cv32e40p/sim/uvmt/

make gen_corev-dv test TEST=corev_rand_pulp_hwloop_exception CFG=pulp TEST_CFG_FILE= SIMULATOR=vsim USE_ISS=yes COV=NO RUN_INDEX=1588395466 GEN_START_INDEX=1588395466 SEED=1588395466 CFG_PLUSARGS=+UVM_TIMEOUT=100000

Attached assembly and elf files:
LA_issue.zip

Conflict between spec and GCC of bit manipulation instructions

Take cv.bclr for example. The testcase cv-march-xcvbitmanip-compile-bclr.c has:

// source
// note that 200 == (6 << 5) + 8
  res1 = __builtin_riscv_cv_bitmanip_bclr (a, 200);
// check
/* { dg-final { scan-assembler-times "cv\.bclr\t\(\?\:t\[0-6\]\|a\[0-7\]\|s\[1-11\]\),\(\?\:t\[0-6\]\|a\[0-7\]\|s\[1-11\]\),6,8" 1 } } */

This assumes the compiled instruction looks like cv.bclr a0, a1, 6, 8. That is, the lower 5 bits come after the higher 5 bits, or cv.bclr rD, rs1, range[9:5], range[4:0].

However, in the OpenHW specification (CORE-V builtin names, section "PULP bit manipulation builtins (32-bit)"), the lower 5 bits come before the higher 5 bits:

uint32_t __builtin_riscv_cv_bitmanip_bclr (uint32_t i, uint16_t range)

Case a) range is a constant

  • result: rD
  • i: rs1
  • range[4:0]: Is2 (5-bit unsigned value)
  • range[9:5]: Is3 (5-bit unsigned value)

Generated assembler:

Case a)

        cv.bclr  rD,rs1,Is2,Is3

FYI, this could be due to a difference in the signature of cv.bclr, which is cv.bclr rD, rs1, Is3, Is2 in the CORE-V Instruction Set Custom Extension document (section "Bit Manipulation Operations"), where the order of Is3 and Is2 is reversed.
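
To see the two readings side by side, here is a small sketch that decodes the range argument and prints the operands in both orders. The helper names are hypothetical; only the field layout (range[4:0] is Is2, range[9:5] is Is3) is taken from the builtin specification quoted above.

```c
#include <stdint.h>
#include <stdio.h>

/* The two 5-bit immediates packed into the builtin's range argument. */
typedef struct {
    unsigned is2;  /* range[4:0] */
    unsigned is3;  /* range[9:5] */
} cv_range;

cv_range cv_decode_range(uint16_t range)
{
    cv_range r;
    r.is2 = range & 0x1fu;
    r.is3 = (range >> 5) & 0x1fu;
    return r;
}

/* Operand order expected by the testcase: Is3 first, then Is2. */
int cv_format_bclr_testsuite(char *buf, size_t n, uint16_t range)
{
    cv_range r = cv_decode_range(range);
    return snprintf(buf, n, "cv.bclr rD,rs1,%u,%u", r.is3, r.is2);
}

/* Operand order written in the builtin spec: Is2 first, then Is3. */
int cv_format_bclr_builtin_spec(char *buf, size_t n, uint16_t range)
{
    cv_range r = cv_decode_range(range);
    return snprintf(buf, n, "cv.bclr rD,rs1,%u,%u", r.is2, r.is3);
}
```

For range = 200 the decode gives Is3 = 6 and Is2 = 8, so the testcase order yields "cv.bclr rD,rs1,6,8" and the builtin-spec order yields "cv.bclr rD,rs1,8,6".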
