GithubHelp home page GithubHelp logo

Comments (21)

zherczeg avatar zherczeg commented on July 18, 2024 4

Added to sljit zherczeg/sljit@56dbde0

from pcre2.

zherczeg avatar zherczeg commented on July 18, 2024

AVX2 is not used unless CPUID says they are available.

from pcre2.

bugreporter234 avatar bugreporter234 commented on July 18, 2024

Well, normally AVX is available for skylake, but when the GDS mitigation is forced, it's not. So this is kind of a special case - a skylake CPU without AVX instructions. Do you think this special case can be handled somehow?

from pcre2.

zherczeg avatar zherczeg commented on July 18, 2024

What cpuid says on that system?

https://github.com/zherczeg/sljit/blob/master/sljit_src/sljitNativeX86_common.c#L512

Are these bits set? If cpu says it supports AVX, the software believes it.

We could add a compile time option to disable it, but you need to pass it manually.

from pcre2.

bugreporter234 avatar bugreporter234 commented on July 18, 2024

cpuid_tool (from libcpuid) says avx and avx2 are available, but lscpu | grep avx produces no output. So cpuid is wrong in this case.

cpuid_tool_report.txt
lscpu.txt

A compile time flag would work for me, but I don't think it's a good solution, because not everybody with an affected CPU wants to recompile pcre2 and if Linux distros deliver pcre2 with jit enabled without the flag, the software would still crash for them.

from pcre2.

zherczeg avatar zherczeg commented on July 18, 2024

What is a fairly easy and good solution then? It is surprising the cpu lies about the feature.

from pcre2.

loqs avatar loqs commented on July 18, 2024

It is surprising the cpu lies about the feature.

In addition to the CPU reporting it supports AVX support the OS must also support it. You can check for the support by querying bit 1 and 2 of the XCR0 register using the xgetbv instruction as documented in the Intel® 64 and IA-32 Architectures Software Developer’s Manual.

Edit:
See example code in:
https://www.intel.com/content/dam/develop/external/us/en/documents/how-to-detect-new-instruction-support-in-the-4th-generation-intel-core-processor-family.pdf

The following diff might be closer to your use case https://gitlab.gnome.org/GNOME/babl/-/issues/96#note_2056377

from pcre2.

zherczeg avatar zherczeg commented on July 18, 2024

This is something new for me, although simple enough to use it. How can I test if this instruction is available?

Is there a description about its feature bits? I only found this:
https://en.wikipedia.org/wiki/Control_register#XCR0_and_XSS

This documentation tells that xgetbv has a non-deterministic behaviour:
https://google.github.io/security-research/pocs/cpus/xgetbv/

from pcre2.

loqs avatar loqs commented on July 18, 2024

See the Intel SDM Volume 1 Chapter 13 Managing State Using The XSAVE Feature Set (includes XGETBV and XCR0) and
Volume 1 Chapter 14.3 Dectection OF Intel® AVX Instructions.

from pcre2.

zherczeg avatar zherczeg commented on July 18, 2024

I have no time to read this much info. I only need the following info:

  • How can I detect if the instruction is available?
  • Is it supported by non-intel x86 cpus?
  • The bitlist of XCR0 (I have already found it)

from pcre2.

loqs avatar loqs commented on July 18, 2024
* How can I detect if the instruction is available?

CPUID.(EAX=01H, ECX=0H):ECX[bit 27] indicates OSXSAVE support it also indirectly implies the processor supports XSAVE, XRSTOR, XGETBV, and XCR0.

* Is it supported by non-intel x86 cpus?

Yes as the interface is standardized.

Edit:
So something like:

diff --git a/src/sljit/sljitNativeX86_common.c b/src/sljit/sljitNativeX86_common.c
index c2c0421..c15299b 100644
--- a/src/sljit/sljitNativeX86_common.c
+++ b/src/sljit/sljitNativeX86_common.c
@@ -392,6 +392,7 @@ static const sljit_u8 freg_lmap[SLJIT_NUMBER_OF_FLOAT_REGISTERS + 2] = {
 #define CPU_FEATURE_CMOV		0x020
 #define CPU_FEATURE_AVX			0x040
 #define CPU_FEATURE_AVX2		0x080
+#define CPU_FEATURE_OSXSAVE		0x010
 
 static sljit_u32 cpu_feature_list = 0;
 
@@ -499,6 +500,23 @@ __msan_unpoison(info, 4 * sizeof(sljit_u32));
 
 }
 
+/*
+0    FPU
+1    SSE    XMM registers
+2    AVX    YMM registers
+*/
+static sljit_u32 get_xcr0(void)
+{
+	sljit_u32 xcr0;
+	__asm__ __volatile__(".byte 0x0f, 0x01, 0xd0" : "=a" (xcr0) : "c" (0) : "%edx");
+	return xcr0;
+}
+
+static sljit_u32 get_os_support_avx(void)
+{
+	return ((get_xcr0() & 6) == 6);
+}
+
 static void get_cpu_features(void)
 {
 	sljit_u32 feature_list = CPU_FEATURE_DETECTED;
@@ -526,6 +544,8 @@ static void get_cpu_features(void)
 
 		if (info[2] & 0x80000)
 			feature_list |= CPU_FEATURE_SSE41;
+		if (info[2] & 0x8000000)
+			feature_list |= CPU_FEATURE_OSXSAVE;
 		if (info[2] & 0x10000000)
 			feature_list |= CPU_FEATURE_AVX;
 #if (defined SLJIT_DETECT_SSE2 && SLJIT_DETECT_SSE2)
@@ -543,6 +563,9 @@ static void get_cpu_features(void)
 	if (info[2] & 0x20)
 		feature_list |= CPU_FEATURE_LZCNT;
 
+	if ((feature_list & CPU_FEATURE_AVX) && !((feature_list & CPU_FEATURE_OSXSAVE) || get_os_support_avx()))
+		feature_list  = feature_list &= ~(CPU_FEATURE_AVX | CPU_FEATURE_AVX2);
+
 	cpu_feature_list = feature_list;
 }

from pcre2.

bugreporter234 avatar bugreporter234 commented on July 18, 2024

Edit: See example code in: https://www.intel.com/content/dam/develop/external/us/en/documents/how-to-detect-new-instruction-support-in-the-4th-generation-intel-core-processor-family.pdf

I tried out the example code, removing all the features in line 8, leaving only _FEATURE_AVX2 and it outputs:

This CPU does not support all ISA extensions introduced in Haswell

So this should work.

from pcre2.

zherczeg avatar zherczeg commented on July 18, 2024

Ok I will do this in the sljit project in due course. It will be transferred into pcre2 before the next release.

from pcre2.

bugreporter234 avatar bugreporter234 commented on July 18, 2024

I applied logs' patch to 10.43 and it compiled successfully, but when I run make -j1 check the checks still fail with 'illegal instruction'. I'm compiling like Arch Linux does in the 10.43 PKGBUILD.

test-suite.log

from pcre2.

bugreporter234 avatar bugreporter234 commented on July 18, 2024

I made a pcre2 package for Arch Linux with the patch applied (commenting out the checks) and installed it and now qpwgraph is crashing from the start. Without the patch it was crashing when I clicked 'Open' or 'Save as' (before a file dialog is opened). I also recompiled qpwgraph, just to be sure. Here's the backtrace of qpwgraph with the patch applied to pcre2:

qpwgraph_gdb_backtrace_patch_applied.txt

from pcre2.

zherczeg avatar zherczeg commented on July 18, 2024

It looks like it is not in the jit-ed code, but during compilation: pcre2_compile_16. Could you disassemble the instructions around it:
disassemble $pc-32,$pc+32

Maybe it is the XGETBV instruction.

from pcre2.

bugreporter234 avatar bugreporter234 commented on July 18, 2024
Thread 1 "qpwgraph" received signal SIGILL, Illegal instruction.
0x00007ffff4d8dfb0 in pcre2_compile_16 (pattern=0x5555558bac60, patlen=<optimized out>, options=524288, errorptr=0x55555588c2e0, erroroffset=0x7fffffffd988, 
    ccontext=0x7ffff4e11c00 <_pcre2_default_compile_context_16>) at src/pcre2_compile.c:10267
10267	cb.start_pattern = pattern;
(gdb) disassemble $pc-32,$pc+32
Dump of assembler code from 0x7ffff4d8df90 to 0x7ffff4d8dfd0:
   0x00007ffff4d8df90 <pcre2_compile_16+304>:	xor    %cl,(%rdi)
   0x00007ffff4d8df92 <pcre2_compile_16+306>:	(bad)
   0x00007ffff4d8df93 <pcre2_compile_16+307>:	xor    $0x4a000005,%eax
   0x00007ffff4d8df98 <pcre2_compile_16+312>:	lea    (%rbx,%rsi,2),%edx
   0x00007ffff4d8df9b <pcre2_compile_16+315>:	mov    %rdx,-0x4688(%rbp)
   0x00007ffff4d8dfa2 <pcre2_compile_16+322>:	mov    -0x4690(%rbp),%rsi
   0x00007ffff4d8dfa9 <pcre2_compile_16+329>:	lea    0x5c190(%rip),%rdx        # 0x7ffff4dea140 <_pcre2_default_tables_16>
=> 0x00007ffff4d8dfb0 <pcre2_compile_16+336>:	vmovq  %rbx,%xmm3
   0x00007ffff4d8dfb5 <pcre2_compile_16+341>:	mov    -0x46a0(%rbp),%edi
   0x00007ffff4d8dfbb <pcre2_compile_16+347>:	movq   $0x0,-0x44e8(%rbp)
   0x00007ffff4d8dfc6 <pcre2_compile_16+358>:	mov    0x28(%rsi),%rcx
   0x00007ffff4d8dfca <pcre2_compile_16+362>:	vmovq  %rsi,%xmm2
   0x00007ffff4d8dfcf <pcre2_compile_16+367>:	mov    %edi,-0x44cc(%rbp)
End of assembler dump.

from pcre2.

bugreporter234 avatar bugreporter234 commented on July 18, 2024

The following diff might be closer to your use case https://gitlab.gnome.org/GNOME/babl/-/issues/96#note_2056377

I tested this babl patch and it fixes the babl and gimp crash, which is unrelated to pcre2, but the logic there seems to solve the problem.

from pcre2.

zherczeg avatar zherczeg commented on July 18, 2024

the name vmovq is a VEX prefixed version of movq, so your compiler automatically generates AVX2 instructions. You need to set the instruction set (or the cpu type) before compiling.

from pcre2.

bugreporter234 avatar bugreporter234 commented on July 18, 2024

I built the packages on Arch Linux in a chroot with makechrootpkg -uncCr $chrootdir -- -rCc. This basically runs makepkg -rCc in a chroot and makepkg has a config file /etc/makepkg.conf, where environment variables like CFLAGS with compiler and linker options are set. I didn't change these options in the chroot, because I thought it would be good to see if it works with the default options on Arch. If I understand correctly, then it shouldn't use the VEX prefixed instructions like vmovq, when the compiler options -march=x86-64 -mtune=generic are set. Is this correct?

On the host I found that the /etc/makepkg.conf had -march=native instead of -march=x86-64 -mtune=generic set. Could it be that I accidentally built the package on the host with -march=native and this was the cause for the crash on start?

Strangely, I couldn't reproduce it today for qpwgraph. Even when I build both packages with the -march=native config, it doesn't crash on start anymore and we're back to the crash when I click on 'Open' or 'Save as' behaviour (with the patch applied).

Backtrace and disassemble of crash after clicking 'Open' with the patch applied:

qpwgraph_gdb_backtrace_disass_patch_applied.txt

Build process of the pcre2 package with the patch applied:

pcre2-build.txt

from pcre2.

bugreporter234 avatar bugreporter234 commented on July 18, 2024

Looks good! No more crashes. Tested sljit first and without the new patch I get 'illegal instruction' from the ./bin/sljit_test; with the new patch applied I get a PASS. And I made a pcre2 package for Arch Linux with the new patch and it fixes all the crashes. Thanks a lot!

from pcre2.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.