Comments (21)
Added to sljit zherczeg/sljit@56dbde0
from pcre2.
AVX2 is not used unless CPUID says they are available.
from pcre2.
Well, normally AVX is available for skylake, but when the GDS mitigation is forced, it's not. So this is kind of a special case - a skylake CPU without AVX instructions. Do you think this special case can be handled somehow?
from pcre2.
What cpuid says on that system?
https://github.com/zherczeg/sljit/blob/master/sljit_src/sljitNativeX86_common.c#L512
Are these bits set? If cpu says it supports AVX, the software believes it.
We could add a compile time option to disable it, but you need to pass it manually.
from pcre2.
cpuid_tool (from libcpuid) says avx and avx2 are available, but lscpu | grep avx
produces no output. So cpuid is wrong in this case.
cpuid_tool_report.txt
lscpu.txt
A compile time flag would work for me, but I don't think it's a good solution, because not everybody with an affected CPU wants to recompile pcre2 and if Linux distros deliver pcre2 with jit enabled without the flag, the software would still crash for them.
from pcre2.
What is a fairly easy and good solution then? It is surprising the cpu lies about the feature.
from pcre2.
It is surprising the cpu lies about the feature.
In addition to the CPU reporting it supports AVX support the OS must also support it. You can check for the support by querying bit 1 and 2 of the XCR0 register using the xgetbv instruction as documented in the Intel® 64 and IA-32 Architectures Software Developer’s Manual.
Edit:
See example code in:
https://www.intel.com/content/dam/develop/external/us/en/documents/how-to-detect-new-instruction-support-in-the-4th-generation-intel-core-processor-family.pdf
The following diff might be closer to your use case https://gitlab.gnome.org/GNOME/babl/-/issues/96#note_2056377
from pcre2.
This is something new for me, although simple enough to use it. How can I test if this instruction is available?
Is there a description about its feature bits? I only found this:
https://en.wikipedia.org/wiki/Control_register#XCR0_and_XSS
This documentation tells that xgetbv has a non-deterministic behaviour:
https://google.github.io/security-research/pocs/cpus/xgetbv/
from pcre2.
See the Intel SDM Volume 1 Chapter 13 Managing State Using The XSAVE Feature Set (includes XGETBV and XCR0) and
Volume 1 Chapter 14.3 Dectection OF Intel® AVX Instructions.
from pcre2.
I have no time to read this much info. I only need the following info:
- How can I detect if the instruction is available?
- Is it supported by non-intel x86 cpus?
- The bitlist of XCR0 (I have already found it)
from pcre2.
* How can I detect if the instruction is available?
CPUID.(EAX=01H, ECX=0H):ECX[bit 27] indicates OSXSAVE support it also indirectly implies the processor supports XSAVE, XRSTOR, XGETBV, and XCR0.
* Is it supported by non-intel x86 cpus?
Yes as the interface is standardized.
Edit:
So something like:
diff --git a/src/sljit/sljitNativeX86_common.c b/src/sljit/sljitNativeX86_common.c
index c2c0421..c15299b 100644
--- a/src/sljit/sljitNativeX86_common.c
+++ b/src/sljit/sljitNativeX86_common.c
@@ -392,6 +392,7 @@ static const sljit_u8 freg_lmap[SLJIT_NUMBER_OF_FLOAT_REGISTERS + 2] = {
#define CPU_FEATURE_CMOV 0x020
#define CPU_FEATURE_AVX 0x040
#define CPU_FEATURE_AVX2 0x080
+#define CPU_FEATURE_OSXSAVE 0x010
static sljit_u32 cpu_feature_list = 0;
@@ -499,6 +500,23 @@ __msan_unpoison(info, 4 * sizeof(sljit_u32));
}
+/*
+0 FPU
+1 SSE XMM registers
+2 AVX YMM registers
+*/
+static sljit_u32 get_xcr0(void)
+{
+ sljit_u32 xcr0;
+ __asm__ __volatile__(".byte 0x0f, 0x01, 0xd0" : "=a" (xcr0) : "c" (0) : "%edx");
+ return xcr0;
+}
+
+static sljit_u32 get_os_support_avx(void)
+{
+ return ((get_xcr0() & 6) == 6);
+}
+
static void get_cpu_features(void)
{
sljit_u32 feature_list = CPU_FEATURE_DETECTED;
@@ -526,6 +544,8 @@ static void get_cpu_features(void)
if (info[2] & 0x80000)
feature_list |= CPU_FEATURE_SSE41;
+ if (info[2] & 0x8000000)
+ feature_list |= CPU_FEATURE_OSXSAVE;
if (info[2] & 0x10000000)
feature_list |= CPU_FEATURE_AVX;
#if (defined SLJIT_DETECT_SSE2 && SLJIT_DETECT_SSE2)
@@ -543,6 +563,9 @@ static void get_cpu_features(void)
if (info[2] & 0x20)
feature_list |= CPU_FEATURE_LZCNT;
+ if ((feature_list & CPU_FEATURE_AVX) && !((feature_list & CPU_FEATURE_OSXSAVE) || get_os_support_avx()))
+ feature_list = feature_list &= ~(CPU_FEATURE_AVX | CPU_FEATURE_AVX2);
+
cpu_feature_list = feature_list;
}
from pcre2.
Edit: See example code in: https://www.intel.com/content/dam/develop/external/us/en/documents/how-to-detect-new-instruction-support-in-the-4th-generation-intel-core-processor-family.pdf
I tried out the example code, removing all the features in line 8, leaving only _FEATURE_AVX2
and it outputs:
This CPU does not support all ISA extensions introduced in Haswell
So this should work.
from pcre2.
Ok I will do this in the sljit project in due course. It will be transferred into pcre2 before the next release.
from pcre2.
I applied logs' patch to 10.43 and it compiled successfully, but when I run make -j1 check
the checks still fail with 'illegal instruction'. I'm compiling like Arch Linux does in the 10.43 PKGBUILD.
from pcre2.
I made a pcre2 package for Arch Linux with the patch applied (commenting out the checks) and installed it and now qpwgraph is crashing from the start. Without the patch it was crashing when I clicked 'Open' or 'Save as' (before a file dialog is opened). I also recompiled qpwgraph, just to be sure. Here's the backtrace of qpwgraph with the patch applied to pcre2:
qpwgraph_gdb_backtrace_patch_applied.txt
from pcre2.
It looks like it is not in the jit-ed code, but during compilation: pcre2_compile_16. Could you disassemble the instructions around it:
disassemble $pc-32,$pc+32
Maybe it is the XGETBV instruction.
from pcre2.
Thread 1 "qpwgraph" received signal SIGILL, Illegal instruction.
0x00007ffff4d8dfb0 in pcre2_compile_16 (pattern=0x5555558bac60, patlen=<optimized out>, options=524288, errorptr=0x55555588c2e0, erroroffset=0x7fffffffd988,
ccontext=0x7ffff4e11c00 <_pcre2_default_compile_context_16>) at src/pcre2_compile.c:10267
10267 cb.start_pattern = pattern;
(gdb) disassemble $pc-32,$pc+32
Dump of assembler code from 0x7ffff4d8df90 to 0x7ffff4d8dfd0:
0x00007ffff4d8df90 <pcre2_compile_16+304>: xor %cl,(%rdi)
0x00007ffff4d8df92 <pcre2_compile_16+306>: (bad)
0x00007ffff4d8df93 <pcre2_compile_16+307>: xor $0x4a000005,%eax
0x00007ffff4d8df98 <pcre2_compile_16+312>: lea (%rbx,%rsi,2),%edx
0x00007ffff4d8df9b <pcre2_compile_16+315>: mov %rdx,-0x4688(%rbp)
0x00007ffff4d8dfa2 <pcre2_compile_16+322>: mov -0x4690(%rbp),%rsi
0x00007ffff4d8dfa9 <pcre2_compile_16+329>: lea 0x5c190(%rip),%rdx # 0x7ffff4dea140 <_pcre2_default_tables_16>
=> 0x00007ffff4d8dfb0 <pcre2_compile_16+336>: vmovq %rbx,%xmm3
0x00007ffff4d8dfb5 <pcre2_compile_16+341>: mov -0x46a0(%rbp),%edi
0x00007ffff4d8dfbb <pcre2_compile_16+347>: movq $0x0,-0x44e8(%rbp)
0x00007ffff4d8dfc6 <pcre2_compile_16+358>: mov 0x28(%rsi),%rcx
0x00007ffff4d8dfca <pcre2_compile_16+362>: vmovq %rsi,%xmm2
0x00007ffff4d8dfcf <pcre2_compile_16+367>: mov %edi,-0x44cc(%rbp)
End of assembler dump.
from pcre2.
The following diff might be closer to your use case https://gitlab.gnome.org/GNOME/babl/-/issues/96#note_2056377
I tested this babl patch and it fixes the babl and gimp crash, which is unrelated to pcre2, but the logic there seems to solve the problem.
from pcre2.
the name vmovq
is a VEX prefixed version of movq
, so your compiler automatically generates AVX2 instructions. You need to set the instruction set (or the cpu type) before compiling.
from pcre2.
I built the packages on Arch Linux in a chroot with makechrootpkg -uncCr $chrootdir -- -rCc
. This basically runs makepkg -rCc
in a chroot and makepkg has a config file /etc/makepkg.conf, where environment variables like CFLAGS
with compiler and linker options are set. I didn't change these options in the chroot, because I thought it would be good to see if it works with the default options on Arch. If I understand correctly, then it shouldn't use the VEX prefixed instructions like vmovq, when the compiler options -march=x86-64 -mtune=generic
are set. Is this correct?
On the host I found that the /etc/makepkg.conf had -march=native
instead of -march=x86-64 -mtune=generic
set. Could it be that I accidentally built the package on the host with -march=native
and this was the cause for the crash on start?
Strangely, I couldn't reproduce it today for qpwgraph. Even when I build both packages with the -march=native
config, it doesn't crash on start anymore and we're back to the crash when I click on 'Open' or 'Save as' behaviour (with the patch applied).
Backtrace and disassemble of crash after clicking 'Open' with the patch applied:
qpwgraph_gdb_backtrace_disass_patch_applied.txt
Build process of the pcre2 package with the patch applied:
from pcre2.
Looks good! No more crashes. Tested sljit first and without the new patch I get 'illegal instruction' from the ./bin/sljit_test
; with the new patch applied I get a PASS. And I made a pcre2 package for Arch Linux with the new patch and it fixes all the crashes. Thanks a lot!
from pcre2.
Related Issues (20)
- Loss of data warnings on Visual Studio HOT 3
- Website problem: http to https redirections do not configurated HOT 2
- Several old links on pcre.org website
- Old files on pcre.org HOT 1
- MSVC warnings with 10.43 HOT 1
- Allow unlimited subpattern name length HOT 4
- SunOS-5.11-SPARC - "src/pcre2.h", line 949: warning: no explicit type given HOT 5
- Invalid size 0 may lead to undefined behavior / infinite loop HOT 1
- Atomic group must not increase stack depth HOT 1
- Probable thread-safety issue in pcre2 10.43+ HOT 9
- No load/store-on-condition 2 facility HOT 4
- How can I get PCRE with version under 8.45? HOT 1
- Lift compile time default maximum pattern length HOT 1
- Support W3C xpath spec. weird newline behavior regarding to DOTALL and MULTILINE HOT 6
- Grapheme cluser (`\X`) selector capturing multiple character HOT 2
- Alternative branch should match shared prefix only once HOT 3
- Might be a problem found during the metamorphosis test HOT 4
- Test suite fails when targeting i686 HOT 7
- Signing of git objects HOT 3
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from pcre2.