FastNoiseSIMD | C SIMD Noise Library
kandi X-RAY | FastNoiseSIMD Summary
FastNoiseSIMD
Community Discussions
Trending Discussions on FastNoiseSIMD
QUESTION
Using VS2015, I am compiling a library that contains both SSE2 and AVX2 instructions (the AVX2 ones are only used if detected on the CPU). If I compile the library with /arch:AVX2 but only call the SSE2 instructions, I get "illegal instruction" on _mm_set1_epi32, the first SSE2 instruction called. However, if I compile the library with /arch:SSE2, it works fine when calling the SSE2 instructions.
Are the arch settings mutually exclusive? If not, how should this be fixed? I have tried building it both as a shared library and as a static library, with the same result.
This is the library: https://github.com/Auburns/FastNoiseSIMD and there is an issue about it: https://github.com/Auburns/FastNoiseSIMD/issues/20 (although I don't think they related it directly to AVX2 being enabled while SSE2 instructions are called).
ANSWER
Answered 2018-Oct-10 at 18:50
If you build with /arch:AVX or /arch:AVX2, the primary impact is that all SSE code generated by the compiler will use the VEX prefix encoding, which allows for more efficient scheduling of registers. If you run such code on a system without AVX or AVX2 support, it will fault with an illegal instruction.
In other words, _mm_set1_epi32 is an SSE2 intrinsic, but because you built with /arch:AVX2 the compiler emitted the instruction using the VEX prefix. The /arch switch affects explicit intrinsics, compiler-generated floating-point math, the autovectorizer, etc.
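A quick way to see which setting a given source file was actually built with is to check the predefined macros. The sketch below relies on __AVX__ and __AVX2__, which MSVC defines under /arch:AVX and /arch:AVX2 (GCC and Clang define them under -mavx and -mavx2). The function names compiledSimdTarget and compiledWithVex are made up for this illustration and are not part of FastNoiseSIMD.

```cpp
// Reports which SIMD target this translation unit was compiled for.
// Each source file can give a different answer if the build system
// passes different /arch (or -m) flags per file.
inline const char* compiledSimdTarget() {
#if defined(__AVX2__)
    return "AVX2 (VEX-encoded)";
#elif defined(__AVX__)
    return "AVX (VEX-encoded)";
#else
    return "baseline (legacy SSE encoding, or a non-x86 target)";
#endif
}

// True when the compiler is allowed to emit VEX-encoded instructions
// in this translation unit, including for SSE2 intrinsics.
inline bool compiledWithVex() {
#if defined(__AVX__) || defined(__AVX2__)
    return true;
#else
    return false;
#endif
}
```

If compiledWithVex() returns true in the file that contains the SSE2 code path, then that path is only safe to run on CPUs with AVX support, which matches the fault described in the question.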
If you want to support all of 'stock' SSE/SSE2, AVX, and AVX2 platforms with optimized codepaths using the automatic generation supported by the /arch switch, you need three different binaries (EXEs or DLLs).
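With three binaries, something has to decide at run time which one to load. The following is a minimal sketch of that decision, assuming an x86 target and using CPUID plus XGETBV (the OS must also save the YMM registers before AVX code is safe to run). The names SimdLevel, pickLevel, detectLevel and libraryNameFor, as well as the DLL file names, are hypothetical and chosen only for illustration.

```cpp
#include <cstdint>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
  #include <intrin.h>
  #include <immintrin.h>
  #define SKETCH_X86 1
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
  #include <cpuid.h>
  #define SKETCH_X86 1
#endif

enum class SimdLevel { Scalar, SSE2, AVX, AVX2 };

// Pure decision logic, kept separate so it can be tested without real hardware.
// osSavesYmm: the OS preserves the 256-bit YMM registers across context switches.
inline SimdLevel pickLevel(bool sse2, bool avx, bool avx2, bool osSavesYmm) {
    if (avx && avx2 && osSavesYmm) return SimdLevel::AVX2;
    if (avx && osSavesYmm) return SimdLevel::AVX;
    if (sse2) return SimdLevel::SSE2;
    return SimdLevel::Scalar;
}

// Queries the CPU. On non-x86 targets this sketch simply reports Scalar.
inline SimdLevel detectLevel() {
#if defined(SKETCH_X86)
    std::uint32_t leaf1Ecx = 0, leaf1Edx = 0, leaf7Ebx = 0;
  #if defined(_MSC_VER)
    int r[4];
    __cpuid(r, 0);
    const int maxLeaf = r[0];
    __cpuid(r, 1);
    leaf1Ecx = static_cast<std::uint32_t>(r[2]);
    leaf1Edx = static_cast<std::uint32_t>(r[3]);
    if (maxLeaf >= 7) { __cpuidex(r, 7, 0); leaf7Ebx = static_cast<std::uint32_t>(r[1]); }
  #else
    unsigned a = 0, b = 0, c = 0, d = 0;
    __cpuid(0, a, b, c, d);
    const unsigned maxLeaf = a;
    __cpuid(1, a, b, c, d);
    leaf1Ecx = c;
    leaf1Edx = d;
    if (maxLeaf >= 7) { __cpuid_count(7, 0, a, b, c, d); leaf7Ebx = b; }
  #endif
    const bool sse2    = (leaf1Edx >> 26) & 1u;  // CPUID.1:EDX bit 26
    const bool osxsave = (leaf1Ecx >> 27) & 1u;  // CPUID.1:ECX bit 27
    const bool avx     = (leaf1Ecx >> 28) & 1u;  // CPUID.1:ECX bit 28
    const bool avx2    = (leaf7Ebx >> 5) & 1u;   // CPUID.7.0:EBX bit 5

    bool osSavesYmm = false;
    if (osxsave) {  // XGETBV is only legal when OSXSAVE is set
  #if defined(_MSC_VER)
        const unsigned long long xcr0 = _xgetbv(0);
  #else
        std::uint32_t lo = 0, hi = 0;
        __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
        const unsigned long long xcr0 = (static_cast<unsigned long long>(hi) << 32) | lo;
  #endif
        osSavesYmm = (xcr0 & 0x6u) == 0x6u;  // XMM and YMM state both enabled
    }
    return pickLevel(sse2, avx, avx2, osSavesYmm);
#else
    return SimdLevel::Scalar;
#endif
}

// Hypothetical file names: one DLL per /arch setting.
inline const char* libraryNameFor(SimdLevel level) {
    switch (level) {
        case SimdLevel::AVX2: return "noise_avx2.dll";
        case SimdLevel::AVX:  return "noise_avx.dll";
        case SimdLevel::SSE2: return "noise_sse2.dll";
        default:              return nullptr;  // no SIMD build applies
    }
}
```

A loader would call libraryNameFor(detectLevel()) once at startup and load that file. The important property is that this detection code itself lives in a binary built without /arch:AVX or /arch:AVX2, so it can run on any supported CPU.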
See this blog post as well as this one
Note that the main difference between /arch:AVX and /arch:AVX2 is that the compiler will sometimes emit FMA3 instructions where the scheduler thinks it would be faster than a multiply followed by an add.
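A side effect worth knowing about, which the answer does not discuss, is that a fused multiply-add rounds once while a separate multiply and add round twice, so the two can give different results in the last bits. The sketch below illustrates this with std::fma from the standard library rather than with compiler-generated FMA3 code; the helper names mulThenAdd, fusedMulAdd and fmaDemo are made up for this example.

```cpp
#include <cmath>

// Multiply, round the product to double, then add (two roundings).
// The volatile store keeps the compiler from fusing the two operations.
inline double mulThenAdd(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

// Fused multiply-add: a * b + c computed with a single rounding at the end.
inline double fusedMulAdd(double a, double b, double c) {
    return std::fma(a, b, c);
}

struct FmaDemo {
    double separate;  // result of mulThenAdd
    double fused;     // result of fusedMulAdd
};

// Inputs chosen so that the exact product, 1 - 2^-54, is not representable
// as a double and rounds to 1.0 when stored on its own.
inline FmaDemo fmaDemo() {
    const double eps = std::ldexp(1.0, -27);  // 2^-27
    const double a = 1.0 + eps;
    const double b = 1.0 - eps;
    const double c = -1.0;
    return { mulThenAdd(a, b, c), fusedMulAdd(a, b, c) };
}
```

Here the separate version yields exactly 0, while the fused version keeps the small term and yields -2^-54. Code that must produce bit-identical output across the SSE2, AVX and AVX2 builds may need to take this into account.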
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported