c++ - g++ -O2 incorrectly optimizes out SIMD variable assignment
I am writing a program using Intel AVX2 instructions. I found a bug in my program that appears only at optimization level -O2 or higher (with -O1 it is fine). After detailed debugging, I narrowed down the buggy region. The bug now seems to be caused by the compiler, because it incorrectly optimizes out a simple copy assignment of a __m256i variable.

Consider the code snippet below. Foo is a templated function. I test with CMP = kLess, OPT = kSet. I know the optimizer will probably optimize out the switches. The variable y may even be optimized out as well.

The buggy line is y = m_lt;. When compiled with -O2, this line seems to be ignored. Then y does not get the right value and the program generates wrong results. However, the program is correct with -O1.

To verify my judgment, I replaced y = m_lt; with two alternatives:

y = avx_or(m_lt, avx_zero()); does a bitwise OR of m_lt and an all-0s vector.
y = _mm256_load_si256(&m_lt); uses the SIMD load instruction to load data from the address of m_lt.

Both should do the same thing as y = m_lt;. The intention is to prevent some optimization by adding function calls. The program works well with both of these replacements under all optimization levels. So the problem is weird. To the best of my knowledge, direct assignment of SIMD variables is definitely fine (I have used it a lot before). Could this be a problem related to the compiler?
    typedef __m256i AvxUnit;

    template <Comparator CMP, Bitwise OPT>
    void Foo() {
        AvxUnit m_lt;
        // ...

        assert(!avx_iszero(m_lt));  // always passes

        AvxUnit y;

        switch (CMP) {
            case Comparator::kEqual:
                y = m_eq;
                break;
            case Comparator::kInequal:
                y = avx_not(m_eq);
                break;
            case Comparator::kLess:
                y = m_lt;  // ********** bug? **********
                // y = avx_or(m_lt, avx_zero());   // Replacing with this line works.
                // y = _mm256_load_si256(&m_lt);   // Replacing with this line works too.
                break;
            case Comparator::kGreater:
                y = m_gt;
                break;
            case Comparator::kLessEqual:
                y = avx_or(m_lt, m_eq);
                break;
            case Comparator::kGreaterEqual:
                y = avx_or(m_gt, m_eq);
                break;
        }

        switch (OPT) {
            case Bitwise::kSet:
                break;
            case Bitwise::kAnd:
                y = avx_and(y, bvblock->GetAvxUnit(bv_word_id));
                break;
            case Bitwise::kOr:
                y = avx_or(y, bvblock->GetAvxUnit(bv_word_id));
                break;
        }

        assert(!avx_iszero(y));  // passes with -O1, fails with -O2 or higher

        bvblock->SetAvxUnit(y, bv_word_id);
        // ...
    }
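The claim that the two replacements are equivalent to the plain assignment can be checked in isolation. The following is only a minimal sketch, assuming GCC or Clang on x86: it uses the raw intrinsics `_mm256_or_si256` and `_mm256_setzero_si256` as stand-ins for the question's `avx_or` and `avx_zero` wrappers (whose definitions are not shown), and the helper names `CopiesAgree` and `CopiesAgreeIfSupported` are hypothetical. It does not reproduce the reported bug; it only confirms that the three forms produce bit-identical vectors.

```cpp
#include <immintrin.h>
#include <cassert>
#include <cstring>

// Hypothetical helper (not from the post): copies m_lt in the three ways
// discussed in the question and reports whether all copies are bit-identical.
// The target attribute (a GCC/Clang extension) allows the AVX2 intrinsics to
// compile even when -mavx2 is not passed on the command line.
__attribute__((target("avx2")))
bool CopiesAgree() {
    __m256i m_lt = _mm256_set1_epi32(0x5a5a5a5a);  // arbitrary non-zero stand-in value

    __m256i y_assign = m_lt;                                           // plain copy assignment
    __m256i y_or     = _mm256_or_si256(m_lt, _mm256_setzero_si256());  // OR with an all-0s vector
    __m256i y_load   = _mm256_load_si256(&m_lt);                       // aligned SIMD load

    return std::memcmp(&y_assign, &y_or, sizeof(__m256i)) == 0 &&
           std::memcmp(&y_assign, &y_load, sizeof(__m256i)) == 0;
}

// Runs the comparison only on CPUs that report AVX2 support, so the
// program does not hit an illegal instruction on older hardware.
bool CopiesAgreeIfSupported() {
    return !__builtin_cpu_supports("avx2") || CopiesAgree();
}
```

Calling `assert(CopiesAgreeIfSupported());` at -O1 and again at -O2 is one way to see whether a given compiler treats the three forms differently on their own, outside the surrounding template code.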
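The only reason the compiler might remove the assignment is that CMP cannot possibly be Comparator::kLess. That line of code would then be dead code.

You could try declaring m_lt as volatile. Volatile variables are treated like __asm__ volatile statements, and accesses to them cannot be optimized away. Making m_lt volatile will probably not affect your performance, but it is a dirty hack to fix it. I would look more closely at the CMP variable and see whether it can ever take the value Comparator::kLess.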
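The dead-code argument can be illustrated without any SIMD at all. This is a simplified sketch, not the asker's code: `Select` is a hypothetical stand-in for Foo that uses plain `int` values and a reduced `Comparator` enum, and `IsLess` is a hypothetical diagnostic. Because CMP is a template parameter, each instantiation needs to keep only the one matching case, so the `y = m_lt` branch matters only in the instantiation where CMP really is `Comparator::kLess`.

```cpp
#include <cassert>

// Reduced stand-in for the question's enum (assumed, not the original definition).
enum class Comparator { kEqual, kLess, kGreater };

// Hypothetical stand-in for Foo: selects one precomputed value by template parameter.
// CMP is known at compile time, so the optimizer may drop every non-matching case.
template <Comparator CMP>
int Select(int m_eq, int m_lt, int m_gt) {
    int y = 0;
    switch (CMP) {
        case Comparator::kEqual:   y = m_eq; break;
        case Comparator::kLess:    y = m_lt; break;  // live only when CMP == kLess
        case Comparator::kGreater: y = m_gt; break;
    }
    return y;
}

// Hypothetical diagnostic: reports whether an instantiation was built for kLess.
template <Comparator CMP>
constexpr bool IsLess() {
    return CMP == Comparator::kLess;
}
```

Placing a check such as `static_assert(IsLess<CMP>(), "...")` inside the suspect instantiation, or simply printing the value of CMP at run time, is one way to confirm that the code path under test is really the kLess one before blaming the optimizer.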