Why Does -march=native Cause Crashes on My Zen2 Chip?

  • Thread starter: Vanadium 50
AI Thread Summary
The discussion concerns the behavior of GCC's -march and -mtune options on a Zen2 chip. -mtune appears to have little effect on performance when compiling with aggressive optimization flags such as -O3 and -Ofast, which suggests the code may already be about as optimized as it will get. The default -march works fine, and -march=znver1 also works but is no faster. -march=native and -march=znver2, which should be equivalent on this CPU, both hang at pthread_create. The user notes that -march=native should never produce code the CPU cannot handle. The investigation began because the slowest block of code contains multiply-and-add operations that might benefit from FMA instructions, which prompted the experiment with compiler switches.
Vanadium 50
I don't understand the march/mtune behavior.

This is Linux GCC, on a Zen2 chip.

-mtune seems to do nothing much. OK, sometimes your code is as tuned as it's going to get out of the box. I am compiling with -O3 and -Ofast, so maybe it is so optimized there is little to tune.

Default march works fine. -march=znver1 works fine, but again, no faster. OK, again, maybe there's little to be done. -march=native and -march=znver2 should do the same thing. I guess they do. They both hang at pthread_create.

This is not really a problem - I don't really need to squeeze the last bit of performance out of the code - but it sure seems mysterious.

gcc -v gives

Code:
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-host-pie --enable-host-bind-now --enable-languages=c,c++,fortran,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugs.almalinux.org/ --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --enable-plugin --enable-initfini-array --without-isl --enable-multilib --with-linker-hash-style=gnu --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_64=x86-64-v2 --with-arch_32=x86-64 --build=x86_64-redhat-linux --with-build-config=bootstrap-lto --enable-link-serialization=1
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.4.1 20231218 (Red Hat 11.4.1-3) (GCC)
 
Vanadium 50 said:
-mtune seems to do nothing much. OK, sometimes your code is as tuned as it's going to get out of the box. I am compiling with -O3 and -Ofast, so maybe it is so optimized there is little to tune.
If you leave out those optimization options, does -mtune seem to do more?
 
I have not played with that. I am more interested in my -march crashes. The setting -march=native should never generate code that the CPU cannot handle.
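One way to test that expectation is to compare what the compiler was told it may use with what the running CPU actually reports. The following is only a minimal sketch, assuming GCC (or a compatible compiler) on x86; the function names are hypothetical and not from the thread. It relies on the `__FMA__` predefined macro, which GCC sets when FMA code generation is enabled, and on the `__builtin_cpu_supports` builtin.

```c
#include <stdio.h>

/* Hypothetical helper: 1 if this translation unit was compiled with FMA
   code generation enabled (e.g. via -mfma or a -march that implies it). */
int compiled_with_fma(void) {
#ifdef __FMA__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: 1 if the running CPU reports FMA support.
   On non-x86 targets the builtin is unavailable, so report 0. */
int cpu_has_fma(void) {
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    return __builtin_cpu_supports("fma") ? 1 : 0;
#else
    return 0;
#endif
}

/* Hypothetical helper: same check for AVX2. */
int cpu_has_avx2(void) {
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") ? 1 : 0;
#else
    return 0;
#endif
}

/* 1 if the binary was built to use FMA but the CPU does not report it. */
int isa_mismatch(void) {
    return compiled_with_fma() && !cpu_has_fma();
}

/* Print a short summary to the given stream. */
void report_features(FILE *out) {
    fprintf(out, "compiled with FMA: %d\n", compiled_with_fma());
    fprintf(out, "CPU reports FMA:   %d\n", cpu_has_fma());
    fprintf(out, "CPU reports AVX2:  %d\n", cpu_has_avx2());
    fprintf(out, "ISA mismatch:      %d\n", isa_mismatch());
}
```

Calling `report_features(stdout)` from a small test program built with the same flags as the real code would show whether the two agree. If they do, the hang is probably not a simple case of an unsupported instruction, though this check covers only two features and cannot rule out other causes.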

The way I got into this rabbit hole was noticing that the block of code that takes the longest has some multiply-and-adds. This was to see if the compiler could speed it up by using FMA instructions. What could be simpler than throwing a compiler switch?
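For illustration, the kind of loop in question might look like the sketch below. This is not the poster's code; `dot_product` is a hypothetical stand-in for "a block with some multiply-and-adds". With FMA enabled (for example through -march=znver2) and GCC's default floating-point contraction setting, the compiler would be expected to fuse the multiply and the add into a single FMA instruction, which can be checked by inspecting the assembly output from `gcc -S` for `vfmadd` mnemonics.

```c
#include <stddef.h>

/* Hypothetical example of a multiply-and-add loop.
   Each iteration computes acc + a[i] * b[i], which is the pattern a
   fused multiply-add instruction can perform in one step when the
   compiler is allowed to use FMA. */
double dot_product(const double *a, const double *b, size_t n) {
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i) {
        acc += a[i] * b[i];
    }
    return acc;
}
```

Because a fused multiply-add rounds once instead of twice, results can differ in the last bits between builds with and without FMA. For small integer-valued inputs such as {1, 2, 3} and {4, 5, 6} the result is exact either way.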
 