Trevor Gross 8995ac0448 Use runtime feature detection for fma routines on x86
Get performance closer to the glibc implementations by adding assembly
fma routines, with runtime feature detection so they are used even if
not compiled with `+fma` (as the distributed standard library is often
not). Glibc uses ifuncs, this implementation stores a function pointer
in an atomic.

Results of CPU flags are also cached in order to avoid repeating the
startup time in calls to different functions. The feature detection code
is a slightly simplified version of `std-detect`.

Musl sources were used as a reference [1].

Fixes: https://github.com/rust-lang/rust/issues/140452 once synced

[1]: c47ad25ea3/src/math/x32/fma.c
2025-05-03 14:17:49 -04:00
..