The `rustc` probe done in our build scripts needs to pass `--target` to
get the correct configuration, which usually comes from the `TARGET`
environment variable. However, for targets specified via a `target.json`
file, `TARGET` gets set to the file name without an extension or path.
`rustc` will check a search path to attempt to locate the file, but this
is likely to fail since the directory where Cargo invokes build scripts
(and hence where those scripts invoke `rustc`) might not have any
relation to the JSON spec file.
Resolve this for now by leaving `f16` and `f128` disabled if the `rustc`
command fails. Result of the discussion at CARGO-14208 may eventually
provide a better solution.
A CI test is also added since custom JSON files are an edge case that
could fail in other ways. I verified this fails without the fix here.
The JSON file is the output for `thumbv7em-none-eabi`, just renamed so
`rustc` doesn't identify it.
Currently we whether or not to build and test `f16` and `f128` support
mostly based on the target triple. This isn't always accurate, however,
since support also varies by backend and the backend version.
Since recently, `rustc` is aware of this with the unstable config option
`target_has_reliable_{f16,f128}`, which better represents when the types
are actually expected to be available and usable. Switch our
compiler-builtins and libm configuration to use this by probing `rustc`
for the target's settings.
A few small `cfg` fixes are needed with this.
`i256` and `u256`
- operators now use the same overflow convention as primitives
- implement `<<` and `-` (previously just `>>` and `+`)
- implement `Ord` correctly (the previous `PartialOrd` was broken)
- correct `i256::SIGNED` to `true`
The `Int`-trait is extended with `trailing_zeros`, `carrying_add`, and
`borrowing_sub`.
Use a consistent ordering for top-level manifest keys, and remove those
that are now redundant (`homapage` isn't supposed to be the same as
`repository`, and `documentation` automatically points to docs.rs now).
After adding tests, the current implementation for fminimum fails when
provided a negative zero and NaN as inputs:
---- math::fminimum_fmaximum_num::tests::fmaximum_num_spec_tests_f64 stdout ----
thread 'math::fminimum_fmaximum_num::tests::fmaximum_num_spec_tests_f64' panicked at libm/src/math/fminimum_fmaximum_num.rs:240:13:
fmaximum_num(-0x0p+0, NaN)
l: NaN (0x7ff8000000000000)
r: -0.0 (0x8000000000000000)
---- math::fminimum_fmaximum_num::tests::fmaximum_num_spec_tests_f32 stdout ----
thread 'math::fminimum_fmaximum_num::tests::fmaximum_num_spec_tests_f32' panicked at libm/src/math/fminimum_fmaximum_num.rs:240:13:
fmaximum_num(-0x0p+0, NaN)
l: NaN (0x7fc00000)
r: -0.0 (0x80000000)
Add more thorough spec tests for these functions and correct the
implementations.
Canonicalization is also moved to a trait method to centralize
documentation about what it does and doesn't do.
`binop_common` emits a `SKIP` that is intended to apply only to
`copysign`, but is instead applying to all binary operators. Correct the
general case but leave the currently-failing `maximum_num` tests as a
FIXME, to be resolved separately in [1].
Also simplify skip logic and NaN checking, and add a few more `copysign`
checks.
[1]: https://github.com/rust-lang/compiler-builtins/pull/939
As seen at [1], LLVM uses `long long` on LLP64 (to get a 64-bit integer
matching pointer size) and `long` on everything else, with exceptions
for AArch64 and AVR. Our current logic always uses an `i32`. This
happens to work because LLVM uses 32-bit instructions to check the
output on x86-64, but the GCC checks the full 64-bit register so garbage
in the upper half leads to incorrect results.
Update our return type to be `isize`, with exceptions for AArch64 and
AVR.
Fixes: https://github.com/rust-lang/compiler-builtins/issues/919
[1]: 0cf3c437c1/compiler-rt/lib/builtins/fp_compare_impl.inc (L11-L27)
These were deleted during refactoring in 0a2dc5d9 ("Combine the source
files for more generic implementations") but got added back by accident
in 54bac411 ("refactor: Move the libm crate to a subdirectory"). Remove
them again here.
The `feature_detect` module is currently being built on all targets, but
the use of `AtomicU32` causes a problem if atomics are not available
(such as with `bpfel-unknown-none`). Gate this module behind
`target_has_atomic = "ptr"`.
The below now completes successfully:
cargo build -p compiler_builtins --target=bpfel-unknown-none -Z build-std=core
Fixes: https://github.com/rust-lang/compiler-builtins/issues/908
Get performance closer to the glibc implementations by adding assembly
fma routines, with runtime feature detection so they are used even if
not compiled with `+fma` (as the distributed standard library is often
not). Glibc uses ifuncs, this implementation stores a function pointer
in an atomic.
Results of CPU flags are also cached in order to avoid repeating the
startup time in calls to different functions. The feature detection code
is a slightly simplified version of `std-detect`.
Musl sources were used as a reference [1].
Fixes: https://github.com/rust-lang/rust/issues/140452 once synced
[1]: c47ad25ea3/src/math/x32/fma.c
These appeared in a later nightly. In compiler-builtins we can apply the
suggestion, but in `libm` we need to ignore them since `fx::from_bits`
is not `const` at the MSRV.
`clippy::uninlined_format_args` also seems to have gotten stricter, so
fix those here.
Use the 2024 style edition for all crates and enable import sorting.
2024 already applies some smaller heuristics that look good in
compiler-builtins, I have dropped `use_small_heuristics` that was set in
`libm` because it seems to negatively affect the readibility of anything
working with numbers (e.g. collapsing multiple small `if` expressions
into a single line).
Distribute everything from `libm/` to better locations in the repo.
`libm/libm/*` has not moved yet to avoid Git seeing the move as an edit
to `Cargo.toml`.
Files that remain to be merged somehow are in `etc/libm`.
Unfortunately this means we lose use of the convenient name `gen`, so
this includes a handful of renaming.
We can't increase the edition for `libm` yet due to MSRV, but we can
enable `unsafe_op_in_unsafe_fn` to help make that change smoother in the
future.
Move the workspace configuration to a virtual manifest. This
reorganization makes a more clear separation between package contents
and support files that don't get distributed. It will also make it
easier to merge this repository with `compiler-builtins` which is
planned (builtins had a similar update done in [1]).
LICENSE.txt and README.md are symlinkedinto the new directory to ensure
they get included in the package.
[1]: https://github.com/rust-lang/compiler-builtins/pull/702
In preparation for switching to a virtual manifest, move the `libm`
crate into a subdirectory and update paths to match.
Updating `Cargo.toml` is done in the next commit so git tracks the moved
file correctly.
Benchmarks for [1] seemed to indicate that repository organization for
some reason had an effect on performance, even though the exact same
rustc commands were running (though some with a different order). After
investigating more, it appears that dependencies may have an affect on
inlining thresholds for generic functions.
It is surprising that this happens, we more or less expect that public
functions will be standalone but everything they call will be inlined.
To help ensure this, mark all generic functions `#[inline]` if they
should be merged into the public function.
Zulip discussion at [2].
[1]: https://github.com/rust-lang/libm/pull/533
[2]: https://rust-lang.zulipchat.com/#narrow/channel/182449-t-compiler.2Fhelp/topic/Dependencies.20affecting.20codegen/with/513079387
Since `fmod` is generic, there isn't any need to have the small wrappers
in separate files. Most operations was done in [1] but `fmod` was
omitted until now.
[1]: https://github.com/rust-lang/libm/pull/537
The reorganization PR has caused this to fail once before because every
file shows up as changed. Increase the timeout so this doesn't happen.
We now cancel the job if too many extensive tests are run unless `ci:
allow-many-extensive` is in the PR description, so this helps prevent
the limit being hit by accident.
Error out when too many extensive tests would be run unless `ci:
allow-many-extensive` is in the PR description. This allows us to set a
much higher CI timeout with less risk that a 4+ hour job gets started by
accident.
Sometimes we do refactoring that moves things around and triggers an
extensive test, even though the implementation didn't change. There
isn't any need to run full extensive CI in these cases, so add a way to
skip it from the PR message.
Jobs should just cancel automatically, it isn't ideal that extensive
jobs can continue running for multiple hours after code has been
updated. Use a solution from [1] to do this.
[1]: https://stackoverflow.com/a/72408109/5380651
Splitting into different source files by float size doesn't have any
benefit when the only content is a small function that forwards to the
generic implementation. Combine the source files for all width versions
of:
* ceil
* copysign
* fabs
* fdim
* floor
* fmaximum
* fmaximum_num
* fminimum
* fminimum_num
* ldexp
* scalbn
* sqrt
* truc
fmod is excluded to avoid conflicts with an open PR.
As part of this change move unit tests out of the generic module,
instead testing the type-specific functions (e.g. `ceilf16` rather than
`ceil::<f16>()`). This ensures that unit tests are validating whatever
we expose, such as arch-specific implementations via
`select_implementation!`, which would otherwise be skipped. (They are
still covered by integration tests).