25 Commits

Author SHA1 Message Date
Trevor Gross
71200bc3ce Add fmodf128
This function is significantly slower than all the others, so it includes an
override in `EXTREMELY_SLOW_TESTS`. Without it, PR CI takes ~1 hour and
the extensive tests in CI take ~1 day.
2025-01-24 08:23:15 +00:00
Trevor Gross
67218cbaa5 Add fmodf16 using the generic implementation 2025-01-24 06:03:59 +00:00
Trevor Gross
08eda86de2 Add a generic version of fmod
This can replace `fmod` and `fmodf`. As part of this change I was able
to replace some of the `while` loops with `leading_zeros`.
2025-01-24 05:55:15 +00:00
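As an illustration of the loop replacement mentioned in 08eda86de2, here is a minimal sketch of normalizing a nonzero significand. The function names are hypothetical and not taken from the crate. The sketch only shows that a bit-by-bit `while` loop and `leading_zeros` produce the same shift count; it is not the actual `fmod` code.

```rust
/// Count the left shift needed to move the highest set bit of `sig` to bit 63,
/// one step at a time. `sig` must be nonzero, otherwise the loop never ends.
/// (Hypothetical helper, for illustration only.)
fn normalize_shift_loop(sig: u64) -> u32 {
    let mut i = sig;
    let mut shift = 0;
    while i >> 63 == 0 {
        i <<= 1;
        shift += 1;
    }
    shift
}

/// Same result for nonzero input, computed without a loop.
/// (Hypothetical helper, for illustration only.)
fn normalize_shift_clz(sig: u64) -> u32 {
    sig.leading_zeros()
}
```

For any nonzero `sig`, both helpers return the same value. The `leading_zeros` form usually compiles to a single instruction on targets that have one.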
Trevor Gross
6d5105c006 Add fminf16, fmaxf16, fminf128, and fmaxf128 2025-01-24 03:01:36 +00:00
Trevor Gross
f94df5d524 Add a generic version of fmin and fmax
These can be used for `fmin`, `fminf`, `fmax`, and `fmaxf`. No changes
to the implementation are made, so [1] is not fixed.

[1]: https://github.com/rust-lang/libm/issues/439
2025-01-24 02:58:00 +00:00
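A rough sketch of how one generic routine can back `fmin`, `fminf`, `fmax`, and `fmaxf`, as described in f94df5d524. The `Float` trait here is a hypothetical stand-in, not the crate's actual trait, and the sketch deliberately ignores canonicalization and signed-zero ordering.

```rust
/// Hypothetical minimal float trait, assumed for this sketch only.
trait Float: Copy + PartialOrd {
    fn is_nan(self) -> bool;
}

impl Float for f32 {
    // NaN is the only value that compares unequal to itself.
    fn is_nan(self) -> bool { self != self }
}

impl Float for f64 {
    fn is_nan(self) -> bool { self != self }
}

/// Return the smaller argument; if exactly one argument is NaN, return the other.
fn fmin<F: Float>(x: F, y: F) -> F {
    if y.is_nan() || x < y { x } else { y }
}

/// Return the larger argument; if exactly one argument is NaN, return the other.
fn fmax<F: Float>(x: F, y: F) -> F {
    if x.is_nan() || x < y { y } else { x }
}
```

With this shape, each public function can be a thin wrapper that calls the generic routine with its own float type.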
Trevor Gross
d20a5e82a5 Add roundf16 and roundf128 2025-01-24 01:59:10 +00:00
Trevor Gross
cdbe65b503 Add a generic version of round
This replaces `round` and `roundf`.
2025-01-24 01:49:23 +00:00
Trevor Gross
3aa2d1cfc2 Add a generic version of scalbn
This replaces the `f32` and `f64` versions of `scalbn` and `ldexp`.
2025-01-23 16:59:44 -06:00
Trevor Gross
b22398d658 Add rintf16 and rintf128
Use the generic algorithms to provide implementations for these
routines.
2025-01-22 11:04:39 +00:00
Trevor Gross
bf2d171b1a Add a generic version of rint
Use this to implement `rint` and `rintf`.
2025-01-22 11:04:36 +00:00
Trevor Gross
6a8bb0fa80 Add floorf16 and floorf128
Use the generic algorithms to provide implementations for these
routines.
2025-01-22 08:50:06 +00:00
Trevor Gross
42fce292ab Add a generic version of floor
Additionally, make use of this version to implement `floor` and
`floorf`.

Similar to `ceil`, musl's `ceilf` routine seems to work better for all
float widths than the `ceil` algorithm. Trying with the `ceil` (`f64`)
algorithm produced the following regressions:

    icount::icount_bench_floor_group::icount_bench_floor logspace:setup_floor()
    Performance has regressed: Instructions (14064 > 13171) regressed by +6.78005% (>+5.00000)
      Baselines:                      softfloat|softfloat
      Instructions:                       14064|13171                (+6.78005%) [+1.06780x]
      L1 Hits:                            16821|15802                (+6.44855%) [+1.06449x]
      L2 Hits:                                0|0                    (No change)
      RAM Hits:                               8|9                    (-11.1111%) [-1.12500x]
      Total read+write:                   16829|15811                (+6.43856%) [+1.06439x]
      Estimated Cycles:                   17101|16117                (+6.10535%) [+1.06105x]
    icount::icount_bench_floorf128_group::icount_bench_floorf128 logspace:setup_floorf128()
      Baselines:                      softfloat|softfloat
      Instructions:                      166868|N/A                  (*********)
      L1 Hits:                           221429|N/A                  (*********)
      L2 Hits:                                1|N/A                  (*********)
      RAM Hits:                              34|N/A                  (*********)
      Total read+write:                  221464|N/A                  (*********)
      Estimated Cycles:                  222624|N/A                  (*********)
    icount::icount_bench_floorf16_group::icount_bench_floorf16 logspace:setup_floorf16()
      Baselines:                      softfloat|softfloat
      Instructions:                      143029|N/A                  (*********)
      L1 Hits:                           176517|N/A                  (*********)
      L2 Hits:                                1|N/A                  (*********)
      RAM Hits:                              13|N/A                  (*********)
      Total read+write:                  176531|N/A                  (*********)
      Estimated Cycles:                  176977|N/A                  (*********)
    icount::icount_bench_floorf_group::icount_bench_floorf logspace:setup_floorf()
    Performance has regressed: Instructions (14732 > 10441) regressed by +41.0976% (>+5.00000)
      Baselines:                      softfloat|softfloat
      Instructions:                       14732|10441                (+41.0976%) [+1.41098x]
      L1 Hits:                            17616|13027                (+35.2268%) [+1.35227x]
      L2 Hits:                                0|0                    (No change)
      RAM Hits:                               8|6                    (+33.3333%) [+1.33333x]
      Total read+write:                   17624|13033                (+35.2260%) [+1.35226x]
      Estimated Cycles:                   17896|13237                (+35.1968%) [+1.35197x]
2025-01-22 08:48:11 +00:00
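For context on 42fce292ab, the following is a sketch of a bit-masking floor in the style of musl's single-precision routine, specialized to `f32`. It is an assumption about the general approach, not the generic code from the commit. The name `floor_f32` is hypothetical, and the floating-point exception signalling done by the C original is omitted.

```rust
/// Round toward negative infinity by masking off fractional significand bits.
/// (Hypothetical f32-only sketch; does not raise floating-point exceptions.)
fn floor_f32(x: f32) -> f32 {
    let mut i = x.to_bits();
    // Unbiased exponent.
    let e = ((i >> 23) & 0xff) as i32 - 0x7f;

    if e >= 23 {
        // No fractional bits left, or the value is infinite or NaN.
        return x;
    }

    if e >= 0 {
        // Mask covering the fractional bits of the significand.
        let m = 0x007f_ffffu32 >> e;
        if i & m == 0 {
            return x; // already integral
        }
        if i >> 31 != 0 {
            // Negative: bump the magnitude so truncation rounds down.
            i += m;
        }
        i &= !m;
        f32::from_bits(i)
    } else if i >> 31 == 0 {
        0.0 // 0 <= x < 1
    } else if i << 1 != 0 {
        -1.0 // -1 < x < 0
    } else {
        x // negative zero stays negative zero
    }
}
```

For example, `floor_f32(2.5)` gives `2.0` and `floor_f32(-2.5)` gives `-3.0`.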
Trevor Gross
9064c42abe Add ceilf16 and ceilf128
Use the generic algorithms to provide implementations for these
routines.
2025-01-22 07:22:32 +00:00
Trevor Gross
186eac9227 Add sqrtf16 and sqrtf128
Use the generic algorithms to provide implementations for these
routines.
2025-01-22 05:31:13 +00:00
Trevor Gross
573ded2ee8 Port the most recent version of Musl's sqrt as a generic algorithm
Musl commit 97e9b73d59 ("math: new software sqrt") adds a new algorithm
using Goldschmidt division. Port this algorithm to Rust and make it
generic, which shows a notable performance improvement over the existing
algorithm.

This also allows adding square root routines for `f16` and `f128`.
2025-01-22 05:31:13 +00:00
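To give a feel for the Goldschmidt approach named in 573ded2ee8, here is a small floating-point sketch of the coupled Goldschmidt iteration for square roots. This is an illustration under stated assumptions only: the function name and parameters are hypothetical, the caller supplies the initial estimate of `1/sqrt(x)`, and the sketch works in `f64` arithmetic rather than reproducing the ported algorithm.

```rust
/// Approximate sqrt(x) from an initial estimate `y0` of 1/sqrt(x).
/// (Hypothetical sketch; `x` must be positive and `y0` reasonably close.)
fn goldschmidt_sqrt(x: f64, y0: f64, iters: u32) -> f64 {
    let mut g = x * y0; // converges to sqrt(x)
    let mut h = 0.5 * y0; // converges to 1 / (2 * sqrt(x))
    for _ in 0..iters {
        // Residual: zero exactly when g * h == 1/2.
        let r = 0.5 - g * h;
        g += g * r;
        h += h * r;
    }
    g
}
```

Each step roughly squares the residual, so a handful of iterations from a decent estimate reaches `f64` precision. For instance, `goldschmidt_sqrt(2.0, 0.7, 5)` agrees with `2.0_f64.sqrt()` to within `1e-12`.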
Trevor Gross
13b5bf3959 Add fdimf16 and fdimf128
Use the generic algorithms to provide implementations for these
routines.
2025-01-13 14:04:54 +00:00
Trevor Gross
0f285df716 Add a generic version of fdim 2025-01-13 13:49:46 +00:00
Trevor Gross
b558b365d3 Add truncf16 and truncf128
Use the generic algorithms to provide implementations for these
routines.
2025-01-13 10:12:09 +00:00
Hanna Kruppe
87cc064e35 Introduce arch::aarch64 and use it for rint{,f} 2025-01-12 11:26:09 +01:00
Hanna Kruppe
7defd9b429 Use wasm32 arch intrinsics for rint{,f} 2025-01-12 11:25:51 +01:00
Trevor Gross
f3ad123a09 Replace "intrinsic" config with "arch" config
WASM is the only architecture we use `intrinsics::` for. We probably
don't want to do this for any other architectures since it is better to
use assembly, or work toward getting the functions available in `core`.

To more accurately reflect the relationship between arch and intrinsics,
make wasm32 an `arch` module and call the intrinsics from there.
2025-01-06 20:17:01 -05:00
Trevor Gross
6b5e8b20f0 Add test infrastructure for f16 and f128
Update test traits to support `f16` and `f128`, as applicable. Add the
new routines (`fabs` and `copysign` for `f16` and `f128`) to the list of
all operations.
2025-01-06 04:10:51 -05:00
Trevor Gross
aabb7d9dcc Enable f16 and f128 when creating the API change list
Additionally, read glob output as absolute paths. This enables the
script to work properly even when invoked from a different directory.
2025-01-06 04:10:51 -05:00
Trevor Gross
7c04b1916a Add more detailed definition output for update-api-list.py
Update the script to produce, in addition to the simple text list, a
JSON file listing routine names, the types they work with, and the
source files that contain a function with the routine name. This gets
consumed by another script and will be used to determine which extensive
CI jobs to run.
2025-01-06 02:25:59 -05:00
Trevor Gross
ed72c4ec69 Use rustdoc output to create a list of public API
Rather than collecting a list of file names in `libm-test/build.rs`,
just use a script to parse rustdoc's JSON output.
2025-01-01 11:01:50 +00:00