2629 Commits

Author SHA1 Message Date
bors
ce4beebecb Auto merge of #146683 - clarfonthey:safe-intrinsics, r=RalfJung,Amanieu
Mark float intrinsics with no preconditions as safe

Note: for ease of reviewing, the list of safe intrinsics is sorted in the first commit, and then safe intrinsics are added in the second commit.

All *recently added* float intrinsics have been correctly marked as safe to call because they have no preconditions. This PR adds the remaining float intrinsics that are safe to call to the safe intrinsic list and removes the `unsafe` blocks around their calls.
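
As a rough illustration of what this change means at a call site, the sketch below contrasts a wrapper around a function declared `unsafe` with a wrapper around the same function declared safe. The names `fabs_declared_unsafe` and `fabs_declared_safe` are hypothetical stand-ins and are not the real intrinsics; the real ones live in `core::intrinsics` and are unstable.

```rust
/// Hypothetical stand-in for a float intrinsic that is declared `unsafe`
/// even though no input can cause undefined behavior.
unsafe fn fabs_declared_unsafe(x: f32) -> f32 {
    x.abs()
}

/// The same hypothetical operation once it has been marked safe.
fn fabs_declared_safe(x: f32) -> f32 {
    x.abs()
}

/// Before: the caller needs an `unsafe` block purely because of the
/// declaration, not because there is anything to uphold.
pub fn abs_before(x: f32) -> f32 {
    // SAFETY: the callee has no preconditions.
    unsafe { fabs_declared_unsafe(x) }
}

/// After: an ordinary safe call, with no `unsafe` block to audit.
pub fn abs_after(x: f32) -> f32 {
    fabs_declared_safe(x)
}
```

Both wrappers compute the same value; only the amount of `unsafe` code a reviewer has to audit differs.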

---

Side note: this may want a try run before being added to the queue, since I'm not sure if there's any tier-2 code that uses these intrinsics that might not be tested on the usual PR flow. We've already uncovered a few places in subtrees that do this, and it's worth double-checking before clogging up the queue.
2025-09-22 14:35:46 +00:00
ltdk
055e05a338 Mark float intrinsics with no preconditions as safe 2025-09-21 20:37:51 -04:00
Sayantan Chakraborty
c1242fab74
Merge pull request #1921 from a4lg/riscv-inline-asm-general-improvements
RISC-V: Improvements of inline assembly uses
2025-09-15 18:39:49 +00:00
Folkert de Vries
5dd0fdcd67
Merge pull request #1919 from sayantn/fix-vreinterpret
Remove big-endian swizzles from `vreinterpret`
2025-09-15 08:18:20 +00:00
Tsukasa OI
8df078a3f0 RISC-V: Improvements of inline assembly uses
This commit makes various improvements (better register allocation,
less register clobbering in the worst case and better readability) to
uses of RISC-V inline assembly.

Note that it does not change the `p` module (which defines the draft
"P" extension instructions and is very likely to change).

1.  Use `lateout` where possible.
    Unlike an `out(reg)` and `in(reg)` pair, a `lateout(reg)` and
    `in(reg)` pair can share the same register because `lateout` states
    that the output register is written only after all inputs are read.
    This can improve register allocation.
2.  Add the `preserves_flags` option where possible.
    While RISC-V doesn't have _regular_ condition codes, RISC-V inline
    assembly in Rust assumes by default that some registers
    (mainly vector state registers) may be overwritten.
    Adding `preserves_flags` to intrinsics whose instructions do not
    overwrite those registers minimizes register clobbering in the
    worst case.
3.  Use a trailing semicolon.
    Since `asm!` declares an action and doesn't return a value by
    itself, a trailing semicolon makes clear that an `asm!` call is
    effectively a statement.
4.  Make most `asm!` calls multi-line.
    `rustfmt` breaks some simple (yet long) `asm!` calls across lines,
    but it does not format complex `asm!` calls with inputs and/or
    outputs.  For consistency, this commit makes most `asm!` calls
    multi-line.
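
The four points above can be sketched in one small function. This is a minimal illustration and not code from the commit: `add_one` is a hypothetical helper, and the non-RISC-V fallback exists only so the sketch also builds on other targets.

```rust
/// Hypothetical helper: adds one to `x` using a single `addi` instruction.
#[cfg(any(target_arch = "riscv32", target_arch = "riscv64"))]
pub fn add_one(x: usize) -> usize {
    let out: usize;
    // SAFETY: `addi` reads only `inp` and writes only `out`.
    unsafe {
        // Multi-line layout (point 4) with a trailing semicolon (point 3).
        core::arch::asm!(
            "addi {out}, {inp}, 1",
            // `lateout` lets the allocator reuse the input register (point 1).
            out = lateout(reg) out,
            inp = in(reg) x,
            // `preserves_flags` promises that the floating-point exception
            // flags and vector state are left untouched (point 2).
            options(pure, nomem, nostack, preserves_flags),
        );
    }
    out
}

/// Portable fallback with the same wrapping behavior as `addi`.
#[cfg(not(any(target_arch = "riscv32", target_arch = "riscv64")))]
pub fn add_one(x: usize) -> usize {
    x.wrapping_add(1)
}
```

With `out(reg)` instead of `lateout(reg)`, the compiler would have to keep the input and output in different registers, even though the instruction is finished reading its input before it writes the result.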
2025-09-14 05:08:19 +00:00
Tsukasa OI
05133f2115 examples: Make Clippy happy 2025-09-12 11:51:38 +00:00
Tsukasa OI
a3b7aad20f stdarch-gen-arm: Make Clippy happy 2025-09-12 11:50:51 +00:00
Tsukasa OI
221eb1f0d5 intrinsic-test: Make Clippy happy 2025-09-12 11:50:25 +00:00
Sayantan Chakraborty
269cecc91c
Merge pull request #1918 from a4lg/riscv-aes64im-lower-requirements
RISC-V: "Lower" requirements of `aes64im`
2025-09-11 19:59:18 +00:00
sayantn
bb31725e67
Remove big-endian swizzles from vreinterpret 2025-09-12 01:20:34 +05:30
Tsukasa OI
e54cc43867 RISC-V: "Lower" requirements of aes64im
This instruction is incorrectly placed in the same category as
`aes64ks1i` and `aes64ks2` (which should require `zkne || zknd` but
currently require `zkne && zknd`), but `aes64im` only requires
the Zknd extension.

This commit fixes the category of this intrinsic (lowering the
requirements from the Rust perspective, although it does not actually
lower them from the RISC-V perspective).
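
The requirement logic described above can be written out as plain boolean predicates. This is only a model for reading the message: the `Features` struct and the function names are hypothetical and do not correspond to anything in the crate.

```rust
/// Hypothetical model of the two extensions involved.
#[derive(Clone, Copy)]
pub struct Features {
    pub zkne: bool,
    pub zknd: bool,
}

/// `aes64ks1i` / `aes64ks2`: what the message says they should require.
pub fn key_schedule_should_require(f: Features) -> bool {
    f.zkne || f.zknd
}

/// `aes64ks1i` / `aes64ks2`: what the message says they currently require.
pub fn key_schedule_currently_requires(f: Features) -> bool {
    f.zkne && f.zknd
}

/// `aes64im` before this commit: grouped with the key-schedule intrinsics.
pub fn aes64im_before(f: Features) -> bool {
    f.zkne && f.zknd
}

/// `aes64im` after this commit: only Zknd is needed.
pub fn aes64im_after(f: Features) -> bool {
    f.zknd
}
```

For a target with Zknd but without Zkne, `aes64im_before` is false while `aes64im_after` is true, which is the case this commit unblocks.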
2025-09-11 06:42:10 +00:00
WANG Rui
614dab3ed2 loongarch: Align intrinsic signatures with LLVM 2025-09-10 23:10:19 +08:00
Folkert de Vries
4b549a7330
move target-specific definitions into constants 2025-09-07 14:11:02 +02:00
Folkert de Vries
ccec202727
move build_c_file and build_rust_file into SupportedArchitectureTest 2025-09-07 14:11:02 +02:00
Folkert de Vries
1697f36225
remove trait IntrinsicDefinition 2025-09-07 14:11:02 +02:00
Folkert de Vries
2ba0a6e489
move print_result_c into the trait 2025-09-07 14:11:02 +02:00
Folkert de Vries
9d9ca01bfa
move print_result_c into the inner intrinsic type 2025-09-07 14:11:02 +02:00
Folkert de Vries
916424f38d
move more constants into SupportedArchitectureTest 2025-09-07 14:11:01 +02:00
Folkert de Vries
6ab097b245
move platform headers into SupportedArchitectureTest 2025-09-07 14:11:01 +02:00
Folkert de Vries
d70ef4f0a7
move compare_outputs implementation into SupportedArchitectureTest definition 2025-09-07 14:11:01 +02:00
Folkert de Vries
589515bc8a
update Cargo.lock 2025-09-07 14:11:01 +02:00
Folkert de Vries
93101b5783
s390x: use the new u128::funnel_shl 2025-09-06 14:32:36 +02:00
Folkert de Vries
e1a3b8bdc1
Merge pull request #1911 from nikic/remove-hack
Remove some llvm workarounds
2025-09-03 13:16:03 +00:00
Tsukasa OI
4679533732 RISC-V: Lower requirements of clmul and clmulh
They don't need the full "Zbc" extension, only its subset: the "Zbkc"
extension.  Since the compiler implies `zbkc` from `zbc`, it's safe to
use `#[target_feature(enable = "zbkc")]`.
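
For readers unfamiliar with the two instructions being gated, the following is a software model of carry-less multiplication, assuming a 64-bit register width. The function names `clmul_model` and `clmulh_model` are hypothetical; they are not the intrinsics themselves.

```rust
/// Model of `clmul`: the low 64 bits of the carry-less product.
pub fn clmul_model(rs1: u64, rs2: u64) -> u64 {
    let mut out = 0u64;
    for i in 0..64 {
        if (rs2 >> i) & 1 == 1 {
            // XOR replaces addition, so no carries propagate.
            out ^= rs1 << i;
        }
    }
    out
}

/// Model of `clmulh`: the high 64 bits of the carry-less product.
pub fn clmulh_model(rs1: u64, rs2: u64) -> u64 {
    let mut out = 0u64;
    // Bit 0 of `rs2` contributes nothing to the high half.
    for i in 1..64 {
        if (rs2 >> i) & 1 == 1 {
            out ^= rs1 >> (64 - i);
        }
    }
    out
}
```

For example, `clmul_model(0b101, 0b11)` is `0b101 ^ 0b1010`, which is `0b1111`.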
2025-09-03 02:13:35 +00:00
Folkert de Vries
bb3598e481
use qemu-user instead of qemu-user-static for loongarch CI 2025-09-02 10:52:49 +02:00
Nikita Popov
18fa6d917c Remove some llvm workarounds 2025-09-02 10:48:42 +02:00
Amanieu d'Antras
bbb222f9ea
Merge pull request #1906 from folkertdev/arm-roundeven
use `llvm.roundeven` on arm
2025-08-29 11:28:37 +00:00
Amanieu d'Antras
00b27f49eb Remove FreeBSD CI
It's historically been flaky and is no longer needed now that std_detect
has been moved out of this repository.
2025-08-29 11:24:34 +01:00
Folkert de Vries
ae648be783
use llvm.roundeven on arm 2025-08-29 12:15:41 +02:00
Amanieu d'Antras
b2189b8ff6
Merge pull request #1903 from folkertdev/s390x-llvm-21-fixes
`s390x` llvm 21 improvements
2025-08-21 20:31:06 +00:00
Folkert de Vries
98bd1d7445
use simd_saturating_{add, sub} on neon 2025-08-21 10:25:00 +02:00
Amanieu d'Antras
0b0c42478f
Merge pull request #1901 from folkertdev/wasm-read-unaligned
wasm: use `{read, write}_unaligned` methods
2025-08-20 20:44:05 +00:00
Folkert de Vries
6d74280ae4
Merge pull request #1899 from dpaoliello/arm64ec
Add testing for Arm64EC Windows
2025-08-20 20:42:51 +00:00
Folkert de Vries
45af206618
s390x: link to a missed optimization 2025-08-20 22:20:30 +02:00
Folkert de Vries
e9162f221a
s390x: implement vec_sld using fshl 2025-08-20 22:20:30 +02:00
Folkert de Vries
dfa95c6fa4
s390x: implement vec_subc_u128 using overflowing_sub 2025-08-20 22:20:29 +02:00
Folkert de Vries
e1a1b1ded2
s390x: implement vec_mulo using core::intrinsics::simd 2025-08-20 22:20:28 +02:00
Folkert de Vries
d5cb1c49fa
wasm: use {read, write}_unaligned methods 2025-08-20 22:11:32 +02:00
Folkert de Vries
1cda88aca1
s390x: implement vec_mule using core::intrinsics::simd 2025-08-20 22:11:16 +02:00
Folkert de Vries
97d64665b9
s390x: add assert_instr for vec_extend 2025-08-20 22:11:16 +02:00
Folkert de Vries
c5ec0960f0
s390x: add assert_instr for vec_round 2025-08-20 22:11:16 +02:00
Folkert de Vries
fa163a1fca
s390x: define unpack_low using core::intrinsics::simd 2025-08-20 22:11:15 +02:00
Nikita Popov
3302e3e09a Adjust immediate for vrndscalepd tests
The immediate here encodes both the rounding mode (in the low bits)
and the scale (in the high bits). Make sure the scale is non-zero.
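
A small sketch of that encoding, assuming the usual layout in which bits 7:4 hold the scale and bits 1:0 hold the rounding mode. The helper names are hypothetical, and the model ignores bits 3 and 2 (exception suppression and the MXCSR rounding select).

```rust
/// Packs a scale (0..=15) and a rounding mode (0..=3) into an immediate.
pub fn roundscale_imm(scale: u8, rounding: u8) -> u8 {
    ((scale & 0x0f) << 4) | (rounding & 0x03)
}

/// Scalar model: rounds `x` to `scale` fractional bits.
pub fn roundscale_model(x: f64, imm: u8) -> f64 {
    let scale = i32::from(imm >> 4);
    let factor = 2f64.powi(scale);
    let scaled = x * factor;
    let rounded = match imm & 0b11 {
        0 => scaled.round_ties_even(), // nearest, ties to even
        1 => scaled.floor(),           // toward negative infinity
        2 => scaled.ceil(),            // toward positive infinity
        _ => scaled.trunc(),           // toward zero
    };
    rounded / factor
}
```

With a scale of zero the operation reduces to plain rounding to an integer, so a test that wants to exercise the scaling behavior needs a non-zero value in the high bits.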
2025-08-20 11:23:46 +02:00
Nikita Popov
92f6310890 Work around selection failure without avx512vl 2025-08-20 11:23:46 +02:00
Nikita Popov
4a8b8231b1 Add missing avx512vl target features 2025-08-20 11:23:46 +02:00
Nikita Popov
135de7c8df Use intrinsics for some s390x operations 2025-08-20 11:23:30 +02:00
Nikita Popov
f9bc63d78f Drop no longer needed feature gates 2025-08-20 11:23:30 +02:00
sayantn
100d19ce5b
Stabilize sse4a and tbm target features
- remove some stabilized target features from `gate.rs`
2025-08-14 02:07:40 +05:30
Daniel Paoliello
f2c0c3dd44 Add testing for Arm64EC Windows 2025-08-10 13:19:06 -07:00
Folkert de Vries
de01bd3c72
use IntoIterator for the add_flags methods 2025-08-05 12:42:03 +02:00