783 Commits

Author SHA1 Message Date
Jacob Bramley
31e17e39c2 Add AArch64 vrnd*_f64 Neon intrinsics.
The LLVM intrinsic doesn't support float64x1_t, but the required
instruction is a scalar form (e.g. `frint32x <Dd>, <Dn>`), so we can
implement these using the scalar intrinsic.

Note that Clang does not support these intrinsics, so they aren't
covered by intrinsic-test. Additional validation is included in this
patch to ensure that we're selecting an instruction with the same
behaviour as the corresponding vector form (which all have
intrinsic-tests).
2023-06-21 18:52:21 +02:00
Jacob Bramley
0459405ea9 Add more AArch64 vrnd intrinsics.
LLVM can't select float64x1_t variants, but float64x2_t variants work.
2023-06-21 18:52:21 +02:00
Jacob Bramley
a9fecd8456 Support AArch32 Neon dotprod intrinsics.
Note that the feature detection requires a recent Linux kernel (v6.2).
2023-06-21 18:52:21 +02:00
Jacob Bramley
1e15fa3f0a Add support for AArch64 i8mm *dot intrinsics.
This includes vsudot and vusdot, which perform mixed-signedness dot
product operations.
2023-06-21 18:52:21 +02:00
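To illustrate the mixed-signedness dot product mentioned in 1e15fa3f0a, here is a minimal scalar sketch of what one 32-bit lane computes: four unsigned bytes multiplied by four signed bytes and summed into an accumulator. The function names `usdot_lane` and `usdot_model` are hypothetical and are not part of stdarch; the wrapping accumulation is an assumption of this model, not a statement about the intrinsic's documentation.

```rust
/// Scalar model of one 32-bit lane: acc + sum(u[i] * s[i]) over four byte pairs.
/// `u` is interpreted as unsigned and `s` as signed, which is the whole point
/// of the mixed-signedness form.
fn usdot_lane(acc: i32, u: [u8; 4], s: [i8; 4]) -> i32 {
    u.iter().zip(s.iter()).fold(acc, |sum, (&x, &y)| {
        // Widen both operands to i32 before multiplying so no product overflows.
        sum.wrapping_add(i32::from(x) * i32::from(y))
    })
}

/// Model of a two-lane (64-bit vector) form: each lane consumes four bytes
/// from each input.
fn usdot_model(acc: [i32; 2], u: [u8; 8], s: [i8; 8]) -> [i32; 2] {
    [
        usdot_lane(acc[0], [u[0], u[1], u[2], u[3]], [s[0], s[1], s[2], s[3]]),
        usdot_lane(acc[1], [u[4], u[5], u[6], u[7]], [s[4], s[5], s[6], s[7]]),
    ]
}
```

The byte 0xFF contributes 255 when it sits in the unsigned operand and -1 when it sits in the signed one, so swapping the operands generally changes the result.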
Kisaragi Marine
4afdd80422 arm(neon): regenerate code 2023-06-20 00:47:34 +02:00
Amanieu d'Antras
636476bcc1 Remove obscure & rarely used ARM intrinsics
In almost all cases it is preferable to use the stable `asm!` instead of
calling these intrinsics.

This PR removes the following unstable intrinsics:
- `__breakpoint`: Clang extension, not part of ACLE.
- `brk`: undocumented
- `_rev*`, `_clz*`, `_rbit*`: use methods on integer types instead
- `__ldrex`, `__strex`: deprecated in ACLE, hard to use correctly
- Register access API: API doesn't match ACLE, better to just use asm

Also considered for deletion, but not included in this PR:
- Barriers: `__isb`, `__dsb`, `__dmb`
- Hints: `__wfi`, `__wfe`, `__sev`, `__sevl`, `__yield`, `__nop`
2023-06-16 16:09:53 +02:00
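As a sketch of the "use methods on integer types instead" advice in 636476bcc1, the standard integer methods below cover byte reversal, leading-zero count, and bit reversal. The wrapper names are hypothetical and exist only for illustration; the mapping to the removed `_rev*`, `_clz*`, and `_rbit*` intrinsics is inferred from their names.

```rust
/// Byte reversal, as a stand-in for the removed `_rev*` family.
fn byte_reverse(x: u32) -> u32 {
    x.swap_bytes()
}

/// Leading-zero count, as a stand-in for the removed `_clz*` family.
fn count_leading_zeros(x: u32) -> u32 {
    x.leading_zeros()
}

/// Bit reversal, as a stand-in for the removed `_rbit*` family.
fn bit_reverse(x: u32) -> u32 {
    x.reverse_bits()
}
```

These methods are stable and portable, and the compiler is free to lower them to a single instruction where the target has one.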
Scott McMurray
b77fc27aaa Stabilize __m512i, __m512, and __m512d
I didn't include `__m512bh` since `__m128bh` is also unstable.
2023-06-13 22:46:58 +02:00
8051enthusiast
4b9528ec3b Fix documentation for carryless multiplication 2023-05-26 15:21:37 +02:00
Amanieu d'Antras
7586d9dcd1 Fix typo in cfg 2023-05-26 15:20:15 +02:00
Josh Triplett
87c70444d6 Remove ud2 intrinsic (in favor of asm! or abort as needed) 2023-05-25 23:30:24 +02:00
Amanieu d'Antras
01d9b052ea Stabilize AArch64 AES/SHA1/SHA2 intrinsics 2023-05-25 23:29:04 +02:00
Luca Barbato
5ebcf56693 Add vec_splat 2023-05-17 23:16:09 +02:00
Luca Barbato
79a969a616 Add vec_splat_{u,i}{8,16,32} 2023-05-17 23:16:09 +02:00
Luca Barbato
284b9706d0 Add vec_unpackh and vec_unpackl 2023-05-12 11:18:18 +01:00
Luca Barbato
147b864b34 Add vec_packs and vec_packsu 2023-05-12 11:18:18 +01:00
Luca Barbato
76a5034836 Add vec_pack 2023-05-12 11:18:18 +01:00
Luca Barbato
7b080559bb Add vec_mergel and vec_mergeh 2023-05-12 11:18:18 +01:00
Alan Somers
1dcba9edde Implement _mm256_i32scatter_epi64 from AVX512VL 2023-05-08 07:20:40 +01:00
Luca Barbato
bbc8575e9b Add vec_cts and vec_ctu 2023-04-27 13:08:05 -07:00
Luca Barbato
dc115cabbb Add vec_cft 2023-04-27 13:08:05 -07:00
Luca Barbato
94362b595c Split vec_lde tests 2023-04-24 19:02:22 -07:00
Luca Barbato
333d986341 Add vec_ldl 2023-04-24 19:02:22 -07:00
Luca Barbato
9bf9dd4e4d Add vec_lde 2023-04-24 19:02:22 -07:00
Luca Barbato
e50d4d54b5 Remove the altivec and vsx guards
As mentioned in #1402.
2023-04-23 11:52:27 -07:00
Luca Barbato
cdfd5a2721 Add vec_or, vec_xor, vec_nor 2023-04-22 16:03:22 -07:00
Urgau
5d4192a9d6 Remove useless drop (clippy drop_ref and drop_copy lint) 2023-04-21 06:41:55 -07:00
Luca Barbato
416fb2e11b Add vec_any_out 2023-04-13 01:54:42 +01:00
Luca Barbato
b27fcf7ba3 Add vec_any_nan, vec_any_nge, vec_any_ngt, vec_any_nle, vec_any_nlt and vec_any_numeric 2023-04-13 01:54:42 +01:00
Luca Barbato
6e78450cb5 Add vec_all_numeric 2023-04-13 01:54:42 +01:00
Luca Barbato
3e90cd9b62 Add vec_all_nle and vec_all_nlt 2023-04-13 01:54:42 +01:00
Luca Barbato
e85dff6489 Add vec_all_nge and vec_all_ngt 2023-04-13 01:54:42 +01:00
Luca Barbato
3b83d68b9a Add vec_all_nan, vec_all_ne and vec_any_ne 2023-04-13 01:54:42 +01:00
Luca Barbato
e17e7243b1 Add vec_all_lt and vec_any_lt 2023-04-13 01:54:42 +01:00
Luca Barbato
8a97475c66 Add vec_all_le and vec_any_le 2023-04-13 01:54:42 +01:00
Luca Barbato
911ee09056 Add vec_all_in 2023-04-13 01:54:42 +01:00
Luca Barbato
0fa72df240 Add vec_all_gt and vec_any_gt 2023-04-13 01:54:42 +01:00
Luca Barbato
24cc2fdb85 Add vec_all_ge and vec_any_ge 2023-04-13 01:54:42 +01:00
Luca Barbato
c587804586 Add vec_all_eq and vec_any_eq 2023-04-13 01:54:42 +01:00
Luca Barbato
2749cfe2ff Move the altivec macros into a standalone file 2023-04-13 01:54:42 +01:00
Kisaragi
600533dbdc x86: remove unnecessary parens 2023-04-12 21:41:28 +01:00
Jubilee Young
adc506ddaa Clarify undefined can still mean init
These intrinsics actually are zeroing, so we should be upfront
that these still return a valid value in reality. At most,
we might in the future want to use a "freezing" semantics here.
In any case, they are definitely not returning MaybeUninit,
or any other possibly-poison value.
2023-04-11 13:17:42 +01:00
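The point made in adc506ddaa can be sketched as follows: the result of an "undefined" intrinsic is an ordinary initialized value, so reading its bytes is fine. This is a minimal illustration assuming an x86_64 target (with a fallback so it compiles elsewhere); the helper name `undefined_vector_bytes` is hypothetical, and the code deliberately does not rely on which particular value comes back.

```rust
/// Reads the bytes of an "undefined" 128-bit vector.
/// The intrinsic returns some valid value, not a `MaybeUninit`.
#[cfg(target_arch = "x86_64")]
pub fn undefined_vector_bytes() -> [u8; 16] {
    use std::arch::x86_64::{__m128i, _mm_undefined_si128};
    // SSE2 is part of the x86_64 baseline, so the intrinsic is available here.
    unsafe {
        let v: __m128i = _mm_undefined_si128();
        std::mem::transmute::<__m128i, [u8; 16]>(v)
    }
}

/// Fallback for other targets so the sketch still compiles (assumption:
/// zero is used purely as a placeholder here).
#[cfg(not(target_arch = "x86_64"))]
pub fn undefined_vector_bytes() -> [u8; 16] {
    [0u8; 16]
}
```

Code that needs a specific value should still use an explicit constructor such as `_mm_setzero_si128` rather than depend on what the "undefined" variant happens to return.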
Amanieu d'Antras
c66be336c6 Fix CI 2023-04-08 21:41:40 +01:00
Taiki Endo
1ec5981759 core_arch: Remove uses of arm crypto target feature 2023-04-01 18:42:27 +01:00
Amanieu d'Antras
fbed7945aa Update intrinsic tests for LLVM 16 2023-03-30 15:43:03 +01:00
Luca Barbato
a331bc9512 Fix the powerpc feature guards
Now the documentation should be generated correctly and vsx isn't always
present.
2023-03-25 23:05:16 +01:00
Amanieu d'Antras
fff3c73f11 Mark more arm_shared intrinsics and types as stable in docs
This is a follow-up to #1345 where these appeared as unstable in the
standard library docs because they are only stabilized for ARM. They
were missed in the original PR.
2023-03-24 23:27:24 +01:00
Luca Barbato
e82aebe756 core_arch: Unbreak powerpc64 tests 2023-03-24 23:27:08 +01:00
Taiki Endo
fe04ca5624 core_arch: Update cmpxchg16b docs and fix feature name
- Fix a typo in the feature name
- Update docs to reflect the behavior change made in the stabilization
  PR: an invalid ordering is no longer UB; it now causes a panic, just
  as it does for compare_exchange
2023-03-19 16:09:29 +01:00
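The `compare_exchange` behavior that fe04ca5624 refers to can be observed with a standard atomic type. This is a minimal sketch using `AtomicU64` rather than `cmpxchg16b` itself, so it runs on any target; the helper name `cas_succeeds_without_panic` is hypothetical. The failure ordering is passed as a parameter because a literal invalid ordering would be rejected at compile time by a lint.

```rust
use std::panic;
use std::sync::atomic::{AtomicU64, Ordering};

/// Returns true if a compare_exchange with the given failure ordering
/// completes without panicking.
fn cas_succeeds_without_panic(failure: Ordering) -> bool {
    panic::catch_unwind(|| {
        let value = AtomicU64::new(1);
        // Release and AcqRel are not valid failure orderings; the call
        // panics for them instead of invoking undefined behavior.
        let _ = value.compare_exchange(1, 2, Ordering::SeqCst, failure);
    })
    .is_ok()
}
```

Calling it with `Ordering::Relaxed` returns true, while `Ordering::Release` triggers the panic path and returns false (the panic message is still printed to stderr).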
Alex Crichton
be861579df wasm32: Add relaxed simd instructions
This commit adds intrinsics to the `wasm32` module to support the [relaxed
SIMD proposal][proposal]. These are added with the same naming conventions
as the existing simd-related intrinsics for wasm, which are similar to the
instruction names but match sign in a few places.

This additionally updates Wasmtime to execute tests with support for the
relaxed simd proposal. No release has been made yet, so this uses the
`dev` release, and I can make a PR in April once the Wasmtime support
has shipped in an official release. The `wasmprinter` crate is also
updated to understand these instruction opcodes.

Documentation has been added for all intrinsics, but tests have only
been added for some of them so far. I hope to follow-up later with more
tests.

[proposal]: https://github.com/WebAssembly/relaxed-simd
2023-03-19 16:08:18 +01:00
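To give a sense of what "relaxed" means in be861579df, here is a scalar sketch of one operation from the relaxed SIMD proposal, the relaxed multiply-add, which an engine may evaluate either fused or unfused. The function names are hypothetical scalar stand-ins, not the wasm32 intrinsics themselves, and the choice of this particular operation as the example is an assumption made for illustration.

```rust
/// Unfused form: the product is rounded to f32 before the addition.
fn madd_unfused(a: f32, b: f32, c: f32) -> f32 {
    a * b + c
}

/// Fused form: a single rounding at the end, via the standard `mul_add`.
fn madd_fused(a: f32, b: f32, c: f32) -> f32 {
    a.mul_add(b, c)
}

/// Inputs chosen so the two evaluation strategies disagree:
/// (1 + eps)^2 = 1 + 2*eps + eps^2, and the eps^2 term is lost
/// when the product is rounded on its own.
fn relaxed_madd_example() -> (f32, f32) {
    let a = 1.0_f32 + f32::EPSILON;
    let c = -(1.0_f32 + 2.0 * f32::EPSILON);
    (madd_unfused(a, a, c), madd_fused(a, a, c))
}
```

Either result is acceptable for a relaxed operation, which is why tests for such intrinsics need to accept both outcomes rather than compare against a single expected value.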
Kathryn Long
a63313212b Stabilize f16c intrinsics 2023-03-12 20:19:54 +01:00