58 Commits

Author SHA1 Message Date
Mads Marquart
8a511191a0 Enable feature detection on all Apple/Darwin targets
Tested in the simulator and on the device I had lying around, a 1st
generation iPad Mini (which isn't Aarch64, but shows that the
`sysctlbyname` calls still work even there, even if they return false).

`sysctlbyname` _should_ be safe to use without causing rejections from
the app store, as its usage is documented in:
https://developer.apple.com/documentation/kernel/1387446-sysctlbyname/determining_instruction_set_characteristics

Also, the standard library will use these soon anyhow, so this shouldn't
affect the situation:
https://github.com/rust-lang/rust/pull/129019
2024-09-14 04:25:01 +01:00
sayantn
f22fab559e Implemented VEX versions
Modified stdarch-test to accept VEX versions
2024-07-06 11:00:34 +02:00
sayantn
043f3cc280 Upgraded disassembly to include windows-gnu targets 2024-06-29 19:16:48 +02:00
Amanieu d'Antras
860145884d Ignore int3 instructions when counting instructions in tests
These are generated as padding and are not actually part of the
function.
2024-06-07 19:18:13 +02:00
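The padding rule from commit 860145884d can be sketched roughly as follows. This is a minimal illustration, not the stdarch-test implementation: the helper name `count_instructions` and the assumption that every disassembly line starts with its mnemonic are made up for the example.

```rust
/// Counts the instructions in a disassembled function body, skipping
/// `int3` padding. Hypothetical helper: assumes each input line starts
/// with the instruction mnemonic, followed by whitespace and operands.
fn count_instructions(disassembly: &[&str]) -> usize {
    disassembly
        .iter()
        // Take the first whitespace-separated token as the mnemonic;
        // blank lines yield `None` and are dropped.
        .filter_map(|line| line.split_whitespace().next())
        // `int3` is padding, not part of the function body.
        .filter(|mnemonic| *mnemonic != "int3")
        .count()
}
```

With this sketch, a body of `addps xmm0, xmm1`, `ret`, `int3`, `int3` counts as two instructions, so trailing padding no longer pushes a function over its instruction limit.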
Daniel Paoliello
613efc499c Enable testing for AArch64 Windows 2024-04-19 17:21:08 +02:00
Daniel Paoliello
a00a70eacb arm64ec 2024-03-13 22:30:36 +00:00
Amanieu d'Antras
3ac4ba6670 Revert "Work around CI failures for the ARM target"
This reverts commit 5a748ec5fabcaee29351ac3c90eee4f3e16964e7.
2023-11-30 08:20:47 +00:00
Jacob Bramley
86cb5730ae Report missing features when skipping tests. 2023-11-30 07:48:46 +00:00
Amanieu d'Antras
4fe088329c Work around CI failures for the ARM target
These seem to have been introduced by recent LLVM changes.

* The instruction limit for vld*/vst* has been raised. This is not a
significant issue; it is only used for testing.
* vld*/vst* instructions are generated with overly strict alignments:
https://github.com/rust-lang/stdarch/issues/1217
* vtbl/vtbx intrinsics are failing intrinsic-test for unknown reasons.
2023-11-30 07:48:09 +00:00
Eduardo Sánchez Muñoz
9f741c5986 Simplify some expressions with pointers and references 2023-10-31 02:20:17 +01:00
Eduardo Sánchez Muñoz
5cdd9f81ce Convert while loop to for 2023-10-31 02:20:17 +01:00
Eduardo Sánchez Muñoz
8950416e20 Bump wasmprinter to 0.2.67 2023-10-10 14:47:43 +01:00
Gijs Burghoorn
0d56394f35 Fix: #1464 for rv64 zb 2023-09-22 10:08:56 +08:00
Gijs Burghoorn
8a23f93e8b Fix: #1464 for rv64 zk 2023-09-22 10:08:56 +08:00
Amanieu d'Antras
17daea9747 Update instruction tests for LLVM 17 2023-08-29 15:21:34 +02:00
Amanieu d'Antras
fff032b929 Fix CI on wasm32-wasi
The cc dependency doesn't compile on wasi, so only include it for
windows targets.
2023-08-29 15:21:34 +02:00
Jacob Bramley
f480c64fe9 Remove assert_instr exception for AArch64 *cvt*.
The LLVM code generation was improved some time ago.
2023-06-09 00:34:38 +02:00
Amanieu d'Antras
acee6b804a stdarch-test: Ignore {evex} prefix emitted by recent objdump 2023-04-08 21:41:40 +01:00
Alex Crichton
be861579df wasm32: Add relaxed simd instructions
This commit adds intrinsics to the `wasm32` module to support the [relaxed SIMD
proposal][proposal]. These are added with the same naming conventions of
existing simd-related intrinsics for wasm which is similar to the
instruction name but matches sign in a few places.

This additionally updates Wasmtime to execute tests with support for the
relaxed simd proposal. No release has been made yet so this uses the
`dev` release, and I can make a PR in April once the Wasmtime support
ships in an official release. The `wasmprinter` crate is also updated to
understand these instruction opcodes.

Documentation has been added for all intrinsics, but tests have only
been added for some of them so far. I hope to follow-up later with more
tests.

[proposal]: https://github.com/WebAssembly/relaxed-simd
2023-03-19 16:08:18 +01:00
bwmf2
1c18225f32 Fix typo 2023-02-18 20:02:17 +01:00
Yuri Astrakhan
81c221f058
Edition 2021, apply clippy::uninlined_format_args fix (#1339) 2022-10-25 20:17:23 +01:00
Chris Wailes
13d20910b7
Update the Android Docker files to Ubuntu 22.04 (#1338) 2022-10-04 09:19:36 +01:00
Charles Lew
676d095f0a Bump cfg-if dependency to 1.0 2022-09-11 13:05:05 +02:00
Sparrow Li
68e35d306f
Complete vld* and vst* neon instructions (#1224) 2021-09-29 04:28:10 +01:00
Sparrow Li
bdea403c54
Complete vst1 neon instructions (#1221) 2021-09-24 13:26:29 +01:00
Hans Kratz
4f8ed0335c
Check inlining and instruction count for assert_instr(nop) as well (#1218) 2021-09-18 01:53:32 +01:00
Hans Kratz
5cd6850171
Normalize [us]shll.* ..., #0 aarch64 disassembly to the preferred [us]xtl.* (#1213) 2021-09-08 23:41:31 +01:00
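The normalization named in commit 5cd6850171 could look something like the sketch below. It is an assumption-laden illustration rather than the code from the commit: `normalize_shll` is a hypothetical name, and the input is assumed to be a single line of disassembly text whose last operand is the shift amount.

```rust
/// Rewrites `ushll`/`sshll` (including the `2` and dotted variants) with
/// a `#0` shift to the preferred `uxtl`/`sxtl` alias. Hypothetical helper
/// operating on one line of disassembly, e.g. "ushll v0.8h, v0.8b, #0".
fn normalize_shll(line: &str) -> String {
    let trimmed = line.trim();
    // Only a shift of zero is equivalent to the extend alias.
    let Some(without_shift) = trimmed.strip_suffix(", #0") else {
        return trimmed.to_string();
    };
    for (from, to) in [("ushll", "uxtl"), ("sshll", "sxtl")] {
        // Whatever follows the mnemonic stem ("2", ".8h", operands) is kept.
        if let Some(rest) = without_shift.strip_prefix(from) {
            return format!("{to}{rest}");
        }
    }
    trimmed.to_string()
}
```

Under these assumptions, `ushll v0.8h, v0.8b, #0` becomes `uxtl v0.8h, v0.8b`, while a line with a non-zero shift or a different mnemonic is returned unchanged.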
Hans Kratz
bf2122753a Disable arm inlining check again for now as some tests are still failing. 2021-09-09 00:22:33 +02:00
Hans Kratz
5995d769ad Use a lighter dedup guard in the assert_instr test shims. 2021-09-09 00:22:33 +02:00
Hans Kratz
755e622d17 Implement proper subroutine call detection for x86, x86_64, aarch64 and wasm32. 2021-09-08 19:14:13 +02:00
Hans Kratz
03fa985cf0 remove assembly parsing special case for otool output (no longer needed) 2021-09-08 19:14:13 +02:00
Hans Kratz
999d954aa4 using v8.6a target feature to cover more instructions 2021-09-08 19:14:13 +02:00
Hans Kratz
f5af9d15a9 Use objdump on Macos x86_64 as well. 2021-09-08 19:14:13 +02:00
Hans Kratz
f15c851517 Use LLVM objdump on Macos ARM64 because it is not possible to enable TME support with otool 2021-09-08 19:14:13 +02:00
Sparrow Li
9e34c6d4c8
Add vst neon instructions (#1205)
* add vst neon instructions

* modify the instruction limit
2021-08-31 21:35:30 +01:00
Jamie Cunliffe
0285e513e0 Update arm vcvt intrinsics to use llvm.fpto(su)i.sat
Those intrinsics have the correct semantics for the desired fcvtz instruction,
without any undefined behaviour. The previous simd_cast was undefined for
infinite and NaN which could cause issues.
2021-08-11 13:13:19 +01:00
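The saturating semantics described in commit 0285e513e0 have a scalar analogue in Rust's `as` casts, which saturate at the integer bounds and map NaN to zero. The sketch below only illustrates that behaviour on scalars; the function names are hypothetical and this is not the vector code from the commit.

```rust
/// Scalar stand-in for a saturating float-to-signed-int conversion.
/// Out-of-range inputs clamp to `i32::MIN`/`i32::MAX`; NaN becomes 0.
fn to_i32_saturating(x: f32) -> i32 {
    x as i32
}

/// Scalar stand-in for a saturating float-to-unsigned-int conversion.
/// Negative inputs clamp to 0; NaN becomes 0.
fn to_u32_saturating(x: f32) -> u32 {
    x as u32
}
```

The point of the example is that every input, including infinities and NaN, has a defined result, which is the property the commit message contrasts with the earlier `simd_cast`.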
Sparrow Li
10f7ebc387
Add vfma and vfms neon instructions (#1169) 2021-05-21 12:26:21 +01:00
Sparrow Li
07f1d0cae3
Add vmla_n, vmla_lane, vmls_n, vmls_lane neon instructions (#1145) 2021-04-28 22:59:41 +01:00
surechen
d46e0086e4
add neon instruction vfma_n_* (#1122) 2021-04-17 17:45:54 +01:00
Sebastian Thiel
43126c3f65
[DRAFT] intrinsics for all architectures appear in rustdoc (#1104) 2021-04-17 13:46:33 +01:00
Sparrow Li
88a5de08cb
Allow primitive types in the code generator and add vdup instructions (#1114) 2021-04-12 14:08:26 +01:00
Joshua Nelson
b411a5c375
Convert all crates to 2018 edition (#1109) 2021-04-11 15:26:35 +01:00
Joshua Nelson
7bab2c0695
Deny 2018 idiom lints (#1108)
This lint is allow-by-default, which is why this wasn't spotted earlier.
It's denied by rust-lang/rust, so it's good to warn about it here so it
can be fixed more quickly.
2021-04-07 05:46:39 +01:00
Alex Crichton
e35da555f8
Update WebAssembly SIMD/Atomics (#1073) 2021-03-11 23:30:30 +00:00
Makoto Kato
9a4ff9fe79
Use --no-show-raw-insn to make disassemble parser simple. (#948) 2020-11-06 21:56:36 +00:00
Joseph Richey
e254082775
Use black_box instead of llvm_asm (#944)
The implementation is the same (where possible), and it unblocks #904

Signed-off-by: Joe Richey <joerichey@google.com>
2020-11-04 17:20:13 +00:00
Makoto Kato
e020a85ff0
Run CI for i686-pc-windows-msvc (#934) 2020-10-25 01:32:27 +01:00
Lokathor
67217c5d11
add more things that do adds (#881)
Co-authored-by: Amanieu d'Antras <amanieu@gmail.com>
2020-08-10 00:03:35 +01:00
Alex Crichton
770964adac
Update and revamp wasm32 SIMD intrinsics (#874)
Lots of time and lots of things have happened since the simd128 support
was first added to this crate. Things are starting to settle down now so
this commit syncs the Rust intrinsic definitions with the current
specification (https://github.com/WebAssembly/simd). Unfortunately not
everything can be enabled just yet but everything is in the pipeline for
getting enabled soon.

This commit also applies a major revamp to how intrinsics are tested.
The intention is that the setup should be much more lightweight and/or
easy to work with after this commit.

At a high-level, the changes here are:

* Testing with node.js and `#[wasm_bindgen]` has been removed. Instead
  intrinsics are tested with Wasmtime which has a nearly complete
  implementation of the SIMD spec (and soon fully complete!)

* Testing is switched to `wasm32-wasi` to make idiomatic Rust bits a bit
  easier to work with (e.g. `panic!`)

* Testing of this crate's simd128 feature for wasm is re-enabled. This
  will run on CI and both compile and execute intrinsics. This should
  bring wasm intrinsics to the same level of parity as x86 intrinsics,
  for example.

* New wasm intrinsics have been added:
  * `iNNxMM_loadAxA_{s,u}`
  * `vNNxMM_load_splat`
  * `v8x16_swizzle`
  * `v128_andnot`
  * `iNNxMM_abs`
  * `iNNxMM_narrow_*_{u,s}`
  * `iNNxMM_bitmask` - commented out until LLVM is updated to LLVM 11
  * `iNNxMM_widen_*_{u,s}` - commented out until
    bytecodealliance/wasmtime#1994 lands
  * `iNNxMM_{max,min}_{u,s}`
  * `iNNxMM_avgr_u`

* Some wasm intrinsics have been removed:
  * `i64x2_trunc_*`
  * `f64x2_convert_*`
  * `i8x16_mul`

* The `v8x16.shuffle` instruction is exposed. This is done through a
  `macro` (not `macro_rules!`, but `macro`). This is intended to be
  somewhat experimental and unstable until we decide otherwise. This
  instruction has 16 immediate-mode expressions and is as a result
  unsuited to the existing `constify_*` logic of this crate. I'm hoping
  that we can game out over time what a macro might look like and/or
  look for better solutions. For now, though, what's implemented is the
  first of its kind in this crate (an architecture-specific macro), so
  some extra scrutiny looking at it would be appreciated.

* Lots of `assert_instr` annotations have been fixed for wasm.

* All wasm simd128 tests are uncommented and passing now.

This is still missing tests for new intrinsics and it's also missing
tests for various corner cases. I hope to get to those later as the
upstream spec itself gets closer to stabilization.

In the meantime, however, I went ahead and updated the `hex.rs` example
with a wasm implementation using intrinsics. With it I got some very
impressive speedups using Wasmtime:

    test benches::large_default  ... bench:     213,961 ns/iter (+/- 5,108) = 4900 MB/s
    test benches::large_fallback ... bench:   3,108,434 ns/iter (+/- 75,730) = 337 MB/s
    test benches::small_default  ... bench:          52 ns/iter (+/- 0) = 2250 MB/s
    test benches::small_fallback ... bench:         358 ns/iter (+/- 0) = 326 MB/s

In other words, under Wasmtime, hex encoding using SIMD is roughly 15x
faster on 1MB chunks and roughly 7x faster on small (<128 byte) chunks.
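
As a quick check of those ratios against the ns/iter figures in the benchmark output above, a small sketch (the `speedup` helper is hypothetical, introduced only for this calculation):

```rust
/// Speedup of the SIMD path over the fallback, computed from the
/// ns/iter timings reported by the benchmark output.
fn speedup(fallback_ns: f64, simd_ns: f64) -> f64 {
    fallback_ns / simd_ns
}
```

Plugging in the large case, `speedup(3_108_434.0, 213_961.0)` is about 14.5, and the small case, `speedup(358.0, 52.0)`, is about 6.9, which round to the quoted 15x and 7x.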

All of these intrinsics are still unstable and will continue to be so
presumably until the simd proposal in wasm itself progresses to a later
stage. Additionally we'll still want to sync with clang on intrinsic
names (or decide not to) at some point in the future.

* wasm: Unconditionally expose SIMD functions

This commit unconditionally exposes SIMD functions from the `wasm32`
module. This is done in such a way that the standard library does not
need to be recompiled to access SIMD intrinsics and use them. This,
hopefully, is the long-term story for SIMD in WebAssembly in Rust.

It's unlikely that all WebAssembly runtimes will end up implementing
SIMD so the standard library is unlikely to use SIMD any time soon, but
we want to make sure it's easily available to folks! This commit enables
all this by ensuring that SIMD is available to the standard library,
regardless of compilation flags.

This'll come with the same caveats as x86 support, where it doesn't make
sense to call these functions unless you're enabling simd support one
way or another locally. Additionally, as with x86, if you don't call
these functions then the instructions won't show up in your binary.

While I was here I went ahead and expanded the WebAssembly-specific
documentation for the wasm32 module as well, ensuring that the current
state of SIMD/Atomics are documented.
2020-07-18 13:32:52 +01:00
Daniel Smith
5bfcdc0d57
Implement AVX512f floating point comparisons (#869)
Co-authored-by: bjorn3 <bjorn3@users.noreply.github.com>
2020-07-15 20:06:38 +01:00