453 Commits

Author SHA1 Message Date
Aaron Hill
dd11a4b07b
Remove trailing semicolons from several macro definitions (#938)
The x86 code contains several macros that follow this pattern:

```rust
macro_rules! expr {
    () => { true; }
}

fn bar(_val: bool) {}

fn main() {
    bar(expr!());
}
```

Here, we have a macro `expr!` that expands to a token sequence with
a trailing semicolon.

Currently, the trailing semicolon is ignored when the macro is invoked
in expression position, due to https://github.com/rust-lang/rust/issues/33953
If this behavior is changed, then a large number of macro invocations in
`stdarch` will stop compiling.

Regardless of whether or not this change is made, removing the
semicolon more clearly expresses the intent of the code - these macros
are designed to expand to the result of a function call, not ignore its
results (as the `;` would suggest).
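
As a minimal sketch of the corrected pattern (the names `double` and
`call_double!` are hypothetical and do not come from `stdarch`), a macro
meant to be used in expression position expands to the call itself, with
no trailing `;` inside the expansion:

```rust
// Hypothetical helper standing in for an intrinsic-style function call.
fn double(x: i32) -> i32 {
    x * 2
}

// The expansion is the call expression itself. There is no `;` after
// `double($x)`, so the macro evaluates to the function's result.
macro_rules! call_double {
    ($x:expr) => {
        double($x)
    }
}

// Uses the macro in expression position, as the `stdarch` macros are used.
fn forty_two() -> i32 {
    call_double!(21)
}
```

Written this way, the macro keeps compiling whether or not the compiler
continues to ignore trailing semicolons in expression position.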
2020-11-02 00:54:21 +00:00
Joshua Nelson
33355e69c2
Fix some clippy lints (#937) 2020-11-02 00:53:39 +00:00
Adam Hillier
56e4b3dd1f
Add shift-and-insert Arm intrinsics. (#936) 2020-11-02 00:44:25 +00:00
Adam Hillier
8f8df056ca
Add popcount Arm intrinsics. (#935) 2020-10-26 18:21:24 +00:00
Makoto Kato
e020a85ff0
Run CI for i686-pc-windows-msvc (#934) 2020-10-25 01:32:27 +01:00
Makoto Kato
ddecf15383
Support ARM crypto extension on A32/T32 (#929) 2020-10-20 05:07:31 +01:00
minybot
ae707fa29d
Avx512f (#927) 2020-10-17 01:14:41 +01:00
Guillaume Gomez
50c46fa268
Fix URLs (#928) 2020-10-14 20:46:52 +01:00
minybot
9090eec2f7
Avx512f (#921) 2020-10-10 17:14:15 +01:00
Ralf Jung
6c6f5e6b87
replace some unions by transmute and make the rest repr(C) (#925) 2020-10-06 18:18:15 +01:00
Jubilee
28f43f9fe5
Fix another "stdimd" typo (#923) 2020-10-03 20:24:07 +01:00
Joshua Nelson
fe2a752e8d
Remove cfg(not(doc)) from doctests (#922)
This was changed from `cfg(dox)` to `cfg(doc)` in
https://github.com/rust-lang/stdarch/pull/920. `cfg(doc)` is incorrect
here; rustdoc sets `cfg(doctest)`, not `cfg(doc)` in doc-tests.

However, this piece of code isn't needed anyway: this code will only
ever be run as a doc-test, so it will never be compiled in.
2020-10-01 14:35:08 +01:00
Joshua Nelson
7f7ca407ff
Replace cfg(dox) with cfg(doc) (#920)
`dox` has to be set explicitly, but `doc` is set whenever rustdoc runs.
This simplifies the doc process to `cargo doc` and means bootstrap can
stop passing `--cfg dox` when documenting crates.
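
A small sketch of the difference, assuming a plain `rustc`/`cargo build`
compilation (the function and item names are hypothetical): `doc` is only
set while rustdoc documents the crate, and `doctest` only while rustdoc
collects doc-tests, so both are off in an ordinary build.

```rust
/// Reports which rustdoc-related cfg flags were active when this code was
/// compiled, as `(doc, doctest)`.
pub fn rustdoc_cfgs() -> (bool, bool) {
    (cfg!(doc), cfg!(doctest))
}

/// Exists only while rustdoc is documenting the crate (e.g. `cargo doc`).
/// No extra `--cfg` flag is needed for this item to show up in the docs.
#[cfg(doc)]
pub fn docs_only_placeholder() {}
```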
2020-09-29 23:57:19 +01:00
Dong Bo
eba81a5a50
remove redundant feature declaration const_fn_transmute in lib.rs (#919) 2020-09-27 13:44:10 +01:00
minybot
4fd4980774
Avx512f (#917) 2020-09-26 15:47:20 +01:00
Dong Bo
4eefe3f4ab
Implement prefetch hints for aarch64 (#918)
Co-authored-by: Wang Maozhang <wangmaozhang@huawei.com>
2020-09-26 02:37:57 +01:00
Ivan Tham
268ce21837
Switch to intra-doc links (#914) 2020-09-21 13:59:03 +01:00
minybot
99f0dac00e
Avx512f (#912) 2020-09-19 22:16:01 +01:00
Jubilee
c0d49aec61
Fix typo stdimd -> stdsimd (#915) 2020-09-18 23:22:12 +01:00
Thom Chiovoloni
6a0969d12f
Optimize std_detect's caching (#908) 2020-09-17 02:43:25 +01:00
Camelid
5ce2b53048
Remove old TODO (#911) 2020-09-17 00:19:46 +01:00
minybot
4a08ae849b
Avx512f (#907) 2020-09-15 18:04:18 +01:00
Jeff Muizelaar
7fab649ef2
Properly escape the '[' and ']' (#910) 2020-09-15 14:03:20 +01:00
Jeff Muizelaar
53ff7829b7
Add vminq_f32 and vmaxq_f32 (#905) 2020-09-14 07:24:27 +01:00
Jeff Muizelaar
394940b950
Add vcvtq_u32_f32 and vcvtq_s32_f32 (#902) 2020-09-13 21:11:32 +01:00
minybot
cf1adeba7a
Avx512f (#901) 2020-09-11 22:26:39 +01:00
Jeff Muizelaar
d6e2546615
Add vgetq_lane_s32 (#903) 2020-09-10 18:58:03 +01:00
Jeff Muizelaar
5b3d026e21
Add vld1q_s32 and vld1q_u32 (#899) 2020-09-08 21:52:58 +01:00
Jeff Muizelaar
78c5f04228
Add vld1q_dup_f32 (#897) 2020-09-08 14:39:56 +01:00
jethrogb
e8a9e43f93
Re-land mm_extract_epi fix (#898)
This reverts commit 311d56cd91609c1c1c0370cbd2ece8e3048653a5.

Co-authored-by: Jethro Beekman <jethro@fortanix.com>
2020-09-08 14:38:43 +01:00
minybot
3f982e086d
Avx512f (#896) 2020-09-08 12:59:57 +01:00
Jeff Muizelaar
51ca88d3a6
Add vld1q_f32 (#892)
The alignment requirements should match the pointer type. See
llvm commit 8beaba13b8a61697008854b82ed3b45377af9d9d
2020-09-07 21:50:55 +01:00
Jeff Muizelaar
6f97356f7f
Reformat avx512 (#894) 2020-09-07 20:45:20 +01:00
Caleb Zulawski
63af5f371c
Remove requirement on neon feature for arm (#893) 2020-09-07 01:47:04 +01:00
minybot
b11ca63e7b
Avx512 (#891) 2020-09-04 23:06:48 +01:00
Mateusz Mikuła
c06b820716
Bye bye MMX! (#890) 2020-09-03 14:12:19 +01:00
minybot
3bbfade4c9
Avx512 (#887) 2020-08-29 01:55:49 +01:00
Daniel Liu
da3ba684ce
Fixed typos in the docs for AVX2 subtraction (#886) 2020-08-28 15:42:58 +01:00
Pietro Albini
43006f68bd
Remove cfg(not(bootstrap)) (#885) 2020-08-27 01:07:03 +01:00
minybot
1edc72e825
add some avx512f intrinsics(mask, rotation, shift) (#884) 2020-08-25 01:29:47 +01:00
Lokathor
67217c5d11
add more things that do adds (#881)
Co-authored-by: Amanieu d'Antras <amanieu@gmail.com>
2020-08-10 00:03:35 +01:00
Samrat Man Singh
3e19b9879a
Fix typo in doc for _mm256_permute2f128_si256 (#880) 2020-08-03 16:33:35 +01:00
Georgio Nicolas
d7660eb8d5
Explain the discrepancy in the mask type for _mm_shuffle_ps (#879) 2020-08-01 14:35:03 +01:00
Alex Crichton
9a3b159e83
Partially revert #868 (#878)
This commit partially reverts #868 to restore the intrinsics to their
original implementation to avoid breaking changes. This is done while
rust-lang/rust#73166 is running through crater, and should unblock
rust-lang/rust#74482.
2020-07-28 16:29:35 +00:00
Lokathor
ce4277d977
[Neon] Absolute Value fns (#877) 2020-07-20 08:24:29 +01:00
bjorn3
b93f41cbb3
Constify all x86 rustc_args_required_const intrinsics (#876) 2020-07-19 15:45:51 +01:00
Alex Crichton
770964adac
Update and revamp wasm32 SIMD intrinsics (#874)
Lots of time and lots of things have happened since the simd128 support
was first added to this crate. Things are starting to settle down now so
this commit syncs the Rust intrinsic definitions with the current
specification (https://github.com/WebAssembly/simd). Unfortunately not
everything can be enabled just yet but everything is in the pipeline for
getting enabled soon.

This commit also applies a major revamp to how intrinsics are tested.
The intention is that the setup should be much more lightweight and/or
easy to work with after this commit.

At a high-level, the changes here are:

* Testing with node.js and `#[wasm_bindgen]` has been removed. Instead
  intrinsics are tested with Wasmtime which has a nearly complete
  implementation of the SIMD spec (and soon fully complete!)

* Testing is switched to `wasm32-wasi` to make idiomatic Rust bits a bit
  easier to work with (e.g. `panic!`)

* Testing of this crate's simd128 feature for wasm is re-enabled. This
  will run on CI and both compile and execute intrinsics. This should
  bring wasm intrinsics to the same level of parity as x86 intrinsics,
  for example.

* New wasm intrinsics have been added:
  * `iNNxMM_loadAxA_{s,u}`
  * `vNNxMM_load_splat`
  * `v8x16_swizzle`
  * `v128_andnot`
  * `iNNxMM_abs`
  * `iNNxMM_narrow_*_{u,s}`
  * `iNNxMM_bitmask` - commented out until LLVM is updated to LLVM 11
  * `iNNxMM_widen_*_{u,s}` - commented out until
    bytecodealliance/wasmtime#1994 lands
  * `iNNxMM_{max,min}_{u,s}`
  * `iNNxMM_avgr_u`

* Some wasm intrinsics have been removed:
  * `i64x2_trunc_*`
  * `f64x2_convert_*`
  * `i8x16_mul`

* The `v8x16.shuffle` instruction is exposed. This is done through a
  `macro` (not `macro_rules!`, but `macro`). This is intended to be
  somewhat experimental and unstable until we decide otherwise. This
  instruction has 16 immediate-mode expressions and is as a result
  unsuited to the existing `constify_*` logic of this crate. I'm hoping
  that we can game out over time what a macro might look like and/or
  look for better solutions. For now, though, what's implemented is the
  first of its kind in this crate (an architecture-specific macro), so
  some extra scrutiny looking at it would be appreciated.

* Lots of `assert_instr` annotations have been fixed for wasm.

* All wasm simd128 tests are uncommented and passing now.

This is still missing tests for new intrinsics and it's also missing
tests for various corner cases. I hope to get to those later as the
upstream spec itself gets closer to stabilization.

In the meantime, however, I went ahead and updated the `hex.rs` example
with a wasm implementation using intrinsics. With it I got some very
impressive speedups using Wasmtime:

    test benches::large_default  ... bench:     213,961 ns/iter (+/- 5,108) = 4900 MB/s
    test benches::large_fallback ... bench:   3,108,434 ns/iter (+/- 75,730) = 337 MB/s
    test benches::small_default  ... bench:          52 ns/iter (+/- 0) = 2250 MB/s
    test benches::small_fallback ... bench:         358 ns/iter (+/- 0) = 326 MB/s

In other words, under Wasmtime, hex encoding using SIMD is roughly 15x
faster on 1MB chunks and roughly 7x faster on small (<128 byte) chunks.

All of these intrinsics are still unstable and will continue to be so
presumably until the simd proposal in wasm itself progresses to a later
stage. Additionally we'll still want to sync with clang on intrinsic
names (or decide not to) at some point in the future.

* wasm: Unconditionally expose SIMD functions

This commit unconditionally exposes SIMD functions from the `wasm32`
module. This is done in such a way that the standard library does not
need to be recompiled to access SIMD intrinsics and use them. This,
hopefully, is the long-term story for SIMD in WebAssembly in Rust.

It's unlikely that all WebAssembly runtimes will end up implementing
SIMD so the standard library is unlikely to use SIMD any time soon, but
we want to make sure it's easily available to folks! This commit enables
all this by ensuring that SIMD is available to the standard library,
regardless of compilation flags.

This'll come with the same caveats as x86 support, where it doesn't make
sense to call these functions unless you're enabling simd support one
way or another locally. Additionally, as with x86, if you don't call
these functions then the instructions won't show up in your binary.
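
For comparison, the x86 caveat referred to here can be sketched as
follows. This is an illustration under assumptions, not code from this
commit: `add_lanes`, `add_lanes_sse2` and `add_lanes_fallback` are
hypothetical names, and the example uses x86 rather than wasm so that it
builds on a typical host. The SIMD function is always compiled in, but it
is only called once the feature is known to be available.

```rust
// Portable fallback, usable on every target.
fn add_lanes_fallback(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    [
        a[0].wrapping_add(b[0]),
        a[1].wrapping_add(b[1]),
        a[2].wrapping_add(b[2]),
        a[3].wrapping_add(b[3]),
    ]
}

// SIMD version: compiled regardless of the crate-wide target features,
// but only sound to call when SSE2 is actually present.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse2")]
unsafe fn add_lanes_sse2(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    use std::arch::x86_64::{__m128i, _mm_add_epi32, _mm_loadu_si128, _mm_storeu_si128};

    // Unaligned loads, so the arrays need no special alignment.
    let va = _mm_loadu_si128(a.as_ptr() as *const __m128i);
    let vb = _mm_loadu_si128(b.as_ptr() as *const __m128i);
    let mut out = [0i32; 4];
    _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, _mm_add_epi32(va, vb));
    out
}

// Dispatches to the SIMD version only when the feature is detected.
pub fn add_lanes(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("sse2") {
            // Safe to call: the required feature was just detected.
            return unsafe { add_lanes_sse2(a, b) };
        }
    }
    add_lanes_fallback(a, b)
}
```

If `add_lanes_sse2` is never called, its instructions do not need to end
up in the final binary, which is the same property described above for
the wasm functions.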

While I was here I went ahead and expanded the WebAssembly-specific
documentation for the wasm32 module as well, ensuring that the current
state of SIMD/Atomics are documented.
2020-07-18 13:32:52 +01:00
Ivan Tham
7f78306761
Add _mm_loadu_si64 (#870)
Co-authored-by: Amanieu d'Antras <amanieu@gmail.com>
2020-07-16 18:01:46 +01:00
Daniel Smith
5bfcdc0d57
Implement AVX512f floating point comparisons (#869)
Co-authored-by: bjorn3 <bjorn3@users.noreply.github.com>
2020-07-15 20:06:38 +01:00
Shamir Khodzha
78135e1774
added f32 and f64 unaligned stores and loads from avx512f set (#873) 2020-07-11 09:02:07 +01:00