76 Commits

Author SHA1 Message Date
Alex Crichton
770964adac
Update and revamp wasm32 SIMD intrinsics (#874)
Lots of time and lots of things have happened since the simd128 support
was first added to this crate. Things are starting to settle down now so
this commit syncs the Rust intrinsic definitions with the current
specification (https://github.com/WebAssembly/simd). Unfortunately not
everything can be enabled just yet but everything is in the pipeline for
getting enabled soon.

This commit also applies a major revamp to how intrinsics are tested.
The intention is that the setup should be much more lightweight and/or
easy to work with after this commit.

At a high-level, the changes here are:

* Testing with node.js and `#[wasm_bindgen]` has been removed. Instead
  intrinsics are tested with Wasmtime which has a nearly complete
  implementation of the SIMD spec (and soon fully complete!)

* Testing is switched to `wasm32-wasi` to make idiomatic Rust bits a bit
  easier to work with (e.g. `panic!`)

* Testing of this crate's simd128 feature for wasm is re-enabled. This
  will run on CI and both compile and execute intrinsics. This should
  bring wasm intrinsics to the same level of parity as x86 intrinsics,
  for example.

* New wasm intrinsics have been added:
  * `iNNxMM_loadAxA_{s,u}`
  * `vNNxMM_load_splat`
  * `v8x16_swizzle`
  * `v128_andnot`
  * `iNNxMM_abs`
  * `iNNxMM_narrow_*_{u,s}`
  * `iNNxMM_bitmask` - commented out until LLVM is updated to LLVM 11
  * `iNNxMM_widen_*_{u,s}` - commented out until
    bytecodealliance/wasmtime#1994 lands
  * `iNNxMM_{max,min}_{u,s}`
  * `iNNxMM_avgr_u`

* Some wasm intrinsics have been removed:
  * `i64x2_trunc_*`
  * `f64x2_convert_*`
  * `i8x16_mul`

* The `v8x16.shuffle` instruction is exposed. This is done through a
  `macro` (not `macro_rules!`, but `macro`). This is intended to be
  somewhat experimental and unstable until we decide otherwise. This
  instruction has 16 immediate-mode expressions and is as a result
  unsuited to the existing `constify_*` logic of this crate. I'm hoping
  that we can game out over time what a macro might look like and/or
  look for better solutions. For now, though, what's implemented is the
  first of its kind in this crate (an architecture-specific macro), so
  some extra scrutiny looking at it would be appreciated.

* Lots of `assert_instr` annotations have been fixed for wasm.

* All wasm simd128 tests are uncommented and passing now.
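
To make the `v8x16.shuffle` item above concrete, here is a scalar reference model of what a 16-lane byte shuffle computes. This is only a sketch: the function name `shuffle_model` is hypothetical and it is not the macro this commit adds. In the real instruction the 16 selectors are immediates fixed at compile time, which is why a runtime-argument function like this one cannot express it directly.

```rust
/// Scalar reference model of a 16-lane byte shuffle (hypothetical helper,
/// not part of the crate).
///
/// `a` and `b` are the two input vectors. `idx` holds 16 lane selectors:
/// values 0..=15 pick a lane from `a`, values 16..=31 pick a lane from `b`.
fn shuffle_model(a: [u8; 16], b: [u8; 16], idx: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for (lane, &sel) in idx.iter().enumerate() {
        // The instruction encoding only admits selectors below 32.
        assert!(sel < 32, "lane selector out of range");
        out[lane] = if sel < 16 {
            a[sel as usize]
        } else {
            b[(sel - 16) as usize]
        };
    }
    out
}
```

For example, the selectors `[0, 16, 1, 17, ...]` interleave the low halves of `a` and `b`.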

This is still missing tests for new intrinsics and it's also missing
tests for various corner cases. I hope to get to those later as the
upstream spec itself gets closer to stabilization.

In the meantime, however, I went ahead and updated the `hex.rs` example
with a wasm implementation using intrinsics. With it I got some very
impressive speedups using Wasmtime:

    test benches::large_default  ... bench:     213,961 ns/iter (+/- 5,108) = 4900 MB/s
    test benches::large_fallback ... bench:   3,108,434 ns/iter (+/- 75,730) = 337 MB/s
    test benches::small_default  ... bench:          52 ns/iter (+/- 0) = 2250 MB/s
    test benches::small_fallback ... bench:         358 ns/iter (+/- 0) = 326 MB/s

In other words, with Wasmtime, SIMD hex encoding is about 15x faster on 1MB
chunks and about 7x faster on small (<128 byte) chunks.

All of these intrinsics are still unstable and will continue to be so
presumably until the simd proposal in wasm itself progresses to a later
stage. Additionally we'll still want to sync with clang on intrinsic
names (or decide not to) at some point in the future.

* wasm: Unconditionally expose SIMD functions

This commit unconditionally exposes SIMD functions from the `wasm32`
module. This is done in such a way that the standard library does not
need to be recompiled to access SIMD intrinsics and use them. This,
hopefully, is the long-term story for SIMD in WebAssembly in Rust.

It's unlikely that all WebAssembly runtimes will end up implementing
SIMD so the standard library is unlikely to use SIMD any time soon, but
we want to make sure it's easily available to folks! This commit enables
all this by ensuring that SIMD is available to the standard library,
regardless of compilation flags.

This'll come with the same caveats as x86 support, where it doesn't make
sense to call these functions unless you're enabling simd support one
way or another locally. Additionally, as with x86, if you don't call
these functions then the instructions won't show up in your binary.
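
The x86 caveat referred to above is usually handled with a guarded dispatch. The sketch below shows that x86 pattern using the stable `is_x86_feature_detected!` macro and `#[target_feature]`. The function names are hypothetical. For wasm, the analogous guard would presumably be a compile-time `#[cfg(target_feature = "simd128")]` rather than a runtime check; that part is an assumption and is not shown.

```rust
/// Sums a slice, taking the AVX2-compiled path only after a runtime check.
/// On non-x86 targets the guarded block is compiled out entirely.
pub fn sum(data: &[u32]) -> u32 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("avx2") {
            // SAFETY: the runtime check above confirmed AVX2 is available.
            return unsafe { sum_avx2(data) };
        }
    }
    sum_fallback(data)
}

/// Same source as the fallback, but compiled with AVX2 enabled, so the
/// compiler is free to use AVX2 instructions in this function only.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(data: &[u32]) -> u32 {
    data.iter().fold(0u32, |acc, &x| acc.wrapping_add(x))
}

/// Portable path, used when the feature is absent.
fn sum_fallback(data: &[u32]) -> u32 {
    data.iter().fold(0u32, |acc, &x| acc.wrapping_add(x))
}
```

If `sum_avx2` is never called, no AVX2 instructions from it need to appear in the binary, which mirrors the point made above.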

While I was here I went ahead and expanded the WebAssembly-specific
documentation for the wasm32 module as well, ensuring that the current
state of SIMD/Atomics is documented.
2020-07-18 13:32:52 +01:00
Mahmut Bulut
4541757677 feature detection 2020-05-29 19:05:48 +01:00
Mahmut Bulut
17e4b29dfd Implementation for Aarch64 TME intrinsics 2020-05-29 19:05:48 +01:00
Tobias Kortkamp
a69b5ec7ae Unbreak non-x86 build on FreeBSD
error[E0432]: unresolved import `self::arm::check_for`
  --> src/libstd/../stdarch/crates/std_detect/src/detect/os/freebsd/mod.rs:11:17
   |
11 |         pub use self::arm::check_for;
   |                 ^^^^^^^^^^^^^^^^^^^^ no `check_for` in `std_detect::detect::os::arm`

error[E0425]: cannot find value `detect_features` in module `self::os`
   --> src/libstd/../stdarch/crates/std_detect/src/detect/mod.rs:121:37
    |
121 |     cache::test(x as u32, self::os::detect_features)
    |                                     ^^^^^^^^^^^^^^^ not found in `self::os`
    |
help: possible candidate is found in another module, you can import it into scope
    |
20  | use crate::std_detect::detect::os::arm::detect_features;
2020-04-24 12:45:05 +01:00
Amanieu d'Antras
39fc893f6b Stabilize all remaining x86 features for feature detection 2020-04-24 00:36:01 +01:00
Amanieu d'Antras
04c1a9a9e9
Use llvm_asm! instead of asm! (#846) 2020-04-09 00:05:10 +01:00
Linus Färnstrand
f14b746319 Replace all max/min_value() with MAX/MIN 2020-04-04 09:51:11 -07:00
Linus Färnstrand
b852344de5
Replace module MIN/MAX and min/max_value() with assoc consts (#843) 2020-03-29 17:08:21 +01:00
Makoto Kato
09ef01ade1
Add crypto target feature detection to arm32 (#833) 2020-03-29 12:28:17 +01:00
Jack O'Connor
e367bcd7f9
re-stabilize the AVX-512 features that were stabilized in Rust 1.27.0 (#842)
* re-stabilize the AVX-512 features that were stabilized in Rust 1.27.0

https://github.com/rust-lang/stdarch/pull/739 added per-feature
stabilization of runtime CPU feature detection. In so doing, it
de-stabilized some detection features that had been stable since Rust
1.27.0, breaking some published crates (on nightly). This commit
re-stabilizes the subset of AVX-512 detection features that were
included in 1.27.0 (that is, the pre-Ice-Lake subset). Other instruction
sets (MMX in particular) remain de-stabilized, pending a decision about
whether we should ever stabilize them.

See https://github.com/rust-lang/rust/issues/68905.

* add a comment explaining feature detection stability

* adjust stabilizations to match most recent proposal

https://github.com/rust-lang/rust/issues/68905#issuecomment-595376319
2020-03-19 14:29:50 +00:00
Aleksey Kladov
0bd16446db Fix race condition in feature cache on 32-bit platforms (#837)
* Fix race condition in feature cache on 32-bit platforms

If we observe that the second word is initialized, we can't really
assume that the first is initialized as well. So check each word
separately.

* Use stronger atomic ordering

Better SeqCst than sorry!

* Use two caches on x64 for simplicity
2020-01-28 21:53:17 +01:00
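
The idea behind the feature-cache fix above (0bd16446db) can be sketched as follows. This is a minimal model under stated assumptions, not the crate's code: the names `CacheWord`, `CACHE`, `detect_features`, and `test` are hypothetical, and the layout (31 feature bits plus one "initialized" bit per 32-bit word) is chosen for illustration. The point is that each word carries its own initialized flag, so observing one word as initialized says nothing about the other.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Feature bits stored per word; the top bit marks "initialized".
const BITS_PER_WORD: u32 = 31;
const INIT_BIT: u32 = 1 << 31;

/// One 32-bit cache word. A raw value of 0 means "not yet initialized".
struct CacheWord(AtomicU32);

impl CacheWord {
    const fn new() -> Self {
        CacheWord(AtomicU32::new(0))
    }

    /// Returns `Some(bits)` only if this particular word is initialized.
    fn get(&self) -> Option<u32> {
        let v = self.0.load(Ordering::SeqCst);
        if v & INIT_BIT != 0 { Some(v & !INIT_BIT) } else { None }
    }

    fn set(&self, bits: u32) {
        self.0.store((bits & !INIT_BIT) | INIT_BIT, Ordering::SeqCst);
    }
}

static CACHE: [CacheWord; 2] = [CacheWord::new(), CacheWord::new()];

/// Hypothetical detector standing in for the OS-specific code.
/// Reports bits 0, 2 and 40 as present.
fn detect_features() -> u64 {
    0b101 | (1 << 40)
}

/// Tests feature `bit` (must be below 62), initializing the cache on first use.
fn test(bit: u32) -> bool {
    let (word, offset) = ((bit / BITS_PER_WORD) as usize, bit % BITS_PER_WORD);
    // Check the word we actually need, never the other one.
    let bits = match CACHE[word].get() {
        Some(bits) => bits,
        None => {
            let all = detect_features();
            let mask = (1u64 << BITS_PER_WORD) - 1;
            CACHE[0].set((all & mask) as u32);
            CACHE[1].set(((all >> BITS_PER_WORD) & mask) as u32);
            CACHE[word].get().expect("word was just initialized")
        }
    };
    bits & (1 << offset) != 0
}
```

Two threads may both run detection here; that is harmless as long as detection is deterministic, since both store the same values.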
Luca Barbato
1601ce4f2f Add Icelake avx512 features (#838)
* Add Icelake avx512 features

As documented in https://software.intel.com/sites/default/files/managed/c5/15//architecture-instruction-set-extensions-programming-reference.pdf

* Sort the avx512 feature checks by bit

* Unbreak macos

Force nightly.
2020-01-26 13:10:29 -06:00
Yuki Okushi
c8c587d0cd Use issue = "none" instead of "0" 2019-12-27 11:25:13 +01:00
Makoto Kato
f5783f5193 Run-time feature detection for Aarch64 on Windows. 2019-12-11 12:24:03 +01:00
Makoto Kato
cca9a86637 Add CRC32 detection to arm32
armv8 has a 32-bit mode, and the crc32 instructions are available even in that mode.
2019-12-02 19:23:05 +01:00
Alex Crichton
036b6348d9 Remove need for #[macro_use] with cfg-if
Modernizes usage of `cfg_if!` slightly
2019-10-10 12:43:27 +02:00
Taiki Endo
cd7aa7720a Remove azure pipelines badges 2019-10-10 12:42:41 +02:00
gnzlbg
128aa330ea Feature::from_str is not always needed 2019-09-18 12:09:07 +02:00
gnzlbg
579e4cc655 std_detect_env_override should be disabled by default 2019-09-18 12:09:07 +02:00
gnzlbg
88fe414dd3 These items do not need to be public 2019-09-18 12:09:07 +02:00
Luca Barbato
5bec3383c9 Drop the features test for now 2019-09-18 09:03:42 +02:00
Luca Barbato
a4dddb4b2f Unbreak non-x86 2019-09-18 09:03:42 +02:00
Luca Barbato
e0d42221ff Implement a fallback for the No-op Feature 2019-09-17 20:59:31 +02:00
Luca Barbato
8cad95c8ab Move the tests away from the code 2019-09-17 19:22:18 +02:00
Luca Barbato
efd19f4a13 Add a test for the env_override 2019-09-17 19:22:18 +02:00
Luca Barbato
b70d574394 Make the test function smaller 2019-09-17 19:22:18 +02:00
Luca Barbato
ee35b1848e Simplify the std imports 2019-09-17 19:22:18 +02:00
Luca Barbato
33688eaa10 Remove the FIXME about the cache size checks
And leave a NOTE.
2019-09-17 19:22:18 +02:00
Luca Barbato
6420fa4fb0 Override the features detected using an env::var
Fixes: #804
2019-09-17 19:22:18 +02:00
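
As a rough illustration of the environment override above (6420fa4fb0), the sketch below clears cached feature bits named in an override string. Everything specific here is an assumption made for the example: the comma-separated format, the feature table, and the function name `apply_override`. The commit itself should be consulted for the actual variable name and syntax.

```rust
/// Hypothetical feature table: (name, bit index in the cache).
const FEATURES: &[(&str, u32)] = &[("sse2", 0), ("avx", 1), ("avx2", 2)];

/// Clears the bit of every feature named in `override_list`, which is
/// assumed to be a comma-separated list. Unknown names are ignored.
fn apply_override(mut bits: u64, override_list: &str) -> u64 {
    for name in override_list.split(',').map(str::trim) {
        if let Some(&(_, bit)) = FEATURES.iter().find(|(n, _)| *n == name) {
            bits &= !(1u64 << bit);
        }
    }
    bits
}
```

A caller would read the string with `std::env::var` once, before the detected bits are stored in the cache.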
Luca Barbato
1855195f40 Add a means to unset a bit in the cache 2019-09-17 15:36:02 +02:00
gnzlbg
13fffd5fde Try harder to error on usage of unstable features 2019-09-17 02:43:48 +02:00
gnzlbg
42b7041e94 Remove staged_api from the allow_internal_unstable of the feature macros 2019-09-17 01:35:26 +02:00
gnzlbg
4821a68959 Fix std_detect on targets without feature detection 2019-09-16 23:43:01 +02:00
gnzlbg
226b3265c8 Format 2019-09-16 23:43:01 +02:00
gnzlbg
599bcf28ad Enforce staged_api on a per-feature basis 2019-09-16 23:43:01 +02:00
gnzlbg
1f44c1407d Add std_detect::detect::features() -> impl Iterator<Item=(&'static str, bool)> API 2019-09-16 23:43:01 +02:00
Luca Barbato
f3140f4b25 Factor out check_for
All the os-specific code implements a `check_for` and a `detect_features`.

Move the always-identical `check_for` into mod.rs and use
`os::detect_features` there.
2019-09-09 22:20:10 +02:00
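
The refactoring described above (f3140f4b25) can be pictured with the sketch below: each OS backend supplies only `detect_features`, and a single shared `check_for` goes through the cache. The module layout, the `Feature` variants, and the `OnceLock`-based cache are assumptions made for the example and are simpler than the crate's real cache.

```rust
/// Hypothetical feature enum; real targets define many more variants.
#[derive(Clone, Copy)]
enum Feature {
    Neon = 0,
    Crc = 1,
}

mod os {
    /// The only function an OS backend has to provide in this sketch.
    /// Here it pretends that just bit 0 (Neon) is present.
    pub fn detect_features() -> u64 {
        1 << 0
    }
}

mod cache {
    use std::sync::OnceLock;

    static FEATURES: OnceLock<u64> = OnceLock::new();

    /// Runs `detect` at most once, then answers from the cached bits.
    pub fn test(bit: u32, detect: fn() -> u64) -> bool {
        let bits = *FEATURES.get_or_init(detect);
        bits & (1u64 << bit) != 0
    }
}

/// The single, OS-independent `check_for`.
fn check_for(x: Feature) -> bool {
    cache::test(x as u32, os::detect_features)
}
```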
Luca Barbato
5b11935d43 Document how miri support works
Co-Authored-By: gnzlbg <gnzlbg@users.noreply.github.com>
2019-09-06 15:01:26 +02:00
Luca Barbato
430744f46a Minimal miri support
Should address https://github.com/rust-lang/miri/issues/932
2019-09-06 15:01:26 +02:00
atouchet
1422e0f95c Fix more links 2019-08-18 14:46:04 +02:00
gnzlbg
00e10f12ce Update badges 2019-08-13 18:04:22 +02:00
gnzlbg
686b813f5d Update repo name 2019-07-09 01:37:07 +02:00
hygonsoc
6369621e79 Add Hygon Dhyana CPU Vendor ID ("HygonGenuine") checking
Hygon Dhyana originates from AMD technology and shares most of its architecture with
AMD's family 17h, but it has a different CPU Vendor ID ("HygonGenuine") and family number (Family 18h).

For CPUID feature bits, Hygon Dhyana (family 18h) shares the same definitions as AMD family 17h.
The AMD CPUID specification is at https://www.amd.com/system/files/TechDocs/25481.pdf.

The related Hygon kernel patch can be found at
http://lkml.kernel.org/r/5ce86123a7b9dad925ac583d88d2f921040e859b.1538583282.git.puwen@hygon.cn
2019-05-25 15:51:21 +02:00
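
The vendor check described above (6369621e79) amounts to comparing the 12-byte CPUID vendor string. The sketch below shows one way to do that on plain register values, so it runs on any host. The helper names are hypothetical, and a real caller would obtain the registers from CPUID leaf 0.

```rust
/// Assembles the 12-byte CPUID vendor string from the leaf-0 registers.
/// The bytes are laid out in EBX, EDX, ECX order, each little-endian.
fn vendor_id(ebx: u32, edx: u32, ecx: u32) -> [u8; 12] {
    let mut id = [0u8; 12];
    id[0..4].copy_from_slice(&ebx.to_le_bytes());
    id[4..8].copy_from_slice(&edx.to_le_bytes());
    id[8..12].copy_from_slice(&ecx.to_le_bytes());
    id
}

/// True for vendors whose CPUID feature bits follow the AMD definitions,
/// which per the commit message includes Hygon Dhyana.
fn uses_amd_feature_bits(vendor: &[u8; 12]) -> bool {
    vendor == b"AuthenticAMD" || vendor == b"HygonGenuine"
}
```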
Tobias Kortkamp
491b7c0c53 Fix build of auxvec.rs on FreeBSD/powerpc64
```
error[E0432]: unresolved import `mem`
  --> src/libstd/../stdsimd/crates/std_detect/src/detect/os/freebsd/auxvec.rs:45:9
   |
45 |     use mem;
   |         ^^^ no `mem` external crate

error: aborting due to previous error

For more information about this error, try `rustc --explain E0432`.
error: Could not compile `std`.
```
Tested by @pkubaj in https://reviews.freebsd.org/D20332
2019-05-23 09:51:39 +02:00
MikaelUrankar
a2b98a167e Fix detection of power8
The power8 feature is defined in hwcap2
2019-05-13 06:06:20 +02:00
miki
a62067658d Add std_detect for FreeBSD armv6, armv7 and powerpc64 2019-05-09 16:03:06 +02:00
gnzlbg
6d59dc14ab Update f16c intrinsics to use the f16c target feature 2019-05-09 13:42:20 +02:00
gnzlbg
d31cc0b09e Add runtime feature detection for F16C 2019-05-09 13:42:20 +02:00
tyler
26d6e048cc add rtm cpu feature intrinsics 2019-04-25 09:39:47 +02:00
gnzlbg
503b3f641e Bump patch versions 2019-04-17 14:49:15 +02:00