mirror of
https://github.com/rust-lang/rust.git
synced 2025-10-13 15:51:22 +00:00

Lots of time and lots of things have happened since the simd128 support was first added to this crate. Things are starting to settle down now, so this commit syncs the Rust intrinsic definitions with the current specification (https://github.com/WebAssembly/simd). Unfortunately not everything can be enabled just yet, but everything is in the pipeline for getting enabled soon.

This commit also applies a major revamp to how intrinsics are tested. The intention is that the setup should be much more lightweight and/or easy to work with after this commit.

At a high level, the changes here are:

* Testing with node.js and `#[wasm_bindgen]` has been removed. Instead intrinsics are tested with Wasmtime, which has a nearly complete implementation of the SIMD spec (and soon fully complete!)
* Testing is switched to `wasm32-wasi` to make idiomatic Rust bits a bit easier to work with (e.g. `panic!`)
* Testing of this crate's simd128 feature for wasm is re-enabled. This will run on CI and both compile and execute intrinsics. This should bring wasm intrinsics to the same level of parity as x86 intrinsics, for example.
* New wasm intrinsics have been added:
  * `iNNxMM_loadAxA_{s,u}`
  * `vNNxMM_load_splat`
  * `v8x16_swizzle`
  * `v128_andnot`
  * `iNNxMM_abs`
  * `iNNxMM_narrow_*_{u,s}`
  * `iNNxMM_bitmask` - commented out until LLVM is updated to LLVM 11
  * `iNNxMM_widen_*_{u,s}` - commented out until bytecodealliance/wasmtime#1994 lands
  * `iNNxMM_{max,min}_{u,s}`
  * `iNNxMM_avgr_u`
* Some wasm intrinsics have been removed:
  * `i64x2_trunc_*`
  * `f64x2_convert_*`
  * `i8x16_mul`
* The `v8x16.shuffle` instruction is exposed. This is done through a `macro` (not `macro_rules!`, but `macro`). This is intended to be somewhat experimental and unstable until we decide otherwise. This instruction has 16 immediate-mode expressions and is as a result unsuited to the existing `constify_*` logic of this crate. I'm hoping that we can game out over time what a macro might look like and/or look for better solutions. For now, though, what's implemented is the first of its kind in this crate (an architecture-specific macro), so some extra scrutiny looking at it would be appreciated.
* Lots of `assert_instr` annotations have been fixed for wasm.
* All wasm simd128 tests are uncommented and passing now.

This is still missing tests for new intrinsics and it's also missing tests for various corner cases. I hope to get to those later as the upstream spec itself gets closer to stabilization.

In the meantime, however, I went ahead and updated the `hex.rs` example with a wasm implementation using intrinsics. With it I got some very impressive speedups using Wasmtime:

    test benches::large_default  ... bench:     213,961 ns/iter (+/- 5,108) = 4900 MB/s
    test benches::large_fallback ... bench:   3,108,434 ns/iter (+/- 75,730) = 337 MB/s
    test benches::small_default  ... bench:          52 ns/iter (+/- 0) = 2250 MB/s
    test benches::small_fallback ... bench:         358 ns/iter (+/- 0) = 326 MB/s

or otherwise, using Wasmtime, hex encoding using SIMD is 15x faster on 1MB chunks or 7x faster on small <128byte chunks.

All of these intrinsics are still unstable and will continue to be so, presumably until the simd proposal in wasm itself progresses to a later stage. Additionally we'll still want to sync with clang on intrinsic names (or decide not to) at some point in the future.

* wasm: Unconditionally expose SIMD functions

This commit unconditionally exposes SIMD functions from the `wasm32` module. This is done in such a way that the standard library does not need to be recompiled to access SIMD intrinsics and use them. This, hopefully, is the long-term story for SIMD in WebAssembly in Rust.

It's unlikely that all WebAssembly runtimes will end up implementing SIMD, so the standard library is unlikely to use SIMD any time soon, but we want to make sure it's easily available to folks! This commit enables all this by ensuring that SIMD is available to the standard library, regardless of compilation flags. This'll come with the same caveats as x86 support, where it doesn't make sense to call these functions unless you're enabling simd support one way or another locally. Additionally, as with x86, if you don't call these functions then the instructions won't show up in your binary.

While I was here I went ahead and expanded the WebAssembly-specific documentation for the wasm32 module as well, ensuring that the current state of SIMD/Atomics is documented.
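The caveat above (only call SIMD functions when SIMD support is enabled locally) can be sketched as a compile-time dispatch. This is a minimal illustration and not the crate's `hex.rs` example: `hex_encode` and `hex_encode_fallback` are hypothetical names, and the SIMD-enabled variant is left as a placeholder that defers to the scalar code rather than guessing at intrinsic names, which were still unstable at the time.

```rust
/// Scalar fallback: encodes each byte as two lowercase hex digits.
fn hex_encode_fallback(src: &[u8]) -> String {
    const DIGITS: &[u8; 16] = b"0123456789abcdef";
    let mut out = String::with_capacity(src.len() * 2);
    for &byte in src {
        out.push(DIGITS[(byte >> 4) as usize] as char);
        out.push(DIGITS[(byte & 0x0f) as usize] as char);
    }
    out
}

/// Variant compiled only when the `simd128` target feature is enabled on
/// wasm32. Assumption: a real implementation would call SIMD intrinsics
/// here; this placeholder just reuses the scalar path.
#[cfg(all(target_arch = "wasm32", target_feature = "simd128"))]
pub fn hex_encode(src: &[u8]) -> String {
    hex_encode_fallback(src)
}

/// Variant compiled everywhere else, so no SIMD instructions can end up in
/// the binary when the feature is off.
#[cfg(not(all(target_arch = "wasm32", target_feature = "simd128")))]
pub fn hex_encode(src: &[u8]) -> String {
    hex_encode_fallback(src)
}

fn main() {
    println!("{}", hex_encode(&[0x00, 0xff]));
}
```

Selecting the variant with `#[cfg]` on whole functions keeps the decision at compile time, which matches the point that unused SIMD functions leave no trace in the output.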
226 lines
7.5 KiB
Rust
//! Implementation of the `#[assert_instr]` macro
//!
//! This macro is used when testing the `stdarch` crate and is used to generate
//! test cases to assert that functions do indeed contain the instructions that
//! we're expecting them to contain.
//!
//! The procedural macro here is relatively simple: it appends a
//! `#[test]` function to the original token stream which asserts that the
//! function itself contains the relevant instruction.

extern crate proc_macro;
extern crate proc_macro2;
#[macro_use]
extern crate quote;
extern crate syn;

use proc_macro2::TokenStream;
use quote::ToTokens;

#[proc_macro_attribute]
pub fn assert_instr(
    attr: proc_macro::TokenStream,
    item: proc_macro::TokenStream,
) -> proc_macro::TokenStream {
    let invoc = match syn::parse::<Invoc>(attr) {
        Ok(s) => s,
        Err(e) => return e.to_compile_error().into(),
    };
    let item = match syn::parse::<syn::Item>(item) {
        Ok(s) => s,
        Err(e) => return e.to_compile_error().into(),
    };
    let func = match item {
        syn::Item::Fn(ref f) => f,
        _ => panic!("must be attached to a function"),
    };

    let instr = &invoc.instr;
    let name = &func.sig.ident;

    // Disable assert_instr for x86 targets compiled with avx enabled, which
    // causes LLVM to generate different intrinsics than the ones we are
    // testing for.
    let disable_assert_instr = std::env::var("STDARCH_DISABLE_ASSERT_INSTR").is_ok();

    // If instruction tests are disabled avoid emitting this shim at all, just
    // return the original item without our attribute.
    if !cfg!(optimized) || disable_assert_instr {
        return (quote! { #item }).into();
    }

    let instr_str = instr
        .replace('.', "_")
        .replace('/', "_")
        .replace(':', "_")
        .replace(char::is_whitespace, "");
    let assert_name = syn::Ident::new(&format!("assert_{}_{}", name, instr_str), name.span());
    // This name has to be unique enough for us to find it in the disassembly later on:
    let shim_name = syn::Ident::new(
        &format!("stdarch_test_shim_{}_{}", name, instr_str),
        name.span(),
    );
    let mut inputs = Vec::new();
    let mut input_vals = Vec::new();
    let ret = &func.sig.output;
    for arg in func.sig.inputs.iter() {
        let capture = match *arg {
            syn::FnArg::Typed(ref c) => c,
            ref v => panic!(
                "arguments must not have patterns: `{:?}`",
                v.clone().into_token_stream()
            ),
        };
        let ident = match *capture.pat {
            syn::Pat::Ident(ref i) => &i.ident,
            _ => panic!("must have bare arguments"),
        };
        if let Some(&(_, ref tokens)) = invoc.args.iter().find(|a| *ident == a.0) {
            input_vals.push(quote! { #tokens });
        } else {
            inputs.push(capture);
            input_vals.push(quote! { #ident });
        }
    }

    let attrs = func
        .attrs
        .iter()
        .filter(|attr| {
            attr.path
                .segments
                .first()
                .expect("attr.path.segments.first() failed")
                .ident
                .to_string()
                .starts_with("target")
        })
        .collect::<Vec<_>>();
    let attrs = Append(&attrs);

    // Use an ABI on Windows that passes SIMD values in registers, like what
    // happens on Unix (I think?) by default.
    let abi = if cfg!(windows) {
        syn::LitStr::new("vectorcall", proc_macro2::Span::call_site())
    } else {
        syn::LitStr::new("C", proc_macro2::Span::call_site())
    };
    let shim_name_str = format!("{}{}", shim_name, assert_name);
    let to_test = quote! {
        #attrs
        #[no_mangle]
        #[inline(never)]
        pub unsafe extern #abi fn #shim_name(#(#inputs),*) #ret {
            // The compiler in optimized mode by default runs a pass called
            // "mergefunc" where it'll merge functions that look identical.
            // Turns out some intrinsics produce identical code and they're
            // folded together, meaning that one just jumps to another. This
            // messes up our inspection of the disassembly of this function and
            // we're not a huge fan of that.
            //
            // To thwart this pass and prevent functions from being merged we
            // generate some code that's hopefully very tight in terms of
            // codegen but is otherwise unique to prevent code from being
            // folded.
            //
            // This is avoided on Wasm32 right now since these functions aren't
            // inlined which breaks our tests since each intrinsic looks like it
            // calls functions. Turns out functions aren't similar enough to get
            // merged on wasm32 anyway. This bug is tracked at
            // rust-lang/rust#74320.
            #[cfg(not(target_arch = "wasm32"))]
            ::stdarch_test::_DONT_DEDUP.store(
                std::mem::transmute(#shim_name_str.as_bytes().as_ptr()),
                std::sync::atomic::Ordering::Relaxed,
            );
            #name(#(#input_vals),*)
        }
    };

    let tokens: TokenStream = quote! {
        #[test]
        #[allow(non_snake_case)]
        fn #assert_name() {
            #to_test

            // Make sure that the shim is not removed by leaking it to unknown
            // code:
            unsafe { llvm_asm!("" : : "r"(#shim_name as usize) : "memory" : "volatile") };

            ::stdarch_test::assert(#shim_name as usize,
                                   stringify!(#shim_name),
                                   #instr);
        }
    };

    let tokens: TokenStream = quote! {
        #item
        #tokens
    };
    tokens.into()
}

struct Invoc {
    instr: String,
    args: Vec<(syn::Ident, syn::Expr)>,
}

impl syn::parse::Parse for Invoc {
    fn parse(input: syn::parse::ParseStream) -> syn::Result<Self> {
        use syn::{ext::IdentExt, Token};

        let mut instr = String::new();
        while !input.is_empty() {
            if input.parse::<Token![,]>().is_ok() {
                break;
            }
            if let Ok(ident) = syn::Ident::parse_any(input) {
                instr.push_str(&ident.to_string());
                continue;
            }
            if input.parse::<Token![.]>().is_ok() {
                instr.push_str(".");
                continue;
            }
            if let Ok(s) = input.parse::<syn::LitStr>() {
                instr.push_str(&s.value());
                continue;
            }
            println!("{:?}", input.cursor().token_stream());
            return Err(input.error("expected an instruction"));
        }
        if instr.is_empty() {
            return Err(input.error("expected an instruction before comma"));
        }
        let mut args = Vec::new();
        while !input.is_empty() {
            let name = input.parse::<syn::Ident>()?;
            input.parse::<Token![=]>()?;
            let expr = input.parse::<syn::Expr>()?;
            args.push((name, expr));

            if input.parse::<Token![,]>().is_err() {
                if !input.is_empty() {
                    return Err(input.error("extra tokens at end"));
                }
                break;
            }
        }
        Ok(Self { instr, args })
    }
}

struct Append<T>(T);

impl<T> quote::ToTokens for Append<T>
where
    T: Clone + IntoIterator,
    T::Item: quote::ToTokens,
{
    fn to_tokens(&self, tokens: &mut proc_macro2::TokenStream) {
        for item in self.0.clone() {
            item.to_tokens(tokens);
        }
    }
}
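
The identifier-building step in `assert_instr` (the `.replace` chain that produces `instr_str`, followed by the `stdarch_test_shim_..._...` name) can be tried in isolation. The sketch below works on plain strings instead of `syn::Ident`; `sanitize_instr` and `shim_name` are hypothetical helper names introduced only for this illustration and do not exist in the crate.

```rust
/// Mirrors the `.replace(...)` chain from the macro: characters that may
/// appear in an instruction name but not in a Rust identifier are mapped to
/// `_`, and whitespace is dropped.
fn sanitize_instr(instr: &str) -> String {
    instr
        .replace('.', "_")
        .replace('/', "_")
        .replace(':', "_")
        .replace(char::is_whitespace, "")
}

/// Builds the shim name with the same format string the macro uses, so the
/// symbol can be located in the disassembly.
fn shim_name(func: &str, instr: &str) -> String {
    format!("stdarch_test_shim_{}_{}", func, sanitize_instr(instr))
}

fn main() {
    // A wasm-style instruction name contains a `.`, which is not valid in an
    // identifier.
    println!("{}", shim_name("v128_andnot", "v128.andnot"));
}
```

Including both the function name and the sanitized instruction keeps shim names distinct when one function is checked against several instructions.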
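
The shape of the attribute input accepted by `Invoc::parse` (an instruction, then optional `name = expr` pairs separated by commas) can be illustrated without `syn`. The following is a simplified, string-based sketch under stated assumptions: `SimpleInvoc` and `parse_invoc` are hypothetical names, expressions are kept as raw strings, and splitting on `,` means expressions that themselves contain commas are not handled, unlike the token-based parser above.

```rust
/// String-based stand-in for the macro's `Invoc` (hypothetical type).
#[derive(Debug, PartialEq)]
struct SimpleInvoc {
    instr: String,
    args: Vec<(String, String)>,
}

/// Parses `instr, name = expr, ...` from a plain string.
fn parse_invoc(input: &str) -> Result<SimpleInvoc, String> {
    let mut parts = input.split(',');

    // Everything before the first comma is the instruction.
    let instr = parts.next().unwrap_or("").trim().to_string();
    if instr.is_empty() {
        return Err("expected an instruction before comma".to_string());
    }

    // Each remaining part must look like `name = expr`.
    let mut args = Vec::new();
    for part in parts {
        let part = part.trim();
        if part.is_empty() {
            // Simplification: empty segments (e.g. a trailing comma) are skipped.
            continue;
        }
        let (name, expr) = part
            .split_once('=')
            .ok_or_else(|| format!("expected `name = expr`, found `{}`", part))?;
        args.push((name.trim().to_string(), expr.trim().to_string()));
    }
    Ok(SimpleInvoc { instr, args })
}

fn main() {
    println!("{:?}", parse_invoc("pshufb, imm8 = 0"));
}
```

In the macro, each parsed pair replaces the matching function argument with a constant expression in the generated shim, which is why the names are matched against the function's parameter identifiers.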