itsscb/rust

mirror of https://github.com/rust-lang/rust.git synced 2025-12-04 18:41:46 +00:00

Author	SHA1	Message	Date
minybot	cf1adeba7a	Avx512f (#901)	2020-09-11 22:26:39 +01:00
Jeff Muizelaar	d6e2546615	Add vgetq_lane_s32 (#903)	2020-09-10 18:58:03 +01:00
Jeff Muizelaar	5b3d026e21	Add vld1q_s32 and vld1q_u32 (#899)	2020-09-08 21:52:58 +01:00
Jeff Muizelaar	78c5f04228	Add vld1q_dup_f32 (#897)	2020-09-08 14:39:56 +01:00
jethrogb	e8a9e43f93	Re-land mm_extract_epi fix (#898) This reverts commit 311d56cd91609c1c1c0370cbd2ece8e3048653a5. Co-authored-by: Jethro Beekman <jethro@fortanix.com>	2020-09-08 14:38:43 +01:00
minybot	3f982e086d	Avx512f (#896)	2020-09-08 12:59:57 +01:00
Jeff Muizelaar	51ca88d3a6	Add vld1q_f32 (#892) The alignment requirements should match the pointer type. See llvm commit 8beaba13b8a61697008854b82ed3b45377af9d9d	2020-09-07 21:50:55 +01:00
Jeff Muizelaar	6f97356f7f	Reformat avx512 (#894)	2020-09-07 20:45:20 +01:00
Caleb Zulawski	63af5f371c	Remove requirement on neon feature for arm (#893)	2020-09-07 01:47:04 +01:00
minybot	b11ca63e7b	Avx512 (#891)	2020-09-04 23:06:48 +01:00
Mateusz Mikuła	c06b820716	Bye bye MMX! (#890)	2020-09-03 14:12:19 +01:00
minybot	3bbfade4c9	Avx512 (#887)	2020-08-29 01:55:49 +01:00
Daniel Liu	da3ba684ce	Fixed typos in the docs for AVX2 subtraction (#886)	2020-08-28 15:42:58 +01:00
Pietro Albini	43006f68bd	Remove cfg(not(bootstrap)) (#885)	2020-08-27 01:07:03 +01:00
minybot	1edc72e825	add some avx512f intrinsics(mask, rotation, shift) (#884)	2020-08-25 01:29:47 +01:00
Lokathor	67217c5d11	add more things that do adds (#881) Co-authored-by: Amanieu d'Antras <amanieu@gmail.com>	2020-08-10 00:03:35 +01:00
Samrat Man Singh	3e19b9879a	Fix typo in doc for `_mm256_permute2f128_si256` (#880)	2020-08-03 16:33:35 +01:00
Georgio Nicolas	d7660eb8d5	Explain the discrepancy in the mask type for _mm_shuffle_ps (#879)	2020-08-01 14:35:03 +01:00
Alex Crichton	9a3b159e83	Partially revert #868 (#878) This commit partially reverts #868 to restore the intrinsics to their original implementation to avoid breaking changes. This is done while rust-lang/rust#73166 is running through crater, and should unblock rust-lang/rust#74482.	2020-07-28 16:29:35 +00:00
Lokathor	ce4277d977	[Neon] Absolute Value fns (#877)	2020-07-20 08:24:29 +01:00
bjorn3	b93f41cbb3	Constify all x86 rustc_args_required_const intrinsics (#876)	2020-07-19 15:45:51 +01:00
Alex Crichton	770964adac	Update and revamp wasm32 SIMD intrinsics (#874)
Lots of time and lots of things have happened since the simd128 support was first added to this crate. Things are starting to settle down now so this commit syncs the Rust intrinsic definitions with the current specification (https://github.com/WebAssembly/simd). Unfortunately not everything can be enabled just yet but everything is in the pipeline for getting enabled soon.
This commit also applies a major revamp to how intrinsics are tested. The intention is that the setup should be much more lightweight and/or easy to work with after this commit.
At a high-level, the changes here are:
* Testing with node.js and `#[wasm_bindgen]` has been removed. Instead intrinsics are tested with Wasmtime which has a nearly complete implementation of the SIMD spec (and soon fully complete!)
* Testing is switched to `wasm32-wasi` to make idiomatic Rust bits a bit easier to work with (e.g. `panic!`)
* Testing of this crate's simd128 feature for wasm is re-enabled. This will run on CI and both compile and execute intrinsics. This should bring wasm intrinsics to the same level of parity as x86 intrinsics, for example.
* New wasm intrinsics have been added:
  * `iNNxMM_loadAxA_{s,u}`
  * `vNNxMM_load_splat`
  * `v8x16_swizzle`
  * `v128_andnot`
  * `iNNxMM_abs`
  * `iNNxMM_narrow_*_{u,s}`
  * `iNNxMM_bitmask` - commented out until LLVM is updated to LLVM 11
  * `iNNxMM_widen_*_{u,s}` - commented out until bytecodealliance/wasmtime#1994 lands
  * `iNNxMM_{max,min}_{u,s}`
  * `iNNxMM_avgr_u`
* Some wasm intrinsics have been removed:
  * `i64x2_trunc_*`
  * `f64x2_convert_*`
  * `i8x16_mul`
* The `v8x16.shuffle` instruction is exposed. This is done through a `macro` (not `macro_rules!`, but `macro`). This is intended to be somewhat experimental and unstable until we decide otherwise. This instruction has 16 immediate-mode expressions and is as a result unsuited to the existing `constify_` logic of this crate. I'm hoping that we can game out over time what a macro might look like and/or look for better solutions. For now, though, what's implemented is the first of its kind in this crate (an architecture-specific macro), so some extra scrutiny looking at it would be appreciated.
* Lots of `assert_instr` annotations have been fixed for wasm.
* All wasm simd128 tests are uncommented and passing now.
This is still missing tests for new intrinsics and it's also missing tests for various corner cases. I hope to get to those later as the upstream spec itself gets closer to stabilization.
In the meantime, however, I went ahead and updated the `hex.rs` example with a wasm implementation using intrinsics. With it I got some very impressive speedups using Wasmtime:
    test benches::large_default ... bench: 213,961 ns/iter (+/- 5,108) = 4900 MB/s
    test benches::large_fallback ... bench: 3,108,434 ns/iter (+/- 75,730) = 337 MB/s
    test benches::small_default ... bench: 52 ns/iter (+/- 0) = 2250 MB/s
    test benches::small_fallback ... bench: 358 ns/iter (+/- 0) = 326 MB/s
or otherwise using Wasmtime hex encoding using SIMD is 15x faster on 1MB chunks or 7x faster on small <128byte chunks.
All of these intrinsics are still unstable and will continue to be so presumably until the simd proposal in wasm itself progresses to a later stage. Additionally we'll still want to sync with clang on intrinsic names (or decide not to) at some point in the future.
* wasm: Unconditionally expose SIMD functions
This commit unconditionally exposes SIMD functions from the `wasm32` module. This is done in such a way that the standard library does not need to be recompiled to access SIMD intrinsics and use them. This, hopefully, is the long-term story for SIMD in WebAssembly in Rust.
It's unlikely that all WebAssembly runtimes will end up implementing SIMD so the standard library is unlikely to use SIMD any time soon, but we want to make sure it's easily available to folks! This commit enables all this by ensuring that SIMD is available to the standard library, regardless of compilation flags.
This'll come with the same caveats as x86 support, where it doesn't make sense to call these functions unless you're enabling simd support one way or another locally. Additionally, as with x86, if you don't call these functions then the instructions won't show up in your binary.
While I was here I went ahead and expanded the WebAssembly-specific documentation for the wasm32 module as well, ensuring that the current state of SIMD/Atomics are documented.	2020-07-18 13:32:52 +01:00
Ivan Tham	7f78306761	Add _mm_loadu_si64 (#870) Co-authored-by: Amanieu d'Antras <amanieu@gmail.com>	2020-07-16 18:01:46 +01:00
Daniel Smith	5bfcdc0d57	Implement AVX512f floating point comparisons (#869) Co-authored-by: bjorn3 <bjorn3@users.noreply.github.com>	2020-07-15 20:06:38 +01:00
Shamir Khodzha	78135e1774	added f32 and f64 unaligned stores and loads from avx512f set (#873)	2020-07-11 09:02:07 +01:00
Daniel Smith	02e1736720	Fix or equals integer comparisons (#872)	2020-07-04 05:41:25 +01:00
Daniel Smith	0108cb216a	Make function signatures consistent (#871)	2020-07-04 03:27:06 +01:00
Daniel Smith	5ff50904d8	Add AVX 512f gather, scatter and compare intrinsics (#866) Co-authored-by: bjorn3 <bjorn3@users.noreply.github.com>	2020-06-16 17:49:21 +01:00
Jethro Beekman	a214956fe5	Fix x86 extract_epi{8,16} functions
* Update Intel intrinsics definitions with the latest version
* Update _mm256_extract_epi{8,16} to match latest definition
* Fix _mm_extract_epi16 sign extension
Fixes #867	2020-06-09 12:29:01 +01:00
Narek Galstyan	6f8baeb427	Clarify documentation about wasm32 target_feature gates	2020-06-04 09:01:01 +02:00
Daniel Smith	9b3358fc66	Add missing spaces	2020-05-31 19:46:40 +01:00
Daniel Smith	05cf0ce56b	s/unsigned/signed/ for epi64	2020-05-31 19:46:40 +01:00
Daniel Smith	dde41d5863	Fix comparison comments	2020-05-31 19:46:40 +01:00
Daniel Smith	e0d2a25d24	Add 64 bit AVX512f le and ge comparisons	2020-05-30 21:50:51 +01:00
Mahmut Bulut	f4cdbb3005	Disable bootstrap for stage0	2020-05-29 21:29:04 +01:00
Mahmut Bulut	4541757677	feature detection	2020-05-29 19:05:48 +01:00
Mahmut Bulut	5b8bd0661a	Fix cancellation code arithmetic	2020-05-29 19:05:48 +01:00
Mahmut Bulut	17e4b29dfd	Implementation for Aarch64 TME intrinsics	2020-05-29 19:05:48 +01:00
Daniel Smith	a50a216567	Add signed variants	2020-05-29 00:07:03 +01:00
Daniel Smith	d94bc946eb	Add gt and eq unsigned variants	2020-05-29 00:07:03 +01:00
Daniel Smith	22a73da688	Add mask variant to cmplt	2020-05-29 00:07:03 +01:00
Daniel Smith	b8e492f5a0	finish/fix adding avx512f to x86_64	2020-05-29 00:07:03 +01:00
Daniel Smith	fa03c0cdaf	rustfmt	2020-05-29 00:07:03 +01:00
Daniel Smith	c382acd251	Only check for the instruction prefix since MSVC and Clang use different instructions	2020-05-29 00:07:03 +01:00
Daniel Smith	ad2fe20a87	Use correct instruction	2020-05-29 00:07:03 +01:00
Daniel Smith	2d717c3623	Fix stdarch-verify test	2020-05-29 00:07:03 +01:00
Daniel Smith	7ab646ef42	Move 64 bit function based on stdarch-verify	2020-05-29 00:07:03 +01:00
Daniel Smith	48b086a827	Add __mmask8 type	2020-05-29 00:07:03 +01:00
Daniel Smith	e0ffa88fe7	Add one AVX512f comparison and the intrinsics needed to test it	2020-05-29 00:07:03 +01:00
Daniel Smith	7a29fcc1c8	Convert __mmask16 to use an unsigned type	2020-05-28 22:24:46 +01:00

428 Commits