itsscb/rust - rust - Gitea: Git with a cup of tea

mirror of https://github.com/rust-lang/rust.git synced 2025-11-24 12:18:32 +00:00

Author	SHA1	Message	Date
sayantn	c0e41518d1	Add comments in NT asm blocks for future reference	2025-10-05 07:04:36 +05:30
sayantn	5bf53654c5	Add `_mm_sfence` to all non-temporal intrinsic tests	2025-10-05 06:56:49 +05:30
sayantn	b29308c167	Use Inline ASM for SSE4a nontemporal stores	2025-10-05 06:56:46 +05:30
Sayantan Chakraborty	7e850c5f1e	Merge pull request #1932 from sayantn/fmaddsub Use SIMD intrinsics for `vfmaddsubph` and `vfmsubaddph`	2025-10-04 00:43:02 +00:00
Amanieu d'Antras	14b888574f	Merge pull request #1931 from sayantn/use-intrinsics Fix mistake in #1928	2025-10-03 13:10:34 +00:00
sayantn	f90d9ec8b2	Use SIMD intrinsics for `vfmaddsubph` and `vfmsubaddph`	2025-10-03 05:33:13 +05:30
sayantn	37605b03c5	Ensure `simd_funnel_sh{l,r}` always gets passed shift amounts in range	2025-10-03 03:51:34 +05:30
sayantn	018f9927b2	Revert uses of SIMD intrinsics for shifts	2025-10-03 03:30:50 +05:30
Madhav Madhusoodanan	6b99d5fb56	fix: update the implementation of _kshiftri_mask16 and _kshiftli_mask16 to zero out when the amount of shift exceeds 16.	2025-10-03 02:33:11 +05:30
Madhav Madhusoodanan	0138b95620	fix: update the implementation of _kshiftri_mask8 and _kshiftli_mask8 to zero out when the amount of shift exceeds the bit length of the input argument.	2025-10-03 02:27:15 +05:30
Madhav Madhusoodanan	8b25ddeea3	fix: update the implementation of _kshiftri_mask32, _kshiftri_mask64, _kshiftli_mask32 and _kshiftli_mask64 to zero out when the amount of shift exceeds the bit length of the input argument.	2025-10-03 02:20:50 +05:30
sayantn	851c32abb2	Use SIMD intrinsics for `test{z,c}` intrinsics	2025-10-01 12:33:41 +05:30
sayantn	4c94e6bba9	Use SIMD intrinsics for `vperm2` intrinsics	2025-10-01 10:26:59 +05:30
sayantn	d23dbbec31	Use SIMD intrinsics for `cvtsi{,64}_{ss,sd}` intrinsics	2025-10-01 07:23:43 +05:30
sayantn	6460b35798	Use SIMD intrinsics for f16 intrinsics	2025-10-01 07:23:10 +05:30
sayantn	3f91ced840	Use SIMD intrinsics for shift and rotate intrinsics	2025-10-01 07:22:12 +05:30
sayantn	1819ae0c1f	Use SIMD intrinsics for `madd`, `hadd` and `hsub` intrinsics	2025-10-01 07:20:30 +05:30
sayantn	b55b085535	Remove uses of deprecated `llvm.x86.addcarryx.u{32,64}` intrinsics - Correct mistake in x86_64/adx.rs where it was not testing `_addcarryx` at all	2025-10-01 07:16:44 +05:30
usamoi	00c8866c57	pick changes from https://github.com/rust-lang/rust/pull/146683	2025-09-23 10:17:54 +08:00
usamoi	3b09522c34	Revert "Remove big-endian swizzles from `vreinterpret`" This reverts commit 24f89ca53d3374ed8d3e0cbadc1dc89eea41acba.	2025-09-23 10:05:32 +08:00
usamoi	39b2e433e6	intrinsic-test: test intrinsics with patched core_arch	2025-09-20 20:13:24 +08:00
Sayantan Chakraborty	c1242fab74	Merge pull request #1921 from a4lg/riscv-inline-asm-general-improvements RISC-V: Improvements of inline assembly uses	2025-09-15 18:39:49 +00:00
Folkert de Vries	5dd0fdcd67	Merge pull request #1919 from sayantn/fix-vreinterpret Remove big-endian swizzles from `vreinterpret`	2025-09-15 08:18:20 +00:00
Tsukasa OI	8df078a3f0	RISC-V: Improvements of inline assembly uses This commit performs various improvements (better register allocation, less register clobbering on the worst case and better readability) of RISC-V inline assembly use cases. Note that it does not change the `p` module (which defines the "P" extension draft instructions but very likely to change). 1. Use `lateout` as possible. Unlike `out(reg)` and `in(reg)` pair, `lateout(reg)` and `in(reg)` can share the same register because they state that the late-output register is written after all the reads are performed. It can improve register allocation. 2. Add `preserves_flags` option as possible. While RISC-V doesn't have _regular_ condition codes, RISC-V inline assembly in the Rust language assumes that some registers (mainly vector state registers) may be overwritten by default. By adding `preserves_flags` to the intrinsics corresponding instructions without overwriting them, it can minimize register clobbering on the worst case. 3. Use trailing semicolon. As `asm!` declares an action and it doesn't return a value by itself, it would be better to have trailing semicolon to denote that an `asm!` call is effectively a statement. 4. Make most of `asm!` calls multi-lined. `rustfmt` makes some simple (yet long) `asm!` calls multi-lined but it does not perform formatting of complex `asm!` calls with inputs and/or outputs. To keep consistency, it makes most of the `asm!` calls multi-lined.	2025-09-14 05:08:19 +00:00
Tsukasa OI	a3b7aad20f	stdarch-gen-arm: Make Clippy happy	2025-09-12 11:50:51 +00:00
Tsukasa OI	221eb1f0d5	intrinsic-test: Make Clippy happy	2025-09-12 11:50:25 +00:00
Sayantan Chakraborty	269cecc91c	Merge pull request #1918 from a4lg/riscv-aes64im-lower-requirements RISC-V: "Lower" requirements of `aes64im`	2025-09-11 19:59:18 +00:00
sayantn	bb31725e67	Remove big-endian swizzles from `vreinterpret`	2025-09-12 01:20:34 +05:30
Tsukasa OI	e54cc43867	RISC-V: "Lower" requirements of `aes64im` This instruction is incorrectly categorized as the same one as `aes64ks1i` and `aes64ks2` (that should require `zkne || zknd` but currently require `zkne && zknd`) but `aes64im` only requires the Zknd extension. This commit fixes the category of this intrinsic (lowering the requirements from the Rust perspective but it does not actually lower it from the RISC-V perspective).	2025-09-11 06:42:10 +00:00
WANG Rui	614dab3ed2	loongarch: Align intrinsic signatures with LLVM	2025-09-10 23:10:19 +08:00
Folkert de Vries	4b549a7330	move target-specific definitions into constants	2025-09-07 14:11:02 +02:00
Folkert de Vries	ccec202727	move `build_c_file` and `build_rust_file` into `SupportedArchitectureTest`	2025-09-07 14:11:02 +02:00
Folkert de Vries	1697f36225	remove `trait IntrinsicDefinition`	2025-09-07 14:11:02 +02:00
Folkert de Vries	2ba0a6e489	move `print_result_c` into the trait	2025-09-07 14:11:02 +02:00
Folkert de Vries	9d9ca01bfa	move `print_result_c` into the inner intrinsic type	2025-09-07 14:11:02 +02:00
Folkert de Vries	916424f38d	move more constants into `SupportedArchitectureTest`	2025-09-07 14:11:01 +02:00
Folkert de Vries	6ab097b245	move platform headers into `SupportedArchitectureTest`	2025-09-07 14:11:01 +02:00
Folkert de Vries	d70ef4f0a7	move `compare_outputs` implementation into `SupportedArchitectureTest` definition	2025-09-07 14:11:01 +02:00
Folkert de Vries	93101b5783	s390x: use the new `u128::funnel_shl`	2025-09-06 14:32:36 +02:00
Folkert de Vries	e1a3b8bdc1	Merge pull request #1911 from nikic/remove-hack Remove some llvm workarounds	2025-09-03 13:16:03 +00:00
Tsukasa OI	4679533732	RISC-V: Lower requirements of `clmul` and `clmulh` They don't need full "Zbc" extension but only its subset: the "Zbkc" extension. Since the compiler implies `zbkc` from `zbc`, it's safe to use `#[target_feature(enable = "zbkc")]`.	2025-09-03 02:13:35 +00:00
Nikita Popov	18fa6d917c	Remove some llvm workarounds	2025-09-02 10:48:42 +02:00
Folkert de Vries	ae648be783	use `llvm.roundeven` on arm	2025-08-29 12:15:41 +02:00
Amanieu d'Antras	b2189b8ff6	Merge pull request #1903 from folkertdev/s390x-llvm-21-fixes `s390x` llvm 21 improvements	2025-08-21 20:31:06 +00:00
Folkert de Vries	98bd1d7445	use `simd_saturating_{add, sub}` on neon	2025-08-21 10:25:00 +02:00
Amanieu d'Antras	0b0c42478f	Merge pull request #1901 from folkertdev/wasm-read-unaligned wasm: use `{read, write}_unaligned` methods	2025-08-20 20:44:05 +00:00
Folkert de Vries	6d74280ae4	Merge pull request #1899 from dpaoliello/arm64ec Add testing for Arm64EC Windows	2025-08-20 20:42:51 +00:00
Folkert de Vries	45af206618	s390x: link to a missed optimization	2025-08-20 22:20:30 +02:00
Folkert de Vries	e9162f221a	s390x: implement `vec_sld` using `fshl`	2025-08-20 22:20:30 +02:00
Folkert de Vries	dfa95c6fa4	s390x: implement `vec_subc_u128` using `overflowing_sub`	2025-08-20 22:20:29 +02:00
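
The `_kshiftli_*` / `_kshiftri_*` fixes listed above (6b99d5fb56, 0138b95620, 8b25ddeea3) describe mask shifts that produce zero once the shift amount reaches the bit length of the mask. A minimal scalar sketch of that behaviour for the 16-bit case follows. The function names are hypothetical and this is an illustrative model only, not the code from those commits.

```rust
/// Hypothetical scalar model of a 16-bit mask left shift.
/// Not the stdarch implementation: it only illustrates the
/// "zero out when the shift amount is out of range" rule.
fn kshiftli_mask16_model(a: u16, count: u32) -> u16 {
    // `checked_shl` returns `None` when `count >= 16` for a `u16`,
    // so out-of-range shifts fall back to 0 instead of overflowing.
    a.checked_shl(count).unwrap_or(0)
}

/// Hypothetical scalar model of a 16-bit mask right shift.
fn kshiftri_mask16_model(a: u16, count: u32) -> u16 {
    a.checked_shr(count).unwrap_or(0)
}

fn main() {
    // In-range shifts behave like ordinary shifts.
    assert_eq!(kshiftli_mask16_model(0b1, 4), 0b1_0000);
    assert_eq!(kshiftri_mask16_model(0x8000, 15), 1);
    // Shift amounts of 16 or more clear the whole mask.
    assert_eq!(kshiftli_mask16_model(0xFFFF, 16), 0);
    assert_eq!(kshiftri_mask16_model(0xFFFF, 200), 0);
    println!("ok");
}
```

Using `checked_shl` / `checked_shr` avoids the overflow panic (in debug builds) that a bare `a << count` would hit for out-of-range amounts; the same pattern would carry over to 8-, 32- and 64-bit masks by changing the integer type.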

1931 Commits