Madhav Madhusoodanan
6c91fe59be
chore: letting the cast() function perform implicit type conversion for certain cases (like uint32_t to uint64_t)
extras: 1. added more C++ headers 2. typecasting integer constants (for
example, the MM_FROUND arguments) for type compatibility
2025-10-26 17:50:37 +05:30
Madhav Madhusoodanan
6ab76d81c6
chore: Ensuring "const" appears for constant arguments to intrinsics.
...
Extra changes: 1. Using "as _" to allow for implicit typecasting
2025-10-26 17:50:37 +05:30
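The `as _` cast mentioned in the commit above asks the compiler to infer the target type from context. A minimal sketch of the idea follows; `takes_u64` and `widen_and_call` are hypothetical names used only for illustration and are not taken from the generated test code.

```rust
/// Hypothetical stand-in for an intrinsic that takes a 64-bit argument.
fn takes_u64(x: u64) -> u64 {
    x + 1
}

/// Widens a `u32` with `as _`: the cast target is inferred from the
/// parameter type of `takes_u64`, so the call site never names `u64`.
fn widen_and_call(x: u32) -> u64 {
    takes_u64(x as _)
}
```

Because the call site does not spell out the target type, the same generated line can serve intrinsics whose parameter widths differ.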
Madhav Madhusoodanan
e7c94dcafb
feat: Fixed FP16 errors, made the loading function generation more accurate
2025-10-26 17:50:37 +05:30
Madhav Madhusoodanan
b22467a4d2
chore: add better error handling when writing and compiling mod_{i}.cpp, neatly organize C++ headers
2025-10-26 17:50:37 +05:30
Madhav Madhusoodanan
bae0e30160
chore: add compilation flags
2025-10-26 17:50:37 +05:30
Madhav Madhusoodanan
b8ffa6b4da
chore: adding comments about memory alignment of variables and bash scripts that will be used in CI
2025-10-26 17:50:03 +05:30
Madhav Madhusoodanan
bb2a9fc0e5
chore: revert default target
2025-10-26 17:49:44 +05:30
Madhav Madhusoodanan
67ba9ec177
fix: vec_len -> simd_len (an error was present due to setting vec_len instead of simd_len for AVX register types)
2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
6e2c8af78b
feat: correcting errors with generated C artifacts
2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
0c40b9490c
fixed errors in cpp file generation (unhandled edge cases for Vector and Mask types)
2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
f621ff8ae1
chore: update x86 module, removed intrinsicDefinition trait, formatting updates
2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
829933a996
debug: printing self in case print_result_c fails.
2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
bcbb6d46d9
feat: add 8x8 case for get_lane_function for 64-bit vector
2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
d4bc29a077
feat: handled extraction for 64-bit vector elements
2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
1d9aed0f2a
chore: update c_prefix for mask and print_result_c for vector type
2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
b736008c2e
feat: implemented get_lane_function for x86
2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
8849eebc3b
feat: implemented print_result_c for the case where the target type is Mask-based
2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
662c5b1b1f
fix: remove unused imports
2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
cdb9d86c3e
fix: more support for Mask types
2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
bfe1e01e10
fix: correcting the logic for setting vec_len
2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
abdeddef4f
fix: set default value for varname and type fields of the parameters/return value of an intrinsic
2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
08dda1502d
fix: update arch flags being sent to the x86 compilation command
2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
6264634a73
feat: implement print_result_c for Intrinsic<X86IntrinsicType>
2025-10-26 17:47:48 +05:30
Madhav Madhusoodanan
d54464ab87
feat: implemented compare_outputs of x86 module
2025-10-26 17:47:48 +05:30
Madhav Madhusoodanan
962dcfd7b1
feat: implemented build_rust_file of x86 module
2025-10-26 17:47:48 +05:30
Madhav Madhusoodanan
8deed38593
chore: added Regex crate, updated the structure of X86IntrinsicType struct
2025-10-26 17:47:48 +05:30
Madhav Madhusoodanan
e6d4838de7
fix: code cleanup
2025-10-26 17:47:48 +05:30
Madhav Madhusoodanan
9e8b542723
feat: update building C code for x86 architecture.
...
Notes: 1. chunk_info has been moved to `common/mod.rs` since it will be
needed for all architectures
2025-10-26 17:47:48 +05:30
Madhav Madhusoodanan
9eb0ff4296
feat: updated intrinsics creation
2025-10-26 17:47:48 +05:30
Madhav Madhusoodanan
1f9a2e7d46
feat: added the XML intrinsic parser for x86
2025-10-26 17:47:48 +05:30
Madhav Madhusoodanan
f44a98a59d
feat: added the skeleton structure of the x86 module
2025-10-26 17:47:48 +05:30
Folkert de Vries
cf1cf2e94d
remove a use of core::intrinsics::size_of
...
use of the intrinsic, rather than the stable function, is probably an accident.
2025-10-25 23:57:17 +02:00
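For reference, the stable function the commit above refers to is `core::mem::size_of`. A small sketch of its use follows; `vector_bytes` is a hypothetical helper, not code from the repository.

```rust
use core::mem::size_of;

/// Number of bytes occupied by `N` lanes of type `T`,
/// computed with the stable `core::mem::size_of`.
fn vector_bytes<T, const N: usize>() -> usize {
    size_of::<T>() * N
}
```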
Amanieu d'Antras
d64b23c061
Merge pull request #1945 from folkertdev/gfni-cleanup
...
use `byte_add` in gfni tests
2025-10-25 14:17:49 +00:00
Folkert de Vries
9ebee4853d
use byte_add in gfni tests
2025-10-25 01:55:37 +02:00
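As an illustration of the `byte_add` pointer method named in the commit above, here is a minimal sketch. `read_u32_at_byte_offset` is a hypothetical helper and does not reproduce the gfni tests themselves.

```rust
/// Reads the `u32` located `offset` bytes past the start of `data`.
///
/// `byte_add` offsets the pointer in bytes regardless of the pointee type,
/// unlike `add`, which counts in units of `size_of::<T>()`.
fn read_u32_at_byte_offset(data: &[u32], offset: usize) -> u32 {
    // Keep the read aligned and inside the slice.
    assert!(offset % 4 == 0 && offset + 4 <= data.len() * 4);
    // SAFETY: the assertion above guarantees an aligned, in-bounds read.
    unsafe { data.as_ptr().byte_add(offset).read() }
}
```

Using `byte_add` avoids casting to `*const u8`, offsetting, and casting back.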
Folkert de Vries
8dff65f010
Merge pull request #1938 from linkmauve/fjcvtzs
...
Implement fjcvtzs under the name __jcvt like the C intrinsic
2025-10-10 14:13:13 +00:00
Emmanuel Gil Peyrot
6039ddea09
Implement fjcvtzs under the name __jcvt like the C intrinsic
...
This instruction is only available when the jsconv target_feature is available,
so on ARMv8.3 or higher.
It is used e.g. by Ruffle[0] to speed up its conversion from f64 to i32, or by
any JS engine probably.
I’ve picked the stdarch_aarch64_jscvt feature name because it matches
FEAT_JSCVT, but hesitated between that and stdarch_aarch64_jsconv (the name of
the target_feature), stdarch_aarch64_jcvt (the name of the C intrinsic), and
stdarch_aarch64_fjcvtzs (the name of the instruction). This choice is purely
arbitrary and I guess it could be argued one way or another. I wouldn’t expect
it to stay unstable for too long, so ultimately this shouldn’t matter much.
This feature is now tracked in this issue[1].
[0] https://github.com/ruffle-rs/ruffle/pull/21780
[1] https://github.com/rust-lang/rust/issues/147555
2025-10-10 13:29:42 +00:00
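To show the kind of f64-to-i32 conversion a JS engine needs, here is a software sketch of JavaScript's ToInt32 operation. It assumes the instruction follows those semantics (truncate toward zero, then wrap modulo 2^32). `js_to_int32` is a hypothetical name and is not the `__jcvt` intrinsic itself.

```rust
/// Software sketch of JavaScript's ToInt32 conversion.
fn js_to_int32(x: f64) -> i32 {
    if !x.is_finite() {
        return 0; // NaN and infinities map to 0
    }
    const TWO_32: f64 = 4294967296.0;
    // Truncate toward zero, then reduce modulo 2^32 into [0, 2^32).
    let wrapped = x.trunc().rem_euclid(TWO_32);
    // Reinterpret the low 32 bits as a signed integer.
    wrapped as u32 as i32
}
```

Note the contrast with Rust's `as i32` cast, which saturates out-of-range values instead of wrapping them.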
Sayantan Chakraborty
01dc34d709
Merge pull request #1939 from folkertdev/crc-remove-not-arm
...
crc32: remove `#[cfg(not(target_arch = "arm"))]` from aarch64 crc functions
2025-10-09 17:37:09 +00:00
Folkert de Vries
4fcf3f86c4
crc32: remove #[cfg(not(target_arch = "arm"))] from crc functions
...
They are defined in the aarch64 module, so this cfg is pointless.
Note that these instructions do exist for arm, but the aarch64 ones are
already stable, so this would need some additional work to implement
them for arm.
2025-10-09 19:20:20 +02:00
Folkert de Vries
27866a7f06
Merge pull request #1937 from sayantn/intrinsic-fixes
...
use simd intrinsics for `vec_max` and `vec_min`
2025-10-08 11:17:58 +00:00
sayantn
40ce617b2a
use simd intrinsics for vec_max and vec_min
2025-10-08 16:01:08 +05:30
Tsukasa OI
af91b45726
RISC-V: Use symbolic instructions in inline assembly (part 1)
...
While many intrinsics use `.insn` to generate raw machine code from
numbers, all ratified instructions can be written symbolically
using `.option` directives.
By saving the assembler environment with `.option push` then modifying
the architecture with `.option arch`, we can temporarily enable certain
extensions (as we use `.option pop` immediately after the target
instruction, surrounding environment is completely intact in this
commit; *almost* completely intact in general).
This commit modifies the `pause` *hint* intrinsic to use symbolic
*instruction* because we want to expose it even if the Zihintpause
extension is unavailable on the target.
2025-10-06 01:08:42 +00:00
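The push/arch/pop pattern described in the commit above can be sketched as follows. `pause_hint` is a hypothetical name, and the non-RISC-V fallback is an assumption added only so the sketch compiles on other targets.

```rust
/// Emits the `pause` hint on RISC-V without requiring the Zihintpause
/// target feature to be enabled for the whole compilation unit.
#[cfg(any(target_arch = "riscv32", target_arch = "riscv64"))]
fn pause_hint() {
    // SAFETY: `pause` is a hint; it touches neither memory nor the stack.
    unsafe {
        core::arch::asm!(
            ".option push",               // save the assembler environment
            ".option arch, +zihintpause", // temporarily enable the extension
            "pause",                      // symbolic instruction, not `.insn`
            ".option pop",                // restore the saved environment
            options(nomem, nostack, preserves_flags)
        );
    }
}

/// Fallback for other targets (an assumption made for this sketch).
#[cfg(not(any(target_arch = "riscv32", target_arch = "riscv64")))]
fn pause_hint() {
    core::hint::spin_loop();
}
```

Because `.option pop` follows immediately after the instruction, the extension is enabled only for that single line of assembly.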
Amanieu d'Antras
09c43ef6d3
Merge pull request #1929 from sayantn/non-temporal
...
Fixes for non-temporal intrinsics
2025-10-05 22:44:09 +00:00
sayantn
c0e41518d1
Add comments in NT asm blocks for future reference
2025-10-05 07:04:36 +05:30
sayantn
5bf53654c5
Add _mm_sfence to all non-temporal intrinsic tests
2025-10-05 06:56:49 +05:30
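A minimal sketch of pairing a non-temporal store with `_mm_sfence`, as the commit above does in the tests. `stream_store_i32` is a hypothetical helper, and the plain-store fallback for non-x86_64 targets is an assumption made so the sketch compiles everywhere.

```rust
/// Writes `value` to `*dst` with a non-temporal store, then fences.
#[cfg(target_arch = "x86_64")]
fn stream_store_i32(dst: &mut i32, value: i32) {
    use core::arch::x86_64::{_mm_sfence, _mm_stream_si32};
    // SAFETY: `dst` is a valid, aligned `i32`; SSE2 is baseline on x86_64.
    unsafe {
        _mm_stream_si32(dst, value);
        // Fence before any other access to the memory just written.
        _mm_sfence();
    }
}

/// Fallback for other targets (an assumption made for this sketch).
#[cfg(not(target_arch = "x86_64"))]
fn stream_store_i32(dst: &mut i32, value: i32) {
    *dst = value;
}
```

The fence follows the documented requirement for these intrinsics: the thread that issued the non-temporal store must call `_mm_sfence` before the written memory is accessed again.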
sayantn
b29308c167
Use Inline ASM for SSE4a nontemporal stores
2025-10-05 06:56:46 +05:30
sayantn
28cf2d1a6c
Fix xsave segfaults
2025-10-05 05:39:29 +05:30
Sayantan Chakraborty
7e850c5f1e
Merge pull request #1932 from sayantn/fmaddsub
...
Use SIMD intrinsics for `vfmaddsubph` and `vfmsubaddph`
2025-10-04 00:43:02 +00:00
Amanieu d'Antras
14b888574f
Merge pull request #1931 from sayantn/use-intrinsics
...
Fix mistake in #1928
2025-10-03 13:10:34 +00:00
sayantn
f90d9ec8b2
Use SIMD intrinsics for vfmaddsubph and vfmsubaddph
2025-10-03 05:33:13 +05:30
sayantn
37605b03c5
Ensure simd_funnel_sh{l,r} always gets passed shift amounts in range
2025-10-03 03:51:34 +05:30
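A scalar model of a funnel shift left shows what keeping the shift amount in range means. Reducing the amount modulo the lane width is one way to do it; whether the commit uses this exact approach is an assumption. `funnel_shl_u32` is a hypothetical helper, not the `simd_funnel_shl` intrinsic.

```rust
/// Scalar model of a 32-bit funnel shift left: concatenate `hi:lo`,
/// shift left, and keep the upper 32 bits of the 64-bit result.
fn funnel_shl_u32(hi: u32, lo: u32, shift: u32) -> u32 {
    // Keep the shift amount in 0..32 before shifting.
    let shift = shift % u32::BITS;
    let wide = ((hi as u64) << 32) | lo as u64;
    ((wide << shift) >> 32) as u32
}
```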