1984 Commits

Author SHA1 Message Date
Madhav Madhusoodanan
c16d05191c feat: moved cast<T1, T2> to architecture-specific definitions 2025-10-26 17:51:07 +05:30
Madhav Madhusoodanan
c2c3de09a7 chore: clean up unused variables 2025-10-26 17:51:07 +05:30
Madhav Madhusoodanan
1a2aacb46e chore: corrected the imm-width correction location for _mm_mpsadbw_epu8
intrinsic
2025-10-26 17:50:38 +05:30
Madhav Madhusoodanan
e00cfd2f67 feat: defined more load functions that are natively not defined (such as
arguments with UI16 etype and __m128d type)
2025-10-26 17:50:38 +05:30
Madhav Madhusoodanan
115dc3c298 chore: accommodate the immwidth field for constraints
extras: 1. call update_simd_len() after inferring bit_len for arguments
of certain intrinsics

2. handle the effective bit_len for _mm_mpsadbw_epu8 intrinsic's `imm8`
argument which has only 3 bits that are used
2025-10-26 17:50:38 +05:30
Madhav Madhusoodanan
fbe9a25287 chore: vector types cannot be the type of an individual element in an
array.

Extra: 1. Added better load functions 2. Added an update_simd_len()
function to support cases where the bit_len of the element needs to be
inferred from its partner arguments before calculating the simd_len
2025-10-26 17:50:38 +05:30
Madhav Madhusoodanan
9c2dd24bb6 feat: filter for duplicates in the definition of intrinsics 2025-10-26 17:50:38 +05:30
Madhav Madhusoodanan
4cbca6c247 chore: corrected the legal range of values for constrained arguments
such as _MM_FROUND_SAE and _MM_ROUND_MODE
2025-10-26 17:50:38 +05:30
Madhav Madhusoodanan
c59e6702d3 feat: updated with debug printing and ostream implementation for vector
types
2025-10-26 17:50:38 +05:30
Madhav Madhusoodanan
ab6c74c3a3 feat: matching the expected number of elements for array to load
arguments, accommodating for signed variables too
2025-10-26 17:50:38 +05:30
Madhav Madhusoodanan
6c91fe59be chore: allow the cast() function to perform implicit type conversion in
certain cases (like uint32_t to uint64_t)

extras: 1. added more C++ headers 2. typecasting integer constants (for
example, the MM_FROUND arguments) for type compatibility
2025-10-26 17:50:37 +05:30
Madhav Madhusoodanan
6ab76d81c6 chore: Ensuring "const" appears for constant arguments to intrinsics.
Extra changes: 1. Using "as _" to allow for implicit typecasting
2025-10-26 17:50:37 +05:30
Madhav Madhusoodanan
e7c94dcafb feat: Fixed FP16 errors, made the loading function generation more
accurate
2025-10-26 17:50:37 +05:30
Madhav Madhusoodanan
b22467a4d2 chore: add better error handling when writing and compiling mod_{i}.cpp,
neatly organize C++ headers
2025-10-26 17:50:37 +05:30
Madhav Madhusoodanan
bae0e30160 chore: add compilation flags 2025-10-26 17:50:37 +05:30
Madhav Madhusoodanan
b8ffa6b4da chore: adding comments about memory alignment of variables and bash scripts that will be used in CI 2025-10-26 17:50:03 +05:30
Madhav Madhusoodanan
bb2a9fc0e5 chore: revert default target 2025-10-26 17:49:44 +05:30
Madhav Madhusoodanan
67ba9ec177 fix: vec_len -> simd_len (an error was present due to setting vec_len instead of simd_len for AVX register types) 2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
6e2c8af78b feat: correcting errors with generated C artifacts 2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
0c40b9490c fixed bugs that caused errors in cpp file generation (unhandled
edge cases for Vector and Mask types)
2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
f621ff8ae1 chore: update x86 module, removed intrinsicDefinition trait, formatting
updates
2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
829933a996 debug: printing self in case print_result_c fails. 2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
bcbb6d46d9 feat: add 8x8 case for get_lane_function for 64-bit vector 2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
d4bc29a077 feat: handled extraction for 64-bit vector elements 2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
1d9aed0f2a chore: update c_prefix for mask and print_result_c for vector type 2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
b736008c2e feat: implemented get_lane_function for x86 2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
8849eebc3b feat: implemented print_result_c in the case the target type is
Mask-based
2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
662c5b1b1f fix: remove unused imports 2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
cdb9d86c3e fix: more support for Mask types 2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
bfe1e01e10 fix: correcting the semantic logic for setting vec_len 2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
abdeddef4f fix: set default value for varname and type fields of the
parameters/return value of an intrinsic
2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
08dda1502d fix: update arch flags being sent to the x86 compilation command 2025-10-26 17:48:20 +05:30
Madhav Madhusoodanan
6264634a73 feat: implement print_result_c for Intrinsic<X86IntrinsicType> 2025-10-26 17:47:48 +05:30
Madhav Madhusoodanan
d54464ab87 feat: implemented compare_outputs of x86 module 2025-10-26 17:47:48 +05:30
Madhav Madhusoodanan
962dcfd7b1 feat: implemented build_rust_file of x86 module 2025-10-26 17:47:48 +05:30
Madhav Madhusoodanan
8deed38593 chore: added Regex crate, updated the structure of X86IntrinsicType
struct
2025-10-26 17:47:48 +05:30
Madhav Madhusoodanan
e6d4838de7 fix: code cleanup 2025-10-26 17:47:48 +05:30
Madhav Madhusoodanan
9e8b542723 feat: update building C code for x86 architecture.
Notes: 1. chunk_info has been moved to `common/mod.rs` since it will be
needed for all architectures
2025-10-26 17:47:48 +05:30
Madhav Madhusoodanan
9eb0ff4296 feat: updated intrinsics creation 2025-10-26 17:47:48 +05:30
Madhav Madhusoodanan
1f9a2e7d46 feat: added the XML intrinsic parser for x86 2025-10-26 17:47:48 +05:30
Madhav Madhusoodanan
f44a98a59d feat: added the skeleton structure of the x86 module 2025-10-26 17:47:48 +05:30
Folkert de Vries
cf1cf2e94d
remove a use of core::intrinsics::size_of
use of the intrinsic, rather than the stable function, is probably an accident.
2025-10-25 23:57:17 +02:00
Amanieu d'Antras
d64b23c061
Merge pull request #1945 from folkertdev/gfni-cleanup
use `byte_add` in gfni tests
2025-10-25 14:17:49 +00:00
Folkert de Vries
9ebee4853d
use byte_add in gfni tests 2025-10-25 01:55:37 +02:00
Folkert de Vries
8dff65f010
Merge pull request #1938 from linkmauve/fjcvtzs
Implement fjcvtzs under the name __jcvt like the C intrinsic
2025-10-10 14:13:13 +00:00
Emmanuel Gil Peyrot
6039ddea09 Implement fjcvtzs under the name __jcvt like the C intrinsic
This instruction is only available when the jsconv target_feature is available,
so on ARMv8.3 or higher.

It is used e.g. by Ruffle[0] to speed up its conversion from f64 to i32, or by
any JS engine probably.

I’ve picked the stdarch_aarch64_jscvt feature name because it matches
FEAT_JSCVT, but I also considered stdarch_aarch64_jsconv (the name of
the target_feature), stdarch_aarch64_jcvt (the name of the C intrinsic), and
stdarch_aarch64_fjcvtzs (the name of the instruction). This choice is purely
arbitrary and could be argued one way or another. I wouldn’t expect
it to stay unstable for too long, so ultimately this shouldn’t matter much.

This feature is now tracked in this issue[1].

[0] https://github.com/ruffle-rs/ruffle/pull/21780
[1] https://github.com/rust-lang/rust/issues/147555
2025-10-10 13:29:42 +00:00
Sayantan Chakraborty
01dc34d709
Merge pull request #1939 from folkertdev/crc-remove-not-arm
crc32: remove `#[cfg(not(target_arch = "arm"))]` from aarch64 crc functions
2025-10-09 17:37:09 +00:00
Folkert de Vries
4fcf3f86c4
crc32: remove #[cfg(not(target_arch = "arm"))] from crc functions
They are defined in the aarch64 module, so this cfg is pointless.

Note that these instructions do exist for arm, but the aarch64 ones are
already stable, so this would need some additional work to implement
them for arm.
2025-10-09 19:20:20 +02:00
Folkert de Vries
27866a7f06
Merge pull request #1937 from sayantn/intrinsic-fixes
use simd intrinsics for `vec_max` and `vec_min`
2025-10-08 11:17:58 +00:00
sayantn
40ce617b2a
use simd intrinsics for vec_max and vec_min 2025-10-08 16:01:08 +05:30