1318 Commits

Author SHA1 Message Date
sayantn
ed1df99f03 Added support for AMD verification
Added a custom cpuid file for sde, which enables SSE4a, XOP, TBM and VP2INTERSECT. Fixed `xsave` tests
2024-06-30 21:45:56 +02:00
sayantn
fd948ee99d Updates SDE
Updated SDE to v9.33.0
Disabled `assert-instr` in emulated run
2024-06-30 21:45:56 +02:00
Tobias Decking
fcee4d8b16 Define remaining IFMA intrinsics 2024-06-30 15:47:18 +02:00
Tobias Decking
a56cc86a23 Use generic simd for avx512 leading zeros 2024-06-30 15:17:50 +02:00
Tobias Decking
d1004e0abd Refactor avx512f: mask operations 2024-06-30 14:55:25 +02:00
Tobias Decking
9f96670b7c Refactor avx512f: element extraction 2024-06-30 14:55:25 +02:00
Tobias Decking
9a1d758f03 Refactor avx512f: floating point abs 2024-06-30 14:55:25 +02:00
Tobias Decking
2c81a7ae33 Refactor avx512f: zeroing primitives 2024-06-30 14:55:25 +02:00
Tobias Decking
f5219be7ee Refactor avx512f: integer comparison 2024-06-30 14:55:25 +02:00
Tobias Decking
883cedc230 Refactor avx512f: integers 2024-06-30 14:55:25 +02:00
Tobias Decking
0d9520dfd4 Refactor avx512f: sqrt + rounding fix 2024-06-30 14:55:25 +02:00
Tobias Decking
53ca30a4c8 Refactor avx512f: rounding fma 2024-06-30 14:55:25 +02:00
Tobias Decking
128866c97b Refactor avx512f: fma 2024-06-30 14:55:25 +02:00
Jubilee Young
8b77e779cb Remove has_cpuid 2024-06-29 19:38:42 +02:00
sayantn
d7ea407a28 Fixing CI
Fixed x86_64-apple-darwin freezing.
Bump all docker to Ubuntu-24.04 (except for emulated and armv7)
2024-06-29 19:16:48 +02:00
sayantn
818df2f7d0 Some fixes as asked by @Amanieu 2024-06-29 19:16:48 +02:00
sayantn
95d273aaf9 Fixed _mm512_kunpackb, reduce-max and reduce-min
`_mm512_kunpackb` was implemented incorrectly. `simd_reduce_max` uses `maxnum` for comparison, which follows IEEE 754, but Intel specifies that these intrinsics do NOT follow IEEE 754 for NaNs, so the previous implementation could give wrong results
2024-06-29 19:16:48 +02:00
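The NaN mismatch described in this commit can be illustrated with a small scalar sketch. It is not the code from the commit: the helper names `maxnum_style` and `x86_style_max` are hypothetical, and the x86 rule is modelled here as the `a > b ? a : b` selection used by the scalar and packed max instructions, which returns the second operand whenever the comparison is false.

```rust
/// IEEE 754 `maxNum`-style maximum: if exactly one operand is NaN,
/// the other operand is returned. `f32::max` in the Rust standard
/// library behaves this way.
pub fn maxnum_style(a: f32, b: f32) -> f32 {
    a.max(b)
}

/// x86-style maximum (hypothetical scalar model): `a > b ? a : b`.
/// Any comparison involving NaN is false, so the second operand is
/// returned whenever either input is NaN.
pub fn x86_style_max(a: f32, b: f32) -> f32 {
    if a > b {
        a
    } else {
        b
    }
}
```

Under this model, `maxnum_style(2.0, f32::NAN)` yields `2.0`, while `x86_style_max(2.0, f32::NAN)` yields NaN and `x86_style_max(f32::NAN, 2.0)` yields `2.0`, so the result of the x86 form depends on operand order.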
sayantn
fa22a9aeda Add the missing BMI1, SSE2, SSE4.1 and AVX2 intrinsics 2024-06-29 19:16:48 +02:00
sayantn
d65d1a8ae6 Fixed some more intrinsics
Added some tests. Fixed incorrect target-features and verification code for target-features. Removed all MMX support from verification.
2024-06-29 19:16:48 +02:00
sayantn
ad7cf91833 Fixed many intrinsics
Fixed reduce-add and reduce-mul, and load/store of mask32 and mask64. Added preserves-flags to mov asm. Fixed the missing list. Fixed `_mm_loadu_si64`. Added `assert_instr`.
2024-06-29 19:16:48 +02:00
sayantn
043f3cc280 Upgraded disassembly to include windows-gnu targets 2024-06-29 19:16:48 +02:00
sayantn
d26d3a7481 Update Intrinsics list
Updated the intrinsics list from version 3.4 to 3.6.8. Added a missing-x86.md file to track progress.
2024-06-29 19:16:48 +02:00
Mathilda
e2d9ac5145 Fix documentation of arguments of function core::arch::x86::_mm_blendv_epi8 2024-06-27 16:37:53 +02:00
Jayesskay
8e3abdc290 Fix _mm256_bsrli_epi128 producing invalid lower lane when IMM8 = 15 2024-06-27 16:03:46 +02:00
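For reference, the expected behaviour of a per-lane byte shift such as `_mm256_bsrli_epi128` can be written as a scalar model. This is only a sketch for checking results by hand, not the implementation touched by the commit; the function name `bsrli_epi128_model` is hypothetical, and byte index 0 is assumed to be the least significant byte of each 128-bit lane.

```rust
/// Scalar model (hypothetical helper) of a 256-bit per-lane byte
/// shift right: each 128-bit lane is shifted right by `imm8` bytes
/// independently, shifting in zeros. Shift counts above 15 clear
/// the lane entirely.
pub fn bsrli_epi128_model(a: [u8; 32], imm8: u8) -> [u8; 32] {
    let shift = usize::from(imm8).min(16);
    let mut out = [0u8; 32];
    for lane in 0..2 {
        let base = lane * 16;
        for i in 0..16 {
            let src = i + shift;
            // Bytes that would come from beyond the lane stay zero;
            // nothing crosses from the upper lane into the lower one.
            if src < 16 {
                out[base + i] = a[base + src];
            }
        }
    }
    out
}
```

With a shift of 15, each lane of the result holds only its former top byte in position 0 and zeros elsewhere, which is the case named in the commit title.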
daxpedda
4d4fca0ac5 Assign Rust v1.81.0 2024-06-27 15:40:34 +02:00
daxpedda
1dee2095c1 Add unsigned aliases 2024-06-27 15:40:34 +02:00
daxpedda
23ad39c916 Stabilize Wasm relaxed SIMD 2024-06-27 15:40:34 +02:00
sayantn
1f779b7b40 Added runtime detection
Expanded the cache size to 93 (we will need this in the near future)
Fixed detection of VAES, GFNI and VPCLMULQDQ
Could not test with `cupid` because it does not support these yet
2024-06-23 10:36:46 +02:00
Tobias Decking
2fd58a7ac7 Use generic simd for avx512 popcnt 2024-06-23 10:14:32 +02:00
Ralf Jung
90d47e9c71 set asm attributes 2024-06-21 16:59:03 +02:00
Ralf Jung
5982c0838a fix test_mm512_stream_ps test 2024-06-21 16:59:03 +02:00
Ralf Jung
5c0744a3e5 non-temporal stores: use inline assembly 2024-06-21 16:59:03 +02:00
Tobias Decking
36852a1264 Update avx2.rs 2024-06-21 14:52:17 +02:00
Ralf Jung
5e7dada0eb addcarryx: use pointers of the right type 2024-06-21 11:59:57 +02:00
sayantn
8ee3cc779a AVX512DQ: Fixes (Corrected some typos in tests, Removed intrinsics list as everything has been implemented) 2024-06-18 19:13:13 +02:00
sayantn
b21e02ad83 AVX512DQ: Fixes (Adding SSE target_feature for i586) 2024-06-18 19:13:13 +02:00
sayantn
1f4034ba50 AVX512DQ Part 7: FP-Class 2024-06-18 19:13:13 +02:00
sayantn
c0a49d908b AVX512DQ Part 6: Reduce 2024-06-18 19:13:13 +02:00
sayantn
177b75bde5 AVX512DQ Part 6: Reduce 2024-06-18 19:13:13 +02:00
sayantn
52f8b0c1a9 AVX512DQ Part 5: Range. Fixed intrinsic verification. 2024-06-18 19:13:13 +02:00
sayantn
c052982434 AVX512DQ Part 4: Mask Registers and Multiply Low 2024-06-18 19:13:13 +02:00
sayantn
54ef05ac65 AVX512DQ : Fix errors in Part 2 2024-06-18 19:13:13 +02:00
sayantn
5d2e19f5b6 AVX512DQ Part 3: Convert Intrinsics 2024-06-18 19:13:13 +02:00
sayantn
3281ecd0da AVX512DQ : Fix Instructions 2024-06-18 19:13:13 +02:00
sayantn
a0efee80a1 AVX512DQ : Fix 2024-06-18 19:13:13 +02:00
sayantn
dbb53389db AVX512DQ : Fix : Added to mod.rs 2024-06-18 19:13:13 +02:00
sayantn
011c102479 AVX512DQ : Fix : Added to mod.rs 2024-06-18 19:13:13 +02:00
sayantn
9af0ddafac AVX512DQ Part 2: Broadcast, Extract, Insert 2024-06-18 19:13:13 +02:00
sayantn
fd6d97c21e AVX512DQ Part 1: Logical Operations (and, andn, or, xor) - tests and doc 2024-06-18 19:13:13 +02:00
sayantn
d316154255 AVX512DQ Part 1: Logical Operations (and, andn, or, xor) 2024-06-18 19:13:13 +02:00