sayantn
ed1df99f03
Added support for AMD verification
...
Added a custom cpuid file for sde, which enables SSE4a, XOP, TBM and VP2INTERSECT. Fixed `xsave` tests
2024-06-30 21:45:56 +02:00
sayantn
fd948ee99d
Updates SDE
...
Updated SDE to v9.33.0
Disabled `assert-instr` in emulated run
2024-06-30 21:45:56 +02:00
Tobias Decking
fcee4d8b16
Define remaining IFMA intrinsics
2024-06-30 15:47:18 +02:00
Tobias Decking
a56cc86a23
Use generic simd for avx512 leading zeros
2024-06-30 15:17:50 +02:00
Tobias Decking
d1004e0abd
Refactor avx512f: mask operations
2024-06-30 14:55:25 +02:00
Tobias Decking
9f96670b7c
Refactor avx512f: element extraction
2024-06-30 14:55:25 +02:00
Tobias Decking
9a1d758f03
Refactor avx512f: floating point abs
2024-06-30 14:55:25 +02:00
Tobias Decking
2c81a7ae33
Refactor avx512f: zeroing primitives
2024-06-30 14:55:25 +02:00
Tobias Decking
f5219be7ee
Refactor avx512f: integer comparison
2024-06-30 14:55:25 +02:00
Tobias Decking
883cedc230
Refactor avx512f: integers
2024-06-30 14:55:25 +02:00
Tobias Decking
0d9520dfd4
Refactor avx512f: sqrt + rounding fix
2024-06-30 14:55:25 +02:00
Tobias Decking
53ca30a4c8
Refactor avx512f: rounding fma
2024-06-30 14:55:25 +02:00
Tobias Decking
128866c97b
Refactor avx512f: fma
2024-06-30 14:55:25 +02:00
Jubilee Young
8b77e779cb
Remove has_cpuid
2024-06-29 19:38:42 +02:00
sayantn
d7ea407a28
Fixing CI
...
Fixed x86_64-apple-darwin freezing.
Bump all docker to Ubuntu-24.04 (except for emulated and armv7)
2024-06-29 19:16:48 +02:00
sayantn
818df2f7d0
Some fixes as asked by @Amanieu
2024-06-29 19:16:48 +02:00
sayantn
95d273aaf9
Fixed _mm512_kunpackb, reduce-max and reduce-min
...
`_mm512_kunpackb` was implemented incorrectly. `simd_reduce_max` uses `maxnum` for comparison, which adheres to IEEE 754, but Intel specifically says that these intrinsics do NOT adhere to IEEE 754 for NaNs, so the previous implementation could give wrong results.
2024-06-29 19:16:48 +02:00
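The NaN point in the commit above can be illustrated with a minimal scalar sketch. It is not the code from the commit: the helper names `x86_style_max`, `reduce_maxnum` and `reduce_x86_style` are hypothetical, and the linear fold is an assumption made for illustration (the actual intrinsics may combine lanes in a different order). It only assumes that the x86 `MAXPS`-style maximum returns its second operand whenever the comparison `a > b` is false, while Rust's `f32::max` ignores a NaN operand when the other is a number.

```rust
/// x86 `MAXPS`-style maximum (assumed semantics for this sketch):
/// returns `b` unless `a > b` holds, so any NaN operand yields `b`.
fn x86_style_max(a: f32, b: f32) -> f32 {
    if a > b { a } else { b }
}

/// Reduction using `f32::max`, which returns the non-NaN operand
/// when exactly one operand is NaN (`maxnum`-like behaviour).
fn reduce_maxnum(v: &[f32]) -> f32 {
    v.iter().copied().fold(f32::NEG_INFINITY, f32::max)
}

/// Reduction using the x86-style maximum. The result depends on
/// where the NaN sits relative to the other elements.
fn reduce_x86_style(v: &[f32]) -> f32 {
    v.iter().copied().fold(f32::NEG_INFINITY, x86_style_max)
}
```

With the input `[1.0, 3.0, NaN]`, `reduce_maxnum` skips the NaN, whereas `reduce_x86_style` propagates it because the NaN is the second operand of the last comparison. With `[1.0, NaN, 3.0]` both reductions return `3.0`, which shows that the x86-style result is order dependent.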
sayantn
fa22a9aeda
Add the missing BMI1, SSE2, SSE4.1 and AVX2 intrinsics
2024-06-29 19:16:48 +02:00
sayantn
d65d1a8ae6
Fixed some more intrinsics
...
Added some tests, fixed incorrect target-features, and verification code for target-features. Removed all MMX support from verification.
2024-06-29 19:16:48 +02:00
sayantn
ad7cf91833
Fixed many intrinsics
...
Fixed reduce-add and reduce-mul, and load/store of mask32 and mask64. Added preserves-flags to mov asm. Fixed the missing list. Fixed `_mm_loadu_si64`. Added `assert_instr`.
2024-06-29 19:16:48 +02:00
sayantn
043f3cc280
Upgraded disassembly to include windows-gnu targets
2024-06-29 19:16:48 +02:00
sayantn
d26d3a7481
Update Intrinsics list
...
Updated the intrinsics list from version 3.4 to 3.6.8. Added a missing-x86.md file to track progress.
2024-06-29 19:16:48 +02:00
Mathilda
e2d9ac5145
Fix documentation of arguments of function core::arch::x86::_mm_blendv_epi8
2024-06-27 16:37:53 +02:00
Jayesskay
8e3abdc290
Fix _mm256_bsrli_epi128 producing invalid lower lane when IMM8 = 15
2024-06-27 16:03:46 +02:00
daxpedda
4d4fca0ac5
Assign Rust v1.81.0
2024-06-27 15:40:34 +02:00
daxpedda
1dee2095c1
Add unsigned aliases
2024-06-27 15:40:34 +02:00
daxpedda
23ad39c916
Stabilize Wasm relaxed SIMD
2024-06-27 15:40:34 +02:00
sayantn
1f779b7b40
Added runtime detection
...
Expanded the cache size to 93 (we will need this in the near future)
Fixed detection of VAES, GFNI and VPCLMULQDQ
Could not test with `cupid` because they do not support these yet
2024-06-23 10:36:46 +02:00
Tobias Decking
2fd58a7ac7
Use generic simd for avx512 popcnt
2024-06-23 10:14:32 +02:00
Ralf Jung
90d47e9c71
set asm attributes
2024-06-21 16:59:03 +02:00
Ralf Jung
5982c0838a
fix test_mm512_stream_ps test
2024-06-21 16:59:03 +02:00
Ralf Jung
5c0744a3e5
non-temporal stores: use inline assembly
2024-06-21 16:59:03 +02:00
Tobias Decking
36852a1264
Update avx2.rs
2024-06-21 14:52:17 +02:00
Ralf Jung
5e7dada0eb
addcarryx: use pointers of the right type
2024-06-21 11:59:57 +02:00
sayantn
8ee3cc779a
AVX512DQ: Fixes (Corrected some typos in tests, Removed intrinsics list as everything has been implemented)
2024-06-18 19:13:13 +02:00
sayantn
b21e02ad83
AVX512DQ: Fixes (Adding SSE target_feature for i586)
2024-06-18 19:13:13 +02:00
sayantn
1f4034ba50
AVX512DQ Part 7: FP-Class
2024-06-18 19:13:13 +02:00
sayantn
c0a49d908b
AVX512DQ Part 6: Reduce
2024-06-18 19:13:13 +02:00
sayantn
177b75bde5
AVX512DQ Part 6: Reduce
2024-06-18 19:13:13 +02:00
sayantn
52f8b0c1a9
AVX512DQ Part 5: Range. Fixed intrinsic verification.
2024-06-18 19:13:13 +02:00
sayantn
c052982434
AVX512DQ Part 4: Mask Registers and Multiply Low
2024-06-18 19:13:13 +02:00
sayantn
54ef05ac65
AVX512DQ : Fix errors in Part 2
2024-06-18 19:13:13 +02:00
sayantn
5d2e19f5b6
AVX512DQ Part 3: Convert Intrinsics
2024-06-18 19:13:13 +02:00
sayantn
3281ecd0da
AVX512DQ : Fix Instructions
2024-06-18 19:13:13 +02:00
sayantn
a0efee80a1
AVX512DQ : Fix
2024-06-18 19:13:13 +02:00
sayantn
dbb53389db
AVX512DQ : Fix : Added to mod.rs
2024-06-18 19:13:13 +02:00
sayantn
011c102479
AVX512DQ : Fix : Added to mod.rs
2024-06-18 19:13:13 +02:00
sayantn
9af0ddafac
AVX512DQ Part 2: Broadcast, Extract, Insert
2024-06-18 19:13:13 +02:00
sayantn
fd6d97c21e
AVX512DQ Part 1: Logical Operations (and, andn, or, xor) - tests and doc
2024-06-18 19:13:13 +02:00
sayantn
d316154255
AVX512DQ Part 1: Logical Operations (and, andn, or, xor)
2024-06-18 19:13:13 +02:00