sayantn
1e8a22c374
Fix Documentation
2024-07-08 00:32:43 +02:00
sayantn
1da646fcab
Implement missing intrinsics in SSE4a and TBM
Add `extracti`, `inserti` and `bextri` intrinsics. Refactor TBM into 2 modules
2024-07-07 19:55:04 +02:00
Tobias Decking
7378b35fd0
Use generic simd in wasm intrinsics
2024-07-07 19:21:10 +02:00
Tobias Decking
bbb2ba5424
Refactor avx512bw: reduction operations
2024-07-06 12:07:29 +02:00
Tobias Decking
fe0a378499
Refactor avx512bw: mask operations
2024-07-06 12:07:29 +02:00
Tobias Decking
198a91e5db
Refactor avx512bw: integer comparison
2024-07-06 12:07:29 +02:00
Tobias Decking
f1a1ec2921
Refactor avx512bw: max/min
2024-07-06 12:07:29 +02:00
Tobias Decking
9ad2a62245
Refactor avx512bw: saturating arithmetic
2024-07-06 12:07:29 +02:00
Tobias Decking
13063410dd
Refactor avx512bw: avg + mulhi + abs
2024-07-06 12:07:29 +02:00
sayantn
268ac7fe92
Add detection for SHA512, SM3 and SM4
Cannot cross-verify with `cupid` because it does not support these features yet.
2024-07-06 11:29:28 +02:00
sayantn
c862e4e487
Added a bf16 type
2024-07-06 11:00:34 +02:00
sayantn
70fbc2e97c
Implemented some missing functions
These cannot be linked with LLVM because Rust lacks the `bfloat16` and `i1` types, so inline asm was the only way.
2024-07-06 11:00:34 +02:00
sayantn
3de8e86491
Implemented the missing AVX512BF16 intrinsics
2024-07-06 11:00:34 +02:00
sayantn
f22fab559e
Implemented VEX versions
Modified stdarch-test to accept VEX versions
2024-07-06 11:00:34 +02:00
sayantn
775dcaabde
Implemented missing gather-scatters
2024-07-06 11:00:34 +02:00
sayantn
1c3b3b80c0
Fix the stream intrinsics
They should use platform-specific address management.
2024-07-06 11:00:34 +02:00
Tobias Decking
1f3264848f
Fix incorrect reduction operations in avx512f
2024-07-02 12:19:20 +02:00
sayantn
ed1df99f03
Added support for AMD verification
Added a custom cpuid file for sde, which enables SSE4a, XOP, TBM and VP2INTERSECT. Fixed `xsave` tests
2024-06-30 21:45:56 +02:00
sayantn
fd948ee99d
Updates SDE
Updated SDE to v9.33.0
Disabled `assert-instr` in emulated run
2024-06-30 21:45:56 +02:00
Tobias Decking
fcee4d8b16
Define remaining IFMA intrinsics
2024-06-30 15:47:18 +02:00
Tobias Decking
a56cc86a23
Use generic simd for avx512 leading zeros
2024-06-30 15:17:50 +02:00
Tobias Decking
d1004e0abd
Refactor avx512f: mask operations
2024-06-30 14:55:25 +02:00
Tobias Decking
9f96670b7c
Refactor avx512f: element extraction
2024-06-30 14:55:25 +02:00
Tobias Decking
9a1d758f03
Refactor avx512f: floating point abs
2024-06-30 14:55:25 +02:00
Tobias Decking
2c81a7ae33
Refactor avx512f: zeroing primitives
2024-06-30 14:55:25 +02:00
Tobias Decking
f5219be7ee
Refactor avx512f: integer comparison
2024-06-30 14:55:25 +02:00
Tobias Decking
883cedc230
Refactor avx512f: integers
2024-06-30 14:55:25 +02:00
Tobias Decking
0d9520dfd4
Refactor avx512f: sqrt + rounding fix
2024-06-30 14:55:25 +02:00
Tobias Decking
53ca30a4c8
Refactor avx512f: rounding fma
2024-06-30 14:55:25 +02:00
Tobias Decking
128866c97b
Refactor avx512f: fma
2024-06-30 14:55:25 +02:00
Jubilee Young
8b77e779cb
Remove has_cpuid
2024-06-29 19:38:42 +02:00
sayantn
d7ea407a28
Fixing CI
Fixed x86_64-apple-darwin freezing.
Bump all docker to Ubuntu-24.04 (except for emulated and armv7)
2024-06-29 19:16:48 +02:00
sayantn
818df2f7d0
Some fixes as asked by @Amanieu
2024-06-29 19:16:48 +02:00
sayantn
95d273aaf9
Fixed _mm512_kunpackb, reduce-max and reduce-min
`_mm512_kunpackb` was implemented incorrectly. `simd_reduce_max` uses `maxnum` for comparison, which adheres to IEEE 754, but Intel specifically says that these operations do NOT adhere to IEEE 754 for NaNs, so the generic reduction can give wrong results.
2024-06-29 19:16:48 +02:00
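The NaN issue described in the commit above can be illustrated with a minimal scalar sketch. It is not the code from the commit. The helper names `ieee_max_num` and `x86_style_max` are hypothetical, and the second function assumes the documented behaviour of the x86 `MAX*` instructions (return the first operand only if it compares greater, otherwise the second), which is why a NaN in the second position propagates.

```rust
/// IEEE 754 `maxNum`-style maximum: if exactly one operand is NaN,
/// the other operand is returned. `f32::max` already behaves this way.
fn ieee_max_num(a: f32, b: f32) -> f32 {
    a.max(b)
}

/// Hypothetical scalar model of the x86 MAXPS/MAXSS rule:
/// return `a` only when `a > b`; in every other case (including
/// any comparison involving NaN) return the second operand `b`.
fn x86_style_max(a: f32, b: f32) -> f32 {
    if a > b { a } else { b }
}

fn main() {
    let nan = f32::NAN;

    // maxNum ignores a single NaN regardless of its position.
    println!("ieee(NaN, 1.0) = {}", ieee_max_num(nan, 1.0));
    println!("ieee(1.0, NaN) = {}", ieee_max_num(1.0, nan));

    // The x86-style rule is order-dependent when a NaN is present.
    println!("x86(NaN, 1.0)  = {}", x86_style_max(nan, 1.0));
    println!("x86(1.0, NaN)  = {}", x86_style_max(1.0, nan));
}
```

Because the two rules disagree whenever a NaN sits in the second operand, a reduction built on `maxnum` cannot be assumed to match a sequence of hardware max operations on inputs that contain NaNs.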
sayantn
fa22a9aeda
Add the missing BMI1, SSE2, SSE4.1 and AVX2 intrinsics
2024-06-29 19:16:48 +02:00
sayantn
d65d1a8ae6
Fixed some more intrinsics
Added some tests, fixed incorrect target-features, and verification code for target-features. Removed all MMX support from verification.
2024-06-29 19:16:48 +02:00
sayantn
ad7cf91833
Fixed many intrinsics
Fixed reduce-add and reduce-mul, and load/store of mask32 and mask64. Added preserves-flags to mov asm. Fixed the missing list. Fixed `_mm_loadu_si64`. Added `assert_instr`.
2024-06-29 19:16:48 +02:00
sayantn
043f3cc280
Upgraded disassembly to include windows-gnu targets
2024-06-29 19:16:48 +02:00
sayantn
d26d3a7481
Update Intrinsics list
Updated the intrinsics list from version 3.4 to 3.6.8. Added a missing-x86.md file to track progress.
2024-06-29 19:16:48 +02:00
Mathilda
e2d9ac5145
Fix documentation of arguments of function core::arch::x86::_mm_blendv_epi8
2024-06-27 16:37:53 +02:00
Jayesskay
8e3abdc290
Fix _mm256_bsrli_epi128 producing invalid lower lane when IMM8 = 15
2024-06-27 16:03:46 +02:00
daxpedda
4d4fca0ac5
Assign Rust v1.81.0
2024-06-27 15:40:34 +02:00
daxpedda
1dee2095c1
Add unsigned aliases
2024-06-27 15:40:34 +02:00
daxpedda
23ad39c916
Stabilize Wasm relaxed SIMD
2024-06-27 15:40:34 +02:00
Tobias Decking
2fd58a7ac7
Use generic simd for avx512 popcnt
2024-06-23 10:14:32 +02:00
Ralf Jung
90d47e9c71
set asm attributes
2024-06-21 16:59:03 +02:00
Ralf Jung
5982c0838a
fix test_mm512_stream_ps test
2024-06-21 16:59:03 +02:00
Ralf Jung
5c0744a3e5
non-temporal stores: use inline assembly
2024-06-21 16:59:03 +02:00
Tobias Decking
36852a1264
Update avx2.rs
2024-06-21 14:52:17 +02:00
Ralf Jung
5e7dada0eb
addcarryx: use pointers of the right type
2024-06-21 11:59:57 +02:00