1075 Commits

Author SHA1 Message Date
Scott McMurray
1bf1eff5cc Move entirely to array-based SIMD
See MCP#621

This tries to make as few changes as possible -- it keeps the `new` functions taking all the parameters, for example.
2024-08-08 23:47:25 +01:00
Sayantan Chakraborty
0c304072bc Remove the dummy function 2024-08-04 12:10:17 +01:00
Sayantan Chakraborty
200905e0e9 Fix _mm_stream_si64 2024-08-03 22:58:47 +01:00
Jonas Fierlings
47068b1a06 Fix markdown list in docs 2024-08-03 19:24:37 +01:00
ziyizhang-1
50cd4ef0c5 initial commit to enable amx
AMX Intrinsics:

amx-tile:
  - _tile_loadconfig
  - _tile_storeconfig
  - _tile_loadd
  - _tile_release
  - _tile_stored
  - _tile_stream_loadd
  - _tile_zero
amx-int8:
  - _tile_dpbssd
  - _tile_dpbsud
  - _tile_dpbusd
  - _tile_dpbuud
amx-bf16:
  - _tile_dpbf16ps
amx-fp16:
  - _tile_dpfp16ps
amx-complex:
  - _tile_cmmimfp16ps
  - _tile_cmmrlfp16ps
2024-08-03 19:02:09 +01:00
sayantn
4a13560ede Update Intrinsics List to v3.6.9
Add `#[inline]` to avx512ifma intrinsics
Fix the test equality.
Remove the stability attributes in simd types and test functions
2024-07-26 12:20:06 +01:00
sayantn
3cf2b7d74f AVX512FP16 Part 9: Remaining avx512fp16 and avxneconvert 2024-07-26 12:20:06 +01:00
sayantn
318e9ec7e7 AVX512FP16 Part 8: Convert from f16 2024-07-26 12:20:06 +01:00
sayantn
cea6530177 AVX512FP16 Part 7: Convert to f16 2024-07-26 12:20:06 +01:00
sayantn
734355993e AVX512FP16 Part 6: Remaining
`cmpph`, `fpclass`, reduce, `blend`, `permutex`
2024-07-26 12:20:06 +01:00
sayantn
debe317dcf AVX512FP16 Part 5: FP-Support
`getexp`, `getmant`, `roundscale`, `scalef`, `reduce`
2024-07-26 12:20:06 +01:00
sayantn
c024ef206f AVX512FP16 Part 4: Math functions
Reciprocal, RSqrt, Sqrt, Max, Min
2024-07-26 12:20:06 +01:00
sayantn
b88dfd6c03 AVX512FP16 Part 3: FMA 2024-07-26 12:20:06 +01:00
sayantn
7be9f610e3 AVX512_FP16 Part 2: Complex Multiplication 2024-07-26 12:20:06 +01:00
sayantn
60dfe5f264 AVX512FP16 Part 1
Add-Sub-Mul-Div, Load-Store-Move, `comi`, `set`
2024-07-26 12:20:06 +01:00
sayantn
c878b773d5 AVX512FP16 Part 0: Types 2024-07-26 12:20:06 +01:00
daxpedda
a1ad6bf8be Move Wasm's relaxed SIMD to Rust v1.82 2024-07-25 16:38:08 +01:00
sayantn
74f53212a0 Stabilize simd_x86_updates 2024-07-25 16:07:35 +01:00
sayantn
aa84427fd4 Use LLVM intrinsics for masked load/stores, expand-loads and fp-class
Also, remove some redundant sse target-features from avx intrinsics
2024-07-14 20:26:09 +01:00
daxpedda
ba9e8be05e Revert "wasm32: Add simd128 to enabled features for relaxed intrinsics" 2024-07-14 12:00:23 +02:00
sayantn
aa001c3f3e Some small refactorings
Use llvm intrinsics for `vfpclassss` and `vfpclasssd`
Use `simd_insert` for `x86_polyfill`
2024-07-12 18:12:30 +02:00
Alex Crichton
bb2b4293b9 wasm32: Add simd128 to enabled features for relaxed intrinsics
It looks like LLVM requires that `simd128` is active to use these
intrinsics and `relaxed-simd` isn't implicitly enabling them. This is
probably something to fix at the LLVM layer as well but for now enable
both the `simd128` feature as well as the `relaxed-simd` feature to fix
things on our side.
2024-07-11 17:26:52 +02:00
sayantn
1e8a22c374 Fix Documentation 2024-07-08 00:32:43 +02:00
sayantn
1da646fcab Implement missing intrinsics in SSE4a and TBM
Add `extracti`, `inserti` and `bextri` intrinsics. Refactor TBM into 2 modules
2024-07-07 19:55:04 +02:00
Tobias Decking
7378b35fd0 Use generic simd in wasm intrinsics 2024-07-07 19:21:10 +02:00
Tobias Decking
bbb2ba5424 Refactor avx512bw: reduction operations 2024-07-06 12:07:29 +02:00
Tobias Decking
fe0a378499 Refactor avx512bw: mask operations 2024-07-06 12:07:29 +02:00
Tobias Decking
198a91e5db Refactor avx512bw: integer comparison 2024-07-06 12:07:29 +02:00
Tobias Decking
f1a1ec2921 Refactor avx512bw: max/min 2024-07-06 12:07:29 +02:00
Tobias Decking
9ad2a62245 Refactor avx512bw: saturating arithmetic 2024-07-06 12:07:29 +02:00
Tobias Decking
13063410dd Refactor avx512bw: avg + mulhi + abs 2024-07-06 12:07:29 +02:00
sayantn
268ac7fe92 Add detection for SHA512, SM3 and SM4
Cannot cross-verify with `cupid` because it does not support these features yet.
2024-07-06 11:29:28 +02:00
sayantn
c862e4e487 Added a bf16 type 2024-07-06 11:00:34 +02:00
sayantn
70fbc2e97c Implemented some missing functions
These cannot be linked with LLVM because Rust lacks the `bfloat16` and `i1` types, so inline asm was the only way.
2024-07-06 11:00:34 +02:00
sayantn
3de8e86491 Implemented the missing AVX512BF16 intrinsics 2024-07-06 11:00:34 +02:00
sayantn
f22fab559e Implemented VEX versions
Modified stdarch-test to accept VEX versions
2024-07-06 11:00:34 +02:00
sayantn
775dcaabde Implemented missing gather-scatters 2024-07-06 11:00:34 +02:00
sayantn
1c3b3b80c0 Fix the stream intrinsics
They should use platform-specific address management.
2024-07-06 11:00:34 +02:00
Tobias Decking
1f3264848f Fix incorrect reduction operations in avx512f 2024-07-02 12:19:20 +02:00
sayantn
ed1df99f03 Added support for AMD verification
Added a custom cpuid file for sde, which enables SSE4a, XOP, TBM and VP2INTERSECT. Fixed `xsave` tests
2024-06-30 21:45:56 +02:00
sayantn
fd948ee99d Updates SDE
Updated SDE to v9.33.0
Disabled `assert-instr` in emulated run
2024-06-30 21:45:56 +02:00
Tobias Decking
fcee4d8b16 Define remaining IFMA intrinsics 2024-06-30 15:47:18 +02:00
Tobias Decking
a56cc86a23 Use generic simd for avx512 leading zeros 2024-06-30 15:17:50 +02:00
Tobias Decking
d1004e0abd Refactor avx512f: mask operations 2024-06-30 14:55:25 +02:00
Tobias Decking
9f96670b7c Refactor avx512f: element extraction 2024-06-30 14:55:25 +02:00
Tobias Decking
9a1d758f03 Refactor avx512f: floating point abs 2024-06-30 14:55:25 +02:00
Tobias Decking
2c81a7ae33 Refactor avx512f: zeroing primitives 2024-06-30 14:55:25 +02:00
Tobias Decking
f5219be7ee Refactor avx512f: integer comparison 2024-06-30 14:55:25 +02:00
Tobias Decking
883cedc230 Refactor avx512f: integers 2024-06-30 14:55:25 +02:00
Tobias Decking
0d9520dfd4 Refactor avx512f: sqrt + rounding fix 2024-06-30 14:55:25 +02:00