ziyizhang-1
50cd4ef0c5
initial commit to enable amx
...
AMX Intrinsics:
amx-tile:
- _tile_loadconfig
- _tile_storeconfig
- _tile_loadd
- _tile_release
- _tile_stored
- _tile_stream_loadd
- _tile_zero
amx-int8:
- _tile_dpbssd
- _tile_dpbsud
- _tile_dpbusd
- _tile_dpbuud
amx-bf16:
- _tile_dpbf16ps
amx-fp16:
- _tile_dpfp16ps
amx-complex:
- _tile_cmmimfp16ps
- _tile_cmmrlfp16ps
2024-08-03 19:02:09 +01:00
sayantn
4a13560ede
Update Intrinsics List to v3.6.9
...
Add `#[inline]` to avx512ifma intrinsics
Fix the test equality.
Remove the stability attributes in simd types and test functions
2024-07-26 12:20:06 +01:00
sayantn
3cf2b7d74f
AVX512FP16 Part 9: Remaining avx512fp16 and avxneconvert
2024-07-26 12:20:06 +01:00
sayantn
318e9ec7e7
AVX512FP16 Part 8: Convert from f16
2024-07-26 12:20:06 +01:00
sayantn
cea6530177
AVX512FP16 Part 7: Convert to f16
2024-07-26 12:20:06 +01:00
sayantn
734355993e
AVX512FP16 Part 6: Remaining
...
`cmpph`, `fpclass`, reduce, `blend`, `permutex`
2024-07-26 12:20:06 +01:00
sayantn
debe317dcf
AVX512FP16 Part 5: FP-Support
...
`getexp`, `getmant`, `roundscale`, `scalef`, `reduce`
2024-07-26 12:20:06 +01:00
sayantn
c024ef206f
AVX512FP16 Part 4: Math functions
...
Reciprocal, RSqrt, Sqrt, Max, Min
2024-07-26 12:20:06 +01:00
sayantn
b88dfd6c03
AVX512FP16 Part 3: FMA
2024-07-26 12:20:06 +01:00
sayantn
7be9f610e3
AVX512FP16 Part 2: Complex Multiplication
2024-07-26 12:20:06 +01:00
sayantn
60dfe5f264
AVX512FP16 Part 1
...
Add-Sub-Mul-Div, Load-Store-Move, `comi`, `set`
2024-07-26 12:20:06 +01:00
Tobias Decking
bbb2ba5424
Refactor avx512bw: reduction operations
2024-07-06 12:07:29 +02:00
Tobias Decking
fe0a378499
Refactor avx512bw: mask operations
2024-07-06 12:07:29 +02:00
sayantn
70fbc2e97c
Implemented some missing functions
...
These functions cannot be linked through LLVM because Rust lacks the `bfloat16` and `i1` types, so inline asm was the only option.
2024-07-06 11:00:34 +02:00
sayantn
3de8e86491
Implemented the missing AVX512BF16 intrinsics
2024-07-06 11:00:34 +02:00
sayantn
f22fab559e
Implemented VEX versions
...
Modified stdarch-test to accept VEX versions
2024-07-06 11:00:34 +02:00
sayantn
775dcaabde
Implemented missing gather-scatters
2024-07-06 11:00:34 +02:00
Tobias Decking
fcee4d8b16
Define remaining IFMA intrinsics
2024-06-30 15:47:18 +02:00
Tobias Decking
d1004e0abd
Refactor avx512f: mask operations
2024-06-30 14:55:25 +02:00
Tobias Decking
9f96670b7c
Refactor avx512f: element extraction
2024-06-30 14:55:25 +02:00
Tobias Decking
883cedc230
Refactor avx512f: integers
2024-06-30 14:55:25 +02:00
Tobias Decking
0d9520dfd4
Refactor avx512f: sqrt + rounding fix
2024-06-30 14:55:25 +02:00
sayantn
fa22a9aeda
Add the missing BMI1, SSE2, SSE4.1 and AVX2 intrinsics
2024-06-29 19:16:48 +02:00
sayantn
ad7cf91833
Fixed many intrinsics
...
Fixed reduce-add and reduce-mul, and the load/store of mask32 and mask64. Added preserves-flags to the mov asm. Fixed the missing list. Fixed `_mm_loadu_si64`. Added `assert_instr`.
2024-06-29 19:16:48 +02:00
sayantn
d26d3a7481
Update Intrinsics list
...
Updated the intrinsics list from version 3.4 to 3.6.8. Added a missing-x86.md file to track progress.
2024-06-29 19:16:48 +02:00