retpoline and retpoline-external-thunk flags (target modifiers) to enable retpoline-related target features
`-Zretpoline` and `-Zretpoline-external-thunk` flags are target modifiers (tracked to be equal in linked crates).
* Enables target features for `-Zretpoline-external-thunk`:
`+retpoline-external-thunk`, `+retpoline-indirect-branches`, `+retpoline-indirect-calls`.
* Enables target features for `-Zretpoline`:
`+retpoline-indirect-branches`, `+retpoline-indirect-calls`.
These correspond to clang's `-mretpoline` and `-mretpoline-external-thunk` flags.
This PR also disallows specifying those target features manually (reported as a warning).
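The flag-to-feature mapping above can be sketched as a small lookup. This is only an illustration of the lists in this description: the `RetpolineFlag` enum and both function names are hypothetical and do not reflect how rustc represents these flags internally.

```rust
/// Hypothetical model of the two `-Z` flags (not rustc's internal type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum RetpolineFlag {
    Retpoline,
    RetpolineExternalThunk,
}

/// Target features enabled by each flag, as listed in the description above.
fn implied_target_features(flag: RetpolineFlag) -> &'static [&'static str] {
    match flag {
        RetpolineFlag::Retpoline => &[
            "retpoline-indirect-branches",
            "retpoline-indirect-calls",
        ],
        RetpolineFlag::RetpolineExternalThunk => &[
            "retpoline-external-thunk",
            "retpoline-indirect-branches",
            "retpoline-indirect-calls",
        ],
    }
}

/// True for features that are meant to be set through the flags only,
/// i.e. the ones a manual `-Ctarget-feature` request would be warned about.
fn is_flag_only_feature(name: &str) -> bool {
    implied_target_features(RetpolineFlag::RetpolineExternalThunk).contains(&name)
}

fn main() {
    for flag in [RetpolineFlag::Retpoline, RetpolineFlag::RetpolineExternalThunk] {
        println!("{:?} enables {:?}", flag, implied_target_features(flag));
    }
    println!("sse2 is flag-only: {}", is_flag_only_feature("sse2"));
}
```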
Issue: rust-lang/rust#116852
Stabilize keylocker
This PR stabilizes the feature flag `keylocker_x86` (tracking issue rust-lang/rust#134813).
# Public API
The 2 `x86` target features `kl` and `widekl`, and the associated intrinsics in stdarch.
These target features are very specialized, and are only used to signal the presence of the corresponding CPU instruction. They don't have any nontrivial interaction with the ABI (contrary to something like AVX), and serve the only purpose of enabling 11 stdarch intrinsics, all of which have been implemented and propagated to rustc via a stdarch submodule update.
Also, these were added way back in LLVM12, and as the minimum LLVM required for rustc is LLVM19, we are safe on that front too!
# Associated PRs
- rust-lang/rust#134814
- rust-lang/stdarch#1706
- rust-lang/rust#136831 (stdarch submodule update)
- rust-lang/stdarch#1795 (stabilizing the runtime detection and intrinsics)
- rust-lang/rust#141964 (stdarch submodule update for the stabilization of the runtime detection and intrinsics)
As all of the required tasks have been done (adding the target features to rustc, implementing their runtime detection in std_detect and implementing the associated intrinsics in core_arch), these target features can be stabilized now.
cc `@rust-lang/lang`
cc `@rust-lang/libs-api` for the intrinsics and runtime detection
I don't think anyone else worked on this feature, so no one else to ping, maybe cc `@Amanieu`. I will send the reference PR soon.
Stabilize `sha512`, `sm3` and `sm4` for x86
This PR stabilizes the feature flag `sha512_sm_x86` (tracking issue rust-lang/rust#126624).
# Public API
The 3 `x86` target features `sha512`, `sm3` and `sm4`, and the associated intrinsics in stdarch.
These target features are very specialized, and are only used to signal the presence of the corresponding CPU instruction. They don't have any nontrivial interaction with the ABI (contrary to something like AVX), and serve the only purpose of enabling 10 stdarch intrinsics, all of which have been implemented and propagated to rustc via a stdarch submodule update.
Also, these were added in LLVM17, and as the minimum LLVM required for rustc is LLVM19, we are safe on that front too!
# Associated PRs
- rust-lang/rust#126704
- rust-lang/stdarch#1592
- rust-lang/stdarch#1790
- rust-lang/rust#140389 (stdarch submodule update)
- rust-lang/stdarch#1796 (stabilizing the runtime detection and intrinsics)
- rust-lang/rust#141964 (stdarch submodule update for the stabilization of the runtime detection and intrinsics)
As all of the required tasks have been done (adding the target features to rustc, implementing their runtime detection in std_detect and implementing the associated intrinsics in core_arch), these target features can be stabilized now.
cc `@rust-lang/lang`
cc `@rust-lang/libs-api` for the intrinsics and runtime detection
I don't think anyone else worked on this feature, so no one else to ping, maybe cc `@Amanieu`. I will send the reference PR soon.
Add the AVX10 target features
Parent #138843
Adds the `avx10_target_feature` feature gate, and `avx10.1` and `avx10.2` target features.
It is confirmed that Intel is dropping AVX10/256 (see [this comment](https://github.com/rust-lang/rust/issues/111137#issuecomment-2795442288)), so this should be safe to implement now.
The LLVM fix for llvm/llvm-project#135394 has been merged and backported to LLVM20, and the patch has also been propagated to rustc in #140502.
`@rustbot` label O-x86_64 O-x86_32 A-target-feature A-SIMD
Remove `avx512dq` and `avx512vl` implication for `avx512fp16`
According to Intel, `avx512fp16` requires only `avx512bw`, but LLVM also enables `avx512vl` and `avx512dq` when `avx512fp16` is active. This is relic code, and will be fixed in LLVM soon. We should remove this from Rust too as soon as possible, especially before the stabilization of AVX512.
Related:
- llvm/llvm-project#136209
- #138940
- rust-lang/stdarch#1781
- #111137
`@rustbot` label O-x86_64 O-x86_32 A-SIMD A-target-feature T-compiler -T-libs
r? `@Amanieu`
**Update: the LLVM fix has been merged**
cc `@rust-lang/wg-llvm` will it be possible to update the rustc LLVM version to something after llvm/llvm-project#137450?
rustc_target: RISC-V `Zfinx` is incompatible with `{ILP32,LP64}[FD]` ABIs
Because RISC-V Calling Conventions note that:
> This means code targeting the `Zfinx` extension always uses the ILP32, ILP32E or LP64 integer calling-convention only ABIs as there is no dedicated hardware floating-point register file.
`{ILP32,LP64}[FD]` ABIs with hardware floating-point calling conventions are incompatible with the `Zfinx` extension.
This commit adds `"zfinx"` to the incompatible feature list for those ABIs and tests whether trying to add `"zdinx"` (which is analogous to `"zfinx"` but in double precision) on an LP64D ABI configuration results in an error (it also tests extension implication; `Zdinx` requires the `Zfinx` extension).
Links: RISC-V psABI specification version 1.0
<https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/v1.0/riscv-cc.adoc#named-abis>
<https://github.com/riscv-non-isa/riscv-elf-psabi-doc/releases/tag/v1.0>
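A minimal sketch of the check described above, assuming a simplified model: the `Abi` enum, `expand_features`, and `check_abi_compat` are hypothetical names for illustration and are not the actual `rustc_target` API. It models only the `Zdinx` → `Zfinx` implication and the rule that hardware floating-point ABIs reject `Zfinx`.

```rust
use std::collections::BTreeSet;

/// Hypothetical model of the named RISC-V ABIs from the psABI document.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Abi {
    Ilp32,
    Ilp32e,
    Lp64,
    Ilp32f,
    Ilp32d,
    Lp64f,
    Lp64d,
}

impl Abi {
    /// True for the `{ILP32,LP64}[FD]` ABIs, which pass floats in FP registers.
    fn uses_fp_registers(self) -> bool {
        matches!(self, Abi::Ilp32f | Abi::Ilp32d | Abi::Lp64f | Abi::Lp64d)
    }
}

/// Expands the requested features with the one implication modeled here:
/// `zdinx` requires `zfinx`.
fn expand_features(requested: &[&str]) -> BTreeSet<String> {
    let mut enabled: BTreeSet<String> = requested.iter().map(|f| f.to_string()).collect();
    if enabled.contains("zdinx") {
        enabled.insert("zfinx".to_string());
    }
    enabled
}

/// Rejects `zfinx` (directly or via `zdinx`) on hardware floating-point ABIs.
fn check_abi_compat(abi: Abi, requested: &[&str]) -> Result<(), String> {
    let enabled = expand_features(requested);
    if abi.uses_fp_registers() && enabled.contains("zfinx") {
        return Err(format!("`zfinx` is incompatible with the {:?} ABI", abi));
    }
    Ok(())
}

fn main() {
    // `zdinx` pulls in `zfinx`, so it is rejected on LP64D but fine on LP64.
    println!("{:?}", check_abi_compat(Abi::Lp64d, &["zdinx"]));
    println!("{:?}", check_abi_compat(Abi::Lp64, &["zdinx"]));
}
```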
Because RISC-V Calling Conventions note that:
> This means code targeting the Zfinx extension always uses the ILP32,
> ILP32E or LP64 integer calling-convention only ABIs as there is no
> dedicated hardware floating-point register file.
{ILP32,LP64}[FD] ABIs with hardware floating-point calling conventions
are incompatible with the "Zfinx" extension.
This commit adds "zfinx" to the incompatible feature list to those ABIs
and tests whether trying to add "zdinx" (that is analogous to "zfinx" but
in double-precision) on a LP64D ABI configuration results in an error
(it also tests extension implication; "Zdinx" requires "Zfinx" extension).
Link: RISC-V psABI specification version 1.0
<https://github.com/riscv-non-isa/riscv-elf-psabi-doc/releases/tag/v1.0>
This commit adds three ratified unprivileged RISC-V extensions related to
BFloat16 (BF16) handling.
Although they are far from stabilization due to ABI issues, they are
optional extensions of the RVA23U64 profile (application-class processor
profile) and are going to be discoverable from the Linux kernel
(as of version 6.15-rc4).
This commit mainly prepares runtime detection of those extensions.
This commit adds some of the RISC-V extensions that are a mandatory part of
the RVA23U64 profile (application-class processor profile) and are related to
memory/atomic constraints.
The Zic64b extension constrains the cache line to naturally-aligned 64 bytes
that would make certain memory operations (like zeroing the memory using
the Zicboz extension) easier.
The Zicbom and Zicbop extensions enable managing cache block-based
operations (the Zicbop contains hints that will work as a NOP when this
extension is absent and the Zicbom contains control instructions).
Of these, the Zicbom extension is going to be discoverable from the Linux
kernel (as of version 6.15-rc4) and this commit prepares for the
corresponding stdarch changes.
The Zicc* extensions add certain constraints to "the main memory" (usually
true on the user mode application on the application-class processor but
those extensions make sure such constraints exist).
The Za64rs extension (reservation set -- a primitive memory unit of LR/SC
atomic operations -- is naturally aligned and *at most* 64 bytes) is a
superset of the Za128rs extension (*at most* 128 bytes; note that the
smaller the reservation set is, the more fine-grained the control over
atomics).
This commit handles this as a feature implication.
rustc_target: Adjust RISC-V feature implication
This commit adjusts feature implication of the RISC-V ISA for better feature detection from the user perspective.
The main rule is:
* If the feature `A` is a functional superset of the feature `B` (`A ⊃ B`),
`A` is to imply `B`, even if this implication is not on the manual.
Such implications (not directly written in the ISA manual) are commented as `A ⊃ B`
which means "`A` is a (functional) superset of `B`".
1. `Zbc` → `Zbkc` (add as a superset)
The `Zbkc` extension is a subset of the `Zbc` extension (`Zbc` minus `clmulr` instruction).
2. `Zkr` → (nothing) (remove dependency to `Zicsr`)
Implication to the `Zicsr` extension is removed because (although nearly harmless), the `Zkr` extension (or the `seed` CSR section) defines its own subset of the `Zicsr` extension (guaranteed to work against the `seed` CSR which needs read/write access).
3. `Zvbb` → `Zvkb` (comment as a superset)
This implication was already there but not denoted as a functional superset. This commit adds the comment.
4. `Zvfh` → `Zvfhmin` (comment as a superset)
This is similar to the case above (`Zvbb` → `Zvkb`).
5. `Zvfh` → `Zve32f` (add implication per the ISA specification)
This dependency is in the ISA manual but was missing (because `Zvfh` indirectly implies `Zve32f` in the current implementation through `Zvfh` → `Zvfhmin`, which is a functional relation). This commit ensures that this is *also* ISA-compliant at the source code level (there are no functional changes though).
6. `Zvknhb` → `Zvknha` (add as a superset)
The `Zvknhb` extension (SHA-256 / SHA-512) is a functional superset of the `Zvknha` extension (SHA-256 only).
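The rule above amounts to a transitive closure over an implication table. The following is a rough sketch under that assumption; the `IMPLICATIONS` table holds only the relations listed in this description, and `with_implied` is a hypothetical helper, not the function rustc uses.

```rust
use std::collections::BTreeSet;

/// Only the implications discussed in this description; the real table in
/// rustc is much larger. `zkr` intentionally has no entry (no `zicsr`).
const IMPLICATIONS: &[(&str, &[&str])] = &[
    ("zbc", &["zbkc"]),               // Zbc ⊃ Zbkc
    ("zvbb", &["zvkb"]),              // Zvbb ⊃ Zvkb
    ("zvfh", &["zvfhmin", "zve32f"]), // Zvfh ⊃ Zvfhmin, plus the ISA dependency
    ("zvknhb", &["zvknha"]),          // Zvknhb ⊃ Zvknha
];

/// Returns the requested features plus everything they imply, using a
/// worklist so that chains of implications are followed to the end.
fn with_implied<'a>(requested: &[&'a str]) -> BTreeSet<&'a str> {
    let mut enabled: BTreeSet<&'a str> = BTreeSet::new();
    let mut pending: Vec<&'a str> = requested.to_vec();
    while let Some(feature) = pending.pop() {
        if !enabled.insert(feature) {
            continue; // already processed
        }
        for &(source, implied) in IMPLICATIONS {
            if source == feature {
                for &name in implied {
                    pending.push(name);
                }
            }
        }
    }
    enabled
}

fn main() {
    println!("{:?}", with_implied(&["zvfh"]));
    println!("{:?}", with_implied(&["zkr"]));
}
```

With this table, requesting `zvfh` also enables `zvfhmin` and `zve32f`, while `zkr` enables nothing beyond itself.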
This commit adjusts feature implication of the RISC-V ISA for better
feature detection from the user perspective.
The main rule is:
If the feature A is a functional superset of the feature B (A ⊃ B),
A is to imply B, even if this implication is not on the manual.
Such implications (not directly referred in the ISA manual) are commented
as "A ⊃ B" which means "A is a (functional) superset of B".
1. Zbc → Zbkc (add as a superset)
The Zbkc extension is a subset of the Zbc extension
(Zbc - "clmulr" instruction == Zbkc)
2. Zkr → (nothing) (remove dependency to Zicsr)
Implication to the Zicsr extension is removed because (although nearly
harmless), the Zkr extension (or the "seed" CSR section) defines its own
subset of the Zicsr extension.
3. Zvbb → Zvkb (comment as a superset)
This implication was already there but not denoted as a functional
superset. This commit adds the comment.
4. Zvfh → Zvfhmin (comment as a superset)
This is similar to the case above (Zvbb → Zvkb).
5. Zvfh → Zve32f (add implication per the ISA specification)
This dependency is on the ISA manual but was missing (due to the fact
that Zvfh indirectly implies Zve32f on the current implementation
through Zvfh → Zvfhmin, which is a functional relation).
This commit ensures that this is *also* ISA-compliant at the
source code level (there are no functional changes though).
6. Zvknhb → Zvknha (add as a superset)
The Zvknhb extension (SHA-256 / SHA-512) is a functional superset of
the Zvknha extension (SHA-256 only).
This commit adds unprivileged ratified extensions that are
discoverable from the `riscv_hwprobe` syscall of the Linux kernel (as of
version 6.14), plus 1 and minus 3 extensions.
Plus 1:
* "B"
This is a combination of "Zba", "Zbb" and "Zbs".
Note:
Although not required by the RISC-V specification, it is convenient to
imply "B" from its three members (will be implemented in LLVM 21/22) but
this is not yet implemented in Rust due to current implication handling.
It still implies three members *from* "B".
Minus 3:
* "Zcf" (target_arch = "riscv32" only)
This is the compression instruction subset corresponding to "F".
This is implied from RV32 + "C" + "F" but this complex handling is
not yet supported by Rust's feature handling.
* "Zcd"
This is the compression instruction subset corresponding to "D".
This is implied from "C" + "D" but this complex handling is
not yet supported by Rust's feature handling.
* "Supm"
Unlike regular RISC-V extensions, the "Supm" and "Sspm" extensions do not
provide any specific architectural features / constraints but require
*some* mechanism to control pointer masking for the current mode.
For instance, reported existence of the "Supm" extension in Linux means
that `prctl` system call to control pointer masking is available and
there are alternative ways to detect the existence.
Notes:
* Because this commit adds the "Zca" extension (an integer subset of the
"C" extension), the "C" extension is modified to imply "Zca".
rustc_target: RISC-V: add base `I`-related important extensions
Of the ratified RISC-V features defined, this commit adds extensions satisfying the following criteria:
* Formerly a part of the `I` extension and split out thereafter (now ratified as `I` + `Zifencei` + `Zicsr` + `Zicntr` + `Zihpm`) or
* Discoverable from newer versions of the Linux kernel and implemented as a part of `std_detect`'s feature (`Zihintpause`) and
* Available on LLVM 18.
This is based on [the latest ratified ISA Manuals (version 20240411)](https://lf-riscv.atlassian.net/wiki/spaces/HOME/pages/16154769/RISC-V+Technical+Specifications).
LLVM Definitions:
* [`Zifencei`](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L133-L137)
* [`Zicsr`](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L116-L120)
* [`Zicntr`](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L122-L124)
* [`Zihpm`](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L153-L155)
* [`Zihintpause`](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0/llvm/lib/Target/RISCV/RISCVFeatures.td#L139-L144)
Additional (1):
One of those, `Zicsr`, is a dependency of many other ISA extensions and this commit adds correct dependencies to `Zicsr`.
Additional (2):
In RISC-V, `G` is an abbreviation of the following extensions:
* `I`
* `M`
* `A`
* `F`
* `D`
* `Zicsr` (although implied by `F`)
* `Zifencei`
and all RISC-V targets with the `G` abbreviation and targets for Android / VxWorks are updated accordingly.
Note:
Android will require RVA22 (likely RVA22U64) and some more extensions, which is a superset of RV64GC. For VxWorks, all BSPs currently distributed by Wind River are for boards with RV64GC (this commit also updates `riscv32-wrs-vxworks` though).
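As a small illustration of the `G` abbreviation, the sketch below expands `g` into the seven members listed above. The helper names are hypothetical, and lowercase names are used only to mirror how target features are usually spelled; this is not how the target specifications are actually written.

```rust
/// Members of the `G` abbreviation, in the order listed above.
const G_MEMBERS: [&str; 7] = ["i", "m", "a", "f", "d", "zicsr", "zifencei"];

/// Appends `name` (lowercased) unless it is already present.
fn push_unique(out: &mut Vec<String>, name: &str) {
    let name = name.to_ascii_lowercase();
    if !out.contains(&name) {
        out.push(name);
    }
}

/// Replaces every `g` in the list with its members, keeping first-seen
/// order and dropping duplicates.
fn expand_g(extensions: &[&str]) -> Vec<String> {
    let mut out = Vec::new();
    for &ext in extensions {
        if ext.eq_ignore_ascii_case("g") {
            for member in G_MEMBERS {
                push_unique(&mut out, member);
            }
        } else {
            push_unique(&mut out, ext);
        }
    }
    out
}

fn main() {
    // RV64GC: `G` plus the compressed extension.
    println!("{:?}", expand_g(&["g", "c"]));
}
```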
--------
This is the version 4.
`Ztso`, which was in the original proposal, was removed in PR version 2 because of the minimum LLVM version (non-experimental `Ztso` requires LLVM 19 while the minimum LLVM version of Rust is 18). It is not back in PR versions 3 and 4, even after noticing that adding `Ztso` is possible by checking the host LLVM version, because PR version 3 introduces compiler target changes (and adding more extensions would complicate matters; sorry `Zihintpause`).
Version 4:
* Fixed some commit messages,
* Added Android / VxWorks targets to imply `G` and
* Added an implication from `Zve32x` to `Zicsr` (which makes all vector extension subsets to imply `Zicsr`)
since #138742 is now merged.
Related:
* #44839
(`riscv_target_feature`)
* #114544
(This PR can be a prerequisite of resolving a part of that tracking issue)
* #138742
(Touches the same place and vector extensions depend on `Zicsr`)
NOT Related but linked:
* #132618
(This PR won't be blocked by this issue since none of those extensions change the ABI)
`@rustbot` r? `@Amanieu`
`@rustbot` label +T-compiler +O-riscv +A-target-feature
Add the new `amx` target features and the `movrs` target feature
Adds 5 new `amx` target features included in LLVM20. These are guarded under `x86_amx_intrinsics` (#126622)
- `amx-avx512`
- `amx-fp8`
- `amx-movrs`
- `amx-tf32`
- `amx-transpose`
Adds the `movrs` target feature (from #137976).
`@rustbot` label O-x86_64 O-x86_32 T-compiler A-target-feature
r? `@Amanieu`
Of the ratified RISC-V features defined, this commit adds extensions
satisfying the following criteria:
* Formerly a part of the "I" extension and split out thereafter
(now ratified as "I" + "Zifencei" + "Zicsr" + "Zicntr" + "Zihpm") or
* Discoverable from newer versions of the Linux kernel and implemented
as a part of std_detect's feature ("Zihintpause").
This is based on the latest ratified ISA Manuals (version 20240411).
Additional (1):
One of those, "Zicsr", is a dependency of many other ISA extensions and
this commit adds correct dependencies to "Zicsr".
Additional (2):
In RISC-V, "G" is an abbreviation of the following extensions:
* "I"
* "M"
* "A"
* "F"
* "D"
* "Zicsr" (although implied by "F")
* "Zifencei"
and all RISC-V targets with the "G" abbreviation and targets for Android /
VxWorks are updated accordingly.
Note:
Android will require RVA22 (likely RVA22U64) and some more extensions,
which is a superset of RV64GC. For VxWorks, all BSPs currently distributed
by Wind River are for boards with RV64GC (this commit also updates
riscv32-wrs-vxworks though).
rustc_target: Add target features for LoongArch v1.1
This patch adds new target features for LoongArch v1.1:
* div32
* lam-bh
* lamcas
* ld-seq-sa
* scq
The target feature names are, for now, based on the LLVM target feature names. These mostly line up well with the [Facility Indications](https://publibfp.dhe.ibm.com/epubs/pdf/a227832d.pdf#page=301) names. The Linux kernel uses shorter, more cryptic names (e.g. "vector" is `vx`). We can deviate from the LLVM names, but the CPU vendor (IBM) does not appear to use e.g. `vx` for what they call `vector`.
There are a number of implied target features between the vector facilities (based on the [Facility Indications](https://publibfp.dhe.ibm.com/epubs/pdf/a227832d.pdf#page=301) table):
- 129 The vector facility for z/Architecture is installed in the z/Architecture architectural mode.
- 134 The vector packed decimal facility is installed in the z/Architecture architectural mode. When bit 134 is one, bit 129 is also one.
- 135 The vector enhancements facility 1 is installed in the z/Architecture architectural mode. When bit 135 is one, bit 129 is also one.
- 148 The vector-enhancements facility 2 is installed in the z/Architecture architectural mode. When bit 148 is one, bits 129 and 135 are also one.
- 152 The vector-packed-decimal-enhancement facility 1 is installed in the z/Architecture architectural mode. When bit 152 is one, bits 129 and 134 are also one.
- 165 The neural-network-processing-assist facility is installed in the z/Architecture architectural mode. When bit 165 is one, bit 129 is also one.
- 192 The vector-packed-decimal-enhancement facility 2 is installed in the z/Architecture architectural mode. When bit 192 is one, bits 129, 134, and 152 are also one.
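The "when bit N is one, bit M is also one" rules in this list can be written down as a table and checked mechanically. This is a sketch for illustration only; `IMPLIED_BITS` and `missing_implied_bits` are hypothetical names and the table covers just the facility bits quoted above.

```rust
/// (facility bit, bits that must also be one), taken from the list above.
const IMPLIED_BITS: &[(u16, &[u16])] = &[
    (134, &[129]),
    (135, &[129]),
    (148, &[129, 135]),
    (152, &[129, 134]),
    (165, &[129]),
    (192, &[129, 134, 152]),
];

/// Returns every `(bit, required_bit)` pair where `bit` is installed but
/// one of the bits it implies is not. An empty result means the set of
/// installed facilities is consistent with the table.
fn missing_implied_bits(installed: &[u16]) -> Vec<(u16, u16)> {
    let mut missing = Vec::new();
    for &(bit, required) in IMPLIED_BITS {
        if installed.contains(&bit) {
            for &needed in required {
                if !installed.contains(&needed) {
                    missing.push((bit, needed));
                }
            }
        }
    }
    missing
}

fn main() {
    // Consistent: 192 is accompanied by 129, 134 and 152.
    println!("{:?}", missing_implied_bits(&[129, 134, 152, 192]));
    // Inconsistent: 148 without 129 and 135.
    println!("{:?}", missing_implied_bits(&[148]));
}
```

The same table read in the other direction gives the target feature implications: enabling the feature for bit 148 would also enable the features for bits 129 and 135.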
And then there are a number of facilities without any implied target features:
- 45 The distinct-operands, fast-BCR-serialization, high-word, and population-count facilities, the interlocked-access facility 1, and the load/store-on-condition facility 1 are installed in the z/Architecture architectural mode.
- 73 The transactional-execution facility is installed in the z/Architecture architectural mode. Bit 49 is one when bit 73 is one.
- 133 The guarded-storage facility is installed in the z/Architecture architectural mode.
- 150 The enhanced-sort facility is installed in the z/Architecture architectural mode.
- 151 The DEFLATE-conversion facility is installed in the z/Architecture architectural mode.
The added target features are those that have ISA implications, can be queried at runtime, and have LLVM support. LLVM [defines more target features](d49a2d2bc9/llvm/lib/Target/SystemZ/SystemZFeatures.td), but I'm not sure those are useful. They can always be added later, and can already be set globally using `-Ctarget-feature`.