fix `-Zsanitizer=kcfi` on `#[naked]` functions
fixes https://github.com/rust-lang/rust/issues/143266
With `-Zsanitizer=kcfi`, indirect calls happen via generated intermediate shim that forwards the call. The generated shim preserves the attributes of the original, including `#[unsafe(naked)]`. The shim is not a naked function though, and violates its invariants (like having a body that consists of a single `naked_asm!` call).
My fix here is to match on the `InstanceKind`, and only use `codegen_naked_asm` when the instance is not a `ReifyShim`. That does beg the question whether there are other `InstanceKind`s that could come up. As far as I can tell the answer is no: calling via `dyn` seems to work find, and `#[track_caller]` is disallowed in combination with `#[naked]`.
r? codegen
````@rustbot```` label +A-naked
cc ````@maurer```` ````@rcvalle````
Various refactors to the LTO handling code
In particular reducing the sharing of code paths between fat and thin-LTO and making the fat LTO implementation more self-contained. This also moves some autodiff handling out of cg_ssa into cg_llvm given that Enzyme only works with LLVM anyway and an implementation for another backend may do things entirely differently. This will also make it a bit easier to split LTO handling out of the coordinator thread main loop into a separate loop, which should reduce the complexity of the coordinator thread.
Allow custom default address spaces and parse `p-` specifications in the datalayout string
Some targets, such as CHERI, use as default an address space different from the "normal" default address space `0` (in the case of CHERI, [200 is used](https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-877.pdf)). Currently, `rustc` does not allow to specify custom address spaces and does not take into consideration [`p-` specifications in the datalayout string](https://llvm.org/docs/LangRef.html#langref-datalayout).
This patch tries to mitigate these problems by allowing targets to define a custom default address space (while keeping the default value to address space `0`) and adding the code to parse the `p-` specifications in `rustc_abi`. The main changes are that `TargetDataLayout` now uses functions to refer to pointer-related informations, instead of having specific fields for the size and alignment of pointers in the default address space; furthermore, the two `pointer_size` and `pointer_align` fields in `TargetDataLayout` are replaced with an `FxHashMap` that holds info for all the possible address spaces, as parsed by the `p-` specifications.
The potential performance drawbacks of not having ad-hoc fields for the default address space will be tested in this PR's CI run.
r? workingjubilee
Most uses of it either contain a fat or thin lto module. Only
WorkItem::LTO could contain both, but splitting that enum variant
doesn't complicate things much.
Rollup of 9 pull requests
Successful merges:
- rust-lang/rust#143019 (Ensure -V --verbose processes both codegen_backend and codegen-backend)
- rust-lang/rust#143140 (give Pointer::into_parts a more scary name and offer a safer alternative)
- rust-lang/rust#143175 (Make combining LLD with external LLVM config a hard error)
- rust-lang/rust#143180 (Use `tracing-forest` instead of `tracing-tree` for bootstrap tracing)
- rust-lang/rust#143223 (Improve macro stats printing)
- rust-lang/rust#143228 (Handle build scripts better in `-Zmacro-stats` output.)
- rust-lang/rust#143229 ([COMPILETEST-UNTANGLE 1/N] Move some some early config checks to the lib and move the compiletest binary)
- rust-lang/rust#143246 (Subtree update of `rust-analyzer`)
- rust-lang/rust#143248 (Update books)
r? `@ghost`
`@rustbot` modify labels: rollup
give Pointer::into_parts a more scary name and offer a safer alternative
`into_parts` is a bit too innocent of a name for a somewhat subtle operation.
r? `@oli-obk`
Add SIMD funnel shift and round-to-even intrinsics
This PR adds 3 new SIMD intrinsics
- `simd_funnel_shl` - funnel shift left
- `simd_funnel_shr` - funnel shift right
- `simd_round_ties_even` (vector version of `round_ties_even_fN`)
TODO (future PR): implement `simd_fsh{l,r}` in miri, cg_gcc and cg_clif (it is surprisingly hard to implement without branches, the common tricks that rotate uses doesn't work because we have 2 elements now. e.g, the `-n&31` trick used by cg_gcc to implement rotate doesn't work with this because then `fshl(a, b, 0)` will be `a | b`)
[#t-compiler > More SIMD intrinsics](https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/More.20SIMD.20intrinsics/with/522130286)
`@rustbot` label T-compiler T-libs A-intrinsics F-core_intrinsics
r? `@workingjubilee`
Change __rust_no_alloc_shim_is_unstable to be a function
This fixes a long sequence of issues:
1. A customer reported that building for Arm64EC was broken: #138541
2. This was caused by a bug in my original implementation of Arm64EC support, namely that only functions on Arm64EC need to be decorated with `#` but Rust was decorating statics as well.
3. Once I corrected Rust to only decorate functions, I started linking failures where the linker couldn't find statics exported by dylib dependencies. This was caused by the compiler not marking exported statics in the generated DEF file with `DATA`, thus they were being exported as functions not data.
4. Once I corrected the way that the DEF files were being emitted, the linker started failing saying that it couldn't find `__rust_no_alloc_shim_is_unstable`. This is because the MSVC linker requires the declarations of statics imported from other dylibs to be marked with `dllimport` (whereas it will happily link to functions imported from other dylibs whether they are marked `dllimport` or not).
5. I then made a change to ensure that `__rust_no_alloc_shim_is_unstable` was marked as `dllimport`, but the MSVC linker started emitting warnings that `__rust_no_alloc_shim_is_unstable` was marked as `dllimport` but was declared in an obj file. This is a harmless warning which is a performance hint: anything that's marked `dllimport` must be indirected via an `__imp` symbol so I added a linker arg in the target to suppress the warning.
6. A customer then reported a similar warning when using `lld-link` (<https://github.com/rust-lang/rust/pull/140176#issuecomment-2872448443>). I don't think it was an implementation difference between the two linkers but rather that, depending on the obj that the declaration versus uses of `__rust_no_alloc_shim_is_unstable` landed in we would get different warnings, so I suppressed that warning as well: #140954.
7. Another customer reported that they weren't using the Rust compiler to invoke the linker, thus these warnings were breaking their build: <https://github.com/rust-lang/rust/pull/140176#issuecomment-2881867433>. At that point, my original change was reverted (#141024) leaving Arm64EC broken yet again.
Taking a step back, a lot of these linker issues arise from the fact that `__rust_no_alloc_shim_is_unstable` is marked as `extern "Rust"` in the standard library and, therefore, assumed to be a foreign item from a different crate BUT the Rust compiler may choose to generate it either in the current crate, some other crate that will be statically linked in OR some other crate that will by dynamically imported.
Worse yet, it is impossible while building a given crate to know if `__rust_no_alloc_shim_is_unstable` will statically linked or dynamically imported: it might be that one of its dependent crates is the one with an allocator kind set and thus that crate (which is compiled later) will decide depending if it has any dylib dependencies or not to import `__rust_no_alloc_shim_is_unstable` or generate it. Thus, there is no way to know if the declaration of `__rust_no_alloc_shim_is_unstable` should be marked with `dllimport` or not.
There is a simple fix for all this: there is no reason `__rust_no_alloc_shim_is_unstable` must be a static. It needs to be some symbol that must be linked in; thus, it could easily be a function instead. As a function, there is no need to mark it as `dllimport` when dynamically imported which avoids the entire mess above.
There may be a perf hit for changing the `volatile load` to be a `tail call`, so I'm happy to change that part back (although I question what the codegen of a `volatile load` would look like, and if the backend is going to try to use load-acquire semantics).
Build with this change applied BEFORE #140176 was reverted to demonstrate that there are no linking issues with either MSVC or MinGW: <https://github.com/rust-lang/rust/actions/runs/15078657205>
Incidentally, I fixed `tests/run-make/no-alloc-shim` to work with MSVC as I needed it to be able to test locally (FYI for #128602)
r? `@bjorn3`
cc `@jieyouxu`
Sized Hierarchy: Part I
This patch implements the non-const parts of rust-lang/rfcs#3729. It introduces two new traits to the standard library, `MetaSized` and `PointeeSized`. See the RFC for the rationale behind these traits and to discuss whether this change makes sense in the abstract.
These traits are unstable (as is their constness), so users cannot refer to them without opting-in to `feature(sized_hierarchy)`. These traits are not behind `cfg`s as this would make implementation unfeasible, there would simply be too many `cfg`s required to add the necessary bounds everywhere. So, like `Sized`, these traits are automatically implemented by the compiler.
RFC 3729 describes changes which are necessary to preserve backwards compatibility given the introduction of these traits, which are implemented and as follows:
- `?Sized` is rewritten as `MetaSized`
- `MetaSized` is added as a default supertrait for all traits w/out an explicit sizedness supertrait already.
There are no edition migrations implemented in this, as these are primarily required for the constness parts of the RFC and prior to stabilisation of this (and so will come in follow-up PRs alongside the const parts). All diagnostic output should remain the same (showing `?Sized` even if the compiler sees `MetaSized`) unless the `sized_hierarchy` feature is enabled.
Due to the use of unstable extern types in the standard library and rustc, some bounds in both projects have had to be relaxed already - this is unfortunate but unavoidable so that these extern types can continue to be used where they were before. Performing these relaxations in the standard library and rustc are desirable longer-term anyway, but some bounds are not as relaxed as they ideally would be due to the inability to relax `Deref::Target` (this will be investigated separately).
It is hoped that this is implemented such that it could be merged and these traits could exist "under the hood" without that being observable to the user (other than in any performance impact this has on the compiler, etc). Some details might leak through due to the standard library relaxations, but this has not been observed in test output.
**Notes:**
- Any commits starting with "upstream:" can be ignored, as these correspond to other upstream PRs that this is based on which have yet to be merged.
- This best reviewed commit-by-commit. I've attempted to make the implementation easy to follow and keep similar changes and test output updates together.
- Each commit has a short description describing its purpose.
- This patch is large but it's primarily in the test suite.
- I've worked on the performance of this patch and a few optimisations are implemented so that the performance impact is neutral-to-minor.
- `PointeeSized` is a different name from the RFC just to make it more obvious that it is different from `std::ptr::Pointee` but all the names are yet to be bikeshed anyway.
- `@nikomatsakis` has confirmed [that this can proceed as an experiment from the t-lang side](https://rust-lang.zulipchat.com/#narrow/channel/435869-project-goals/topic/SVE.20and.20SME.20on.20AArch64.20.28goals.23270.29/near/506196491)
- FCP in https://github.com/rust-lang/rust/pull/137944#issuecomment-2912207485Fixesrust-lang/rust#79409.
r? `@ghost` (I'll discuss this with relevant teams to find a reviewer)
Move metadata object generation for dylibs to the linker code
This deduplicates some code between codegen backends and may in the future allow adding extra metadata that is only known at link time.
Prerequisite of https://github.com/rust-lang/rust/issues/96708.
Simplify implementation of Rust intrinsics by using type parameters in the cache
The current implementation of intrinsics have a lot of duplication to handle different overloads of overloaded LLVM intrinsic. This PR uses the **base name and the type parameters** in the cache instead of the full, overloaded name. This has the benefit that `call_intrinsic` doesn't need to provide the full name, rather the type parameters (which is most of the time more available). This uses `LLVMIntrinsicCopyOverloadedName2` to get the overloaded name from the base name and the type parameters, and only uses it to declare the function.
(originally was part of rust-lang/rust#140763, split off later)
`@rustbot` label A-codegen A-LLVM
r? codegen
Unimplement unsized_locals
Implements https://github.com/rust-lang/compiler-team/issues/630
Tracking issue here: https://github.com/rust-lang/rust/issues/111942
Note that this just removes the feature, not the implementation, and does not touch `unsized_fn_params`. This is because it is required to support `Box<dyn FnOnce()>: FnOnce()`.
There may be more that should be removed (possibly in follow up prs)
- the `forget_unsized` function and `forget` intrinsic.
- the `unsized_locals` test directory; I've just fixed up the tests for now
- various codegen support for unsized values and allocas
cc ``@JakobDegen`` ``@oli-obk`` ``@Noratrieb`` ``@programmerjake`` ``@bjorn3``
``@rustbot`` label F-unsized_locals
Fixesrust-lang/rust#79409