fix(debuginfo): disable overflow check for recursive non-enum types
Commit b10edb4 introduce an overflow check when generating debuginfo for expanding recursive types. While this check works correctly for enums, it can incorrectly prune valid debug information for structures.
For example see rust-lang/rust#143241 (https://github.com/rust-lang/rust/issues/143241#issuecomment-3073721477). Furthermore, for structures such check does not make sense, since structures with recursively expanding types simply will not compile (there is a `hir_analysis_recursive_generic_parameter` for that).
closesrust-lang/rust#143241
Rollup of 7 pull requests
Successful merges:
- rust-lang/rust#144072 (update `Atomic*::from_ptr` and `Atomic*::as_ptr` docs)
- rust-lang/rust#144151 (`tests/ui/issues/`: The Issues Strike Back [1/N])
- rust-lang/rust#144300 (Clippy fixes for miropt-test-tools)
- rust-lang/rust#144399 (Add a ratchet for moving all standard library tests to separate packages)
- rust-lang/rust#144472 (str: Mark unstable `round_char_boundary` feature functions as const)
- rust-lang/rust#144503 (Various refactors to the codegen coordinator code (part 3))
- rust-lang/rust#144530 (coverage: Infer `instances_used` from `pgo_func_name_var_map`)
r? `@ghost`
`@rustbot` modify labels: rollup
coverage: Infer `instances_used` from `pgo_func_name_var_map`
In obscure circumstances involving macro-expanded spans, we would sometimes emit a covfun record for a function with no physical coverage counters, and therefore no corresponding entry in the “PGO names” section of the binary. The absence of that name entry causes `llvm-cov` to fail with the cryptic error message:
```text
malformed instrumentation profile data: function name is empty
```
We can eliminate this mismatch by removing `instances_used` entirely, and instead inferring its contents from the keys of `pgo_func_name_var_map`.
This makes it impossible for a "used" function to lack a PGO name entry.
---
This is an attempt to eliminate the cause of rust-lang/rust#141577 when re-landing changes like rust-lang/rust#144298 in the future.
I haven't been able to reproduce the underlying issue in an in-tree test, because the only known repro involves a non-trivial derive proc-macro that relies on `syn` and `proc-macro2`. But I have manually verified in a separate branch that this change would have prevented the reoccurrence of https://github.com/rust-lang/rust/issues/141577#issuecomment-3120667286.
Various refactors to the codegen coordinator code (part 3)
Continuing from https://github.com/rust-lang/rust/pull/144062 this removes an option without any known users, uses the object crate in favor of LLVM for getting the LTO bitcode and improves the coordinator channel handling.
In obscure circumstances, we would sometimes emit a covfun record for a
function with no physical coverage counters, causing `llvm-cov` to fail with
the cryptic error message:
malformed instrumentation profile data: function name is empty
We can eliminate this mismatch by removing `instances_used` entirely, and
instead inferring its contents from the keys of `pgo_func_name_var_map`.
This makes it impossible for a "used" function to lack a PGO name entry.
Unify LLVM ctlz/cttz intrinsic generation
The type signature for `llvm.ctlz` is the same as the one for `llvm.cttz`, which enables a small simplification.
Nobody seems to actually use this, while still adding some extra
complexity to the already rather complex codegen coordinator code.
It is also not supported by any backend other than the LLVM backend.
musl does not implement the symbols required by std for f128 maths.
Disable the associated cfg for all musl targets and adjust the tests
accordingly.
Signed-off-by: Jens Reidel <adrian@travitia.xyz>
Various refactors to the LTO handling code (part 2)
Continuing from https://github.com/rust-lang/rust/pull/143388 this removes a bit of dead code and moves the LTO symbol export calculation from individual backends to cg_ssa.
gpu offload host code generation
r? ghost
This will generate most of the host side code to use llvm's offload feature.
The first PR will only handle automatic mem-transfers to and from the device.
So if a user calls a kernel, we will copy inputs back and forth, but we won't do the actual kernel launch.
Before merging, we will use LLVM's Info infrastructure to verify that the memcopies match what openmp offloa generates in C++. `LIBOMPTARGET_INFO=-1 ./my_rust_binary` should print that a memcpy to and later from the device is happening.
A follow-up PR will generate the actual device-side kernel which will then do computations on the GPU.
A third PR will implement manual host2device and device2host functionality, but the goal is to minimize cases where a user has to overwrite our default handling due to performance issues.
I'm trying to get a full MVP out first, so this just recognizes GPU functions based on magic names. The final frontend will obviously move this over to use proper macros, like I'm already doing it for the autodiff work.
This work will also be compatible with std::autodiff, so one can differentiate GPU kernels.
Tracking:
- https://github.com/rust-lang/rust/issues/131513
Fixes for LLVM 21
This fixes compatibility issues with LLVM 21 without performing the actual upgrade. Split out from https://github.com/rust-lang/rust/pull/143684.
This fixes three issues:
* Updates the AMDGPU data layout for address space 8.
* Makes emit-arity-indicator.rs a no_core test, so it doesn't fail on non-x86 hosts.
* Explicitly sets the exception model for wasm, as this is no longer implied by `-wasm-enable-eh`.
fix `-Zsanitizer=kcfi` on `#[naked]` functions
fixes https://github.com/rust-lang/rust/issues/143266
With `-Zsanitizer=kcfi`, indirect calls happen via generated intermediate shim that forwards the call. The generated shim preserves the attributes of the original, including `#[unsafe(naked)]`. The shim is not a naked function though, and violates its invariants (like having a body that consists of a single `naked_asm!` call).
My fix here is to match on the `InstanceKind`, and only use `codegen_naked_asm` when the instance is not a `ReifyShim`. That does beg the question whether there are other `InstanceKind`s that could come up. As far as I can tell the answer is no: calling via `dyn` seems to work find, and `#[track_caller]` is disallowed in combination with `#[naked]`.
r? codegen
````@rustbot```` label +A-naked
cc ````@maurer```` ````@rcvalle````
Various refactors to the LTO handling code
In particular reducing the sharing of code paths between fat and thin-LTO and making the fat LTO implementation more self-contained. This also moves some autodiff handling out of cg_ssa into cg_llvm given that Enzyme only works with LLVM anyway and an implementation for another backend may do things entirely differently. This will also make it a bit easier to split LTO handling out of the coordinator thread main loop into a separate loop, which should reduce the complexity of the coordinator thread.
Add --compile-time-deps argument for x check
Together with skipping building C++ code in rustc_llvm for check, this reduces the amount of time it takes to do the x check for rust-analyzer analysis from 12m16s to 3m06s when the bootstrap compiler is already downloaded.
Make some "safe" llvm ops actually sound
Noticed while doing other refactorings
it may cause some extra unnecessary allocations, but the current use sites are rare ones anyway