Enable f16 and f128 on targets that were fixed in LLVM21
LLVM21 fixed the new float types on a number of targets:
* SystemZ gained f16 support https://github.com/llvm/llvm-project/pull/109164
* Hexagon now uses soft f16 to avoid recursion bugs https://github.com/llvm/llvm-project/pull/130977
* Mips now correctly handles f128 (actually since LLVM20) https://github.com/llvm/llvm-project/pull/117525
* f128 is now correctly aligned when passing the stack on x86 https://github.com/llvm/llvm-project/pull/138092
Thus, enable the types on relevant targets for LLVM > 21.0.0.
NVPTX also gained handling of f128 as a storage type, but it lacks support for basic math operations so is still disabled here.
try-job: dist-i586-gnu-i586-i686-musl
try-job: dist-i686-linux
try-job: dist-i686-msvc
try-job: dist-s390x-linux
try-job: dist-various-1
try-job: dist-various-2
try-job: dist-x86_64-linux
try-job: i686-gnu-1
try-job: i686-gnu-2
try-job: i686-msvc-1
try-job: i686-msvc-2
try-job: test-various
atomicrmw on pointers: move integer-pointer cast hacks into backend
Conceptually, we want to have atomic operations on pointers of the form `fn atomic_add(ptr: *mut T, offset: usize, ...)`. However, LLVM does not directly support such operations (https://github.com/llvm/llvm-project/issues/120837), so we have to cast the `offset` to a pointer somewhere.
This PR moves that hack into the LLVM backend, so that the standard library, intrinsic, and Miri all work with the conceptual operation we actually want. Hopefully, one day LLVM will gain a way to represent these operations without integer-pointer casts, and then the hack will disappear entirely.
Cc ```@nikic``` -- this is the best we can do right now, right?
Fixes https://github.com/rust-lang/rust/issues/134617
coverage: Remove all unstable support for MC/DC instrumentation
Preliminary support for a partial implementation of “Modified Condition/Decision Coverage” instrumentation was added behind the unstable flag `-Zcoverage-options=mcdc` in 2024. These are the most substantial PRs involved:
- rust-lang/rust#123409
- rust-lang/rust#126733
At the time, I accepted these PRs with relatively modest scrutiny, because I did not want to stand in the way of independent work on MC/DC instrumentation. My hope was that ongoing work by interested contributors would lead to the code becoming clearer and more maintainable over time.
---
However, that MC/DC code has proven itself to be a major burden on overall maintenance of coverage instrumentation, and a major obstacle to other planned improvements, such as internal changes needed for proper support of macro expansion regions.
I have also become reluctant to accept any further MC/DC-related changes that would increase this burden.
That tension has resulted in an unhappy impasse. On one hand, the present MC/DC implementation is not yet complete, and shows little sign of being complete at an acceptable level of code quality in the foreseeable future. On the other hand, the continued existence of this partial MC/DC implementation is imposing serious maintenance burdens on every other aspect of coverage instrumentation, and is preventing some of the very improvements that would make it easier to accept expanded MC/DC support in the future.
While I know this will be disappointing to some, I think the healthy way forward is accept that I made the wrong call in accepting the current implementation, and to remove it entirely from the compiler.
The implementation of the linkage attribute inside extern blocks defines
symbols starting with _rust_extern_with_linkage_. If someone tries to
also define this symbol you will get a symbol conflict or even an ICE.
By adding an unpredictable component to the symbol name, this becomes
less of an issue.
Instead of collecting pretty printers transitively when building
executables/staticlibs/cdylibs, let the debugger find each crate's
pretty printers via its .debug_gdb_scripts section. This covers the case
where libraries defining custom pretty printers are loaded dynamically.
Make sure that compiler and linker don't optimize the section's contents
away by adding the global holding the data to "llvm.used". The volatile
load in the main shim is retained because "llvm.used", which translates
to SHF_GNU_RETAIN on ELF targets, requires a reasonably recent linker;
emitting the volatile load ensures compatibility with older linkers, at
least when libstd is used.
Pretty printers in dylib dependencies are now emitted by the main crate
instead of the dylib; apart from matching how rlibs are handled, this
approach has the advantage that `omit_gdb_pretty_printer_section` keeps
working with dylib dependencies.
Disabling loading of pretty printers in the debugger itself is more
reliable. Before this commit the .gdb_debug_scripts section couldn't be
included in dylibs or rlibs as otherwise there is no way to disable the
section anymore without recompiling the entire standard library.
Deduplicate `IntTy`/`UintTy`/`FloatTy`.
There are identical definitions in `rustc_type_ir` and `rustc_ast`. This commit removes them and places a single definition in `rustc_ast_ir`. This requires adding `rust_span` as a dependency of `rustc_ast_ir`, but means a bunch of silly conversion functions can be removed.
r? `@fmease`
coverage: Re-land "Enlarge empty spans during MIR instrumentation"
This allows us to assume that coverage spans will only be discarded during codegen in very unusual situations.
---
This seemingly-simple change has a rather messy history:
- rust-lang/rust#140847
- rust-lang/rust#141650
- rust-lang/rust#144298
- rust-lang/rust#144480
Since then, a number of related changes have landed that should make it reasonable to try again:
- rust-lang/rust#144530
- rust-lang/rust#144560
- rust-lang/rust#144616
In particular, we have multiple fixes/mitigations, and a confirmed regression test for the original bug that is not triggered by re-landing the changes in this PR.
Implement support for `become` and explicit tail call codegen for the LLVM backend
This PR implements codegen of explicit tail calls via `become` in `rustc_codegen_ssa` and support within the LLVM backend. Completes a task on (https://github.com/rust-lang/rust/issues/112788). This PR implements all the necessary bits to make explicit tail calls usable, other backends have received stubs for now and will ICE if you use `become` on them. I suspect there is some bikeshedding to be done on how we should go about implementing this for other backends, but it should be relatively straightforward for GCC after this is merged.
During development I also put together a POC bytecode VM based on tail call dispatch to test these changes out and analyze the codegen to make sure it generates expected assembly. That is available [here](https://github.com/xacrimon/tcvm).
fix(debuginfo): disable overflow check for recursive non-enum types
Commit b10edb4 introduce an overflow check when generating debuginfo for expanding recursive types. While this check works correctly for enums, it can incorrectly prune valid debug information for structures.
For example see rust-lang/rust#143241 (https://github.com/rust-lang/rust/issues/143241#issuecomment-3073721477). Furthermore, for structures such check does not make sense, since structures with recursively expanding types simply will not compile (there is a `hir_analysis_recursive_generic_parameter` for that).
closesrust-lang/rust#143241
Rollup of 7 pull requests
Successful merges:
- rust-lang/rust#144072 (update `Atomic*::from_ptr` and `Atomic*::as_ptr` docs)
- rust-lang/rust#144151 (`tests/ui/issues/`: The Issues Strike Back [1/N])
- rust-lang/rust#144300 (Clippy fixes for miropt-test-tools)
- rust-lang/rust#144399 (Add a ratchet for moving all standard library tests to separate packages)
- rust-lang/rust#144472 (str: Mark unstable `round_char_boundary` feature functions as const)
- rust-lang/rust#144503 (Various refactors to the codegen coordinator code (part 3))
- rust-lang/rust#144530 (coverage: Infer `instances_used` from `pgo_func_name_var_map`)
r? `@ghost`
`@rustbot` modify labels: rollup
coverage: Infer `instances_used` from `pgo_func_name_var_map`
In obscure circumstances involving macro-expanded spans, we would sometimes emit a covfun record for a function with no physical coverage counters, and therefore no corresponding entry in the “PGO names” section of the binary. The absence of that name entry causes `llvm-cov` to fail with the cryptic error message:
```text
malformed instrumentation profile data: function name is empty
```
We can eliminate this mismatch by removing `instances_used` entirely, and instead inferring its contents from the keys of `pgo_func_name_var_map`.
This makes it impossible for a "used" function to lack a PGO name entry.
---
This is an attempt to eliminate the cause of rust-lang/rust#141577 when re-landing changes like rust-lang/rust#144298 in the future.
I haven't been able to reproduce the underlying issue in an in-tree test, because the only known repro involves a non-trivial derive proc-macro that relies on `syn` and `proc-macro2`. But I have manually verified in a separate branch that this change would have prevented the reoccurrence of https://github.com/rust-lang/rust/issues/141577#issuecomment-3120667286.
Various refactors to the codegen coordinator code (part 3)
Continuing from https://github.com/rust-lang/rust/pull/144062 this removes an option without any known users, uses the object crate in favor of LLVM for getting the LTO bitcode and improves the coordinator channel handling.
In obscure circumstances, we would sometimes emit a covfun record for a
function with no physical coverage counters, causing `llvm-cov` to fail with
the cryptic error message:
malformed instrumentation profile data: function name is empty
We can eliminate this mismatch by removing `instances_used` entirely, and
instead inferring its contents from the keys of `pgo_func_name_var_map`.
This makes it impossible for a "used" function to lack a PGO name entry.
Unify LLVM ctlz/cttz intrinsic generation
The type signature for `llvm.ctlz` is the same as the one for `llvm.cttz`, which enables a small simplification.
Nobody seems to actually use this, while still adding some extra
complexity to the already rather complex codegen coordinator code.
It is also not supported by any backend other than the LLVM backend.
musl does not implement the symbols required by std for f128 maths.
Disable the associated cfg for all musl targets and adjust the tests
accordingly.
Signed-off-by: Jens Reidel <adrian@travitia.xyz>
Various refactors to the LTO handling code (part 2)
Continuing from https://github.com/rust-lang/rust/pull/143388 this removes a bit of dead code and moves the LTO symbol export calculation from individual backends to cg_ssa.
gpu offload host code generation
r? ghost
This will generate most of the host side code to use llvm's offload feature.
The first PR will only handle automatic mem-transfers to and from the device.
So if a user calls a kernel, we will copy inputs back and forth, but we won't do the actual kernel launch.
Before merging, we will use LLVM's Info infrastructure to verify that the memcopies match what openmp offloa generates in C++. `LIBOMPTARGET_INFO=-1 ./my_rust_binary` should print that a memcpy to and later from the device is happening.
A follow-up PR will generate the actual device-side kernel which will then do computations on the GPU.
A third PR will implement manual host2device and device2host functionality, but the goal is to minimize cases where a user has to overwrite our default handling due to performance issues.
I'm trying to get a full MVP out first, so this just recognizes GPU functions based on magic names. The final frontend will obviously move this over to use proper macros, like I'm already doing it for the autodiff work.
This work will also be compatible with std::autodiff, so one can differentiate GPU kernels.
Tracking:
- https://github.com/rust-lang/rust/issues/131513