Rollup of 7 pull requests
Successful merges:
- rust-lang/rust#144072 (update `Atomic*::from_ptr` and `Atomic*::as_ptr` docs)
- rust-lang/rust#144151 (`tests/ui/issues/`: The Issues Strike Back [1/N])
- rust-lang/rust#144300 (Clippy fixes for miropt-test-tools)
- rust-lang/rust#144399 (Add a ratchet for moving all standard library tests to separate packages)
- rust-lang/rust#144472 (str: Mark unstable `round_char_boundary` feature functions as const)
- rust-lang/rust#144503 (Various refactors to the codegen coordinator code (part 3))
- rust-lang/rust#144530 (coverage: Infer `instances_used` from `pgo_func_name_var_map`)
r? `@ghost`
`@rustbot` modify labels: rollup
Various refactors to the codegen coordinator code (part 3)
Continuing from https://github.com/rust-lang/rust/pull/144062 this removes an option without any known users, uses the object crate in favor of LLVM for getting the LTO bitcode and improves the coordinator channel handling.
Nobody seems to actually use this, while still adding some extra
complexity to the already rather complex codegen coordinator code.
It is also not supported by any backend other than the LLVM backend.
Use serde for target spec json deserialize
The previous manual parsing of `serde_json::Value` was a lot of complicated code and extremely error-prone. It was full of janky behavior like sometimes ignoring type errors, sometimes erroring for type errors, sometimes warning for type errors, and sometimes just ICEing for type errors (the icing on the top).
Additionally, many of the error messages about allowed values were out of date because they were in a completely different place than the FromStr impls. Overall, the system caused confusion for users.
I also found the old deserialization code annoying to read. Whenever a `key!` invocation was found, one had to first look for the right macro arm, and no go to definition could help.
This PR replaces all this manual parsing with a 2-step process involving serde.
First, the string is parsed into a `TargetSpecJson` struct. This struct is a 1:1 representation of the spec JSON. It already parses all the enums and is very simple to read and write.
Then, the fields from this struct are copied into the actual `Target`. The reason for this two-step process instead of just serializing into a `Target` is because of a few reasons
1. There are a few transformations performed between the two formats
2. The default logic is implemented this way. Otherwise all the default field values would have to be spelled out again, which is suboptimal. With this logic, they fall out naturally, because everything in the json struct is an `Option`.
Overall, the mapping is pretty simple, with the vast majority of fields just doing a 1:1 mapping that is captured by two macros. I have deliberately avoided making the macros generic to keep them simple.
All the `FromStr` impls now have the error message right inside them, which increases the chance of it being up to date. Some "`from_str`" impls were turned into proper `FromStr` impls to support this.
The new code is much less involved, delegating all the JSON parsing logic to serde, without any manual type matching.
This change introduces a few breaking changes for consumers. While it is possible to use this format on stable, it is very much subject to change, so breaking changes are expected. The hope is also that because of the way stricter behavior, breaking changes are easier to deal with, as they come with clearer error messages.
1. Invalid types now always error, everywhere. Previously, they would sometimes error, and sometimes just be ignored (which meant the users JSON was still broken, just silently!)
2. This now makes use of `deny_unknown_fields` instead of just warning on unused fields, which was done previously. Serde doesn't make it easy to get such warning behavior, which was the primary reason that this now changed. But I think error behavior is very reasonable too. If someone has random stale fields in their JSON, it is likely because these fields did something at some point but no longer do, and the user likely wants to be informed of this so they can figure out what to do.
This is also relevant for the future. If we remove a field but someone has it set, it probably makes sense for them to take a look whether they need this and should look for alternatives, or whether they can just delete it. Overall, the JSON is made more explicit.
This is the only expected breakage, but there could also be small breakage from small mistakes. All targets roundtrip though, so it can't be anything too major.
fixesrust-lang/rust#144153
The previous manual parsing of `serde_json::Value` was a lot of
complicated code and extremely error-prone. It was full of janky
behavior like sometimes ignoring type errors, sometimes erroring for
type errors, sometimes warning for type errors, and sometimes just
ICEing for type errors (the icing on the top).
Additionally, many of the error messages about allowed values were out
of date because they were in a completely different place than the
FromStr impls. Overall, the system caused confusion for users.
I also found the old deserialization code annoying to read. Whenever a
`key!` invocation was found, one had to first look for the right macro
arm, and no go to definition could help.
This PR replaces all this manual parsing with a 2-step process involving
serde.
First, the string is parsed into a `TargetSpecJson` struct. This struct
is a 1:1 representation of the spec JSON. It already parses all the
enums and is very simple to read and write.
Then, the fields from this struct are copied into the actual `Target`.
The reason for this two-step process instead of just serializing into a
`Target` is because of a few reasons
1. There are a few transformations performed between the two formats
2. The default logic is implemented this way. Otherwise all the default
field values would have to be spelled out again, which is
suboptimal. With this logic, they fall out naturally, because
everything in the json struct is an `Option`.
Overall, the mapping is pretty simple, with the vast majority of fields
just doing a 1:1 mapping that is captured by two macros. I have
deliberately avoided making the macros generic to keep them simple.
All the `FromStr` impls now have the error message right inside them,
which increases the chance of it being up to date. Some "`from_str`"
impls were turned into proper `FromStr` impls to support this.
The new code is much less involved, delegating all the JSON parsing
logic to serde, without any manual type matching.
This change introduces a few breaking changes for consumers. While it is
possible to use this format on stable, it is very much subject to
change, so breaking changes are expected. The hope is also that because
of the way stricter behavior, breaking changes are easier to deal with,
as they come with clearer error messages.
1. Invalid types now always error, everywhere. Previously, they would
sometimes error, and sometimes just be ignored (which meant the users
JSON was still broken, just silently!)
2. This now makes use of `deny_unknown_fields` instead of just warning
on unused fields, which was done previously. Serde doesn't make it
easy to get such warning behavior, which was the primary reason that
this now changed. But I think error behavior is very reasonable too.
If someone has random stale fields in their JSON, it is likely
because these fields did something at some point but no longer do,
and the user likely wants to be informed of this so they can figure
out what to do.
This is also relevant for the future. If we remove a field but
someone has it set, it probably makes sense for them to take a look
whether they need this and should look for alternatives, or whether
they can just delete it. Overall, the JSON is made more explicit.
This is the only expected breakage, but there could also be small
breakage from small mistakes. All targets roundtrip though, so it can't
be anything too major.
Rollup of 11 pull requests
Successful merges:
- rust-lang/rust#142300 (Disable `tests/run-make/mte-ffi` because no CI runners have MTE extensions enabled)
- rust-lang/rust#143271 (Store the type of each GVN value)
- rust-lang/rust#143293 (fix `-Zsanitizer=kcfi` on `#[naked]` functions)
- rust-lang/rust#143719 (Emit warning when there is no space between `-o` and arg)
- rust-lang/rust#143846 (pass --gc-sections if -Zexport-executable-symbols is enabled and improve tests)
- rust-lang/rust#143891 (Port `#[coverage]` to the new attribute system)
- rust-lang/rust#143967 (constify `Option` methods)
- rust-lang/rust#144008 (Fix false positive double negations with macro invocation)
- rust-lang/rust#144010 (Boostrap: add warning on `optimize = false`)
- rust-lang/rust#144049 (rustc-dev-guide subtree update)
- rust-lang/rust#144056 (Copy GCC sources into the build directory even outside CI)
r? `@ghost`
`@rustbot` modify labels: rollup
Emit warning when there is no space between `-o` and arg
Closesrust-lang/rust#142812
`getopt` doesn't seem to have an API to check this, so we have to check the args manually.
r? compiler
`-Zhigher-ranked-assumptions`: Consider WF of coroutine witness when proving outlives assumptions
### TL;DR
This PR introduces an unstable flag `-Zhigher-ranked-assumptions` which tests out a new algorithm for dealing with some of the higher-ranked outlives problems that come from auto trait bounds on coroutines. See:
* rust-lang/rust#110338
While it doesn't fix all of the issues, it certainly fixed many of them, so I'd like to get this landed so people can test the flag on their own code.
### Background
Consider, for example:
```rust
use std::future::Future;
trait Client {
type Connecting<'a>: Future + Send
where
Self: 'a;
fn connect(&self) -> Self::Connecting<'_>;
}
fn call_connect<C>(c: C) -> impl Future + Send
where
C: Client + Send + Sync,
{
async move { c.connect().await }
}
```
Due to the fact that we erase the lifetimes in a coroutine, we can think of the interior type of the async block as something like: `exists<'r, 's> { C, &'r C, C::Connecting<'s> }`. The first field is the `c` we capture, the second is the auto-ref that we perform on the call to `.connect()`, and the third is the resulting future we're awaiting at the first and only await point. Note that every region is uniquified differently in the interior types.
For the async block to be `Send`, we must prove that both of the interior types are `Send`. First, we have an `exists<'r, 's>` binder, which needs to be instantiated universally since we treat the regions in this binder as *unknown*[^exist]. This gives us two types: `{ &'!r C, C::Connecting<'!s> }`. Proving `&'!r C: Send` is easy due to a [`Send`](https://doc.rust-lang.org/nightly/std/marker/trait.Send.html#impl-Send-for-%26T) impl for references.
Proving `C::Connecting<'!s>: Send` can only be done via the item bound, which then requires `C: '!s` to hold (due to the `where Self: 'a` on the associated type definition). Unfortunately, we don't know that `C: '!s` since we stripped away any relationship between the interior type and the param `C`. This leads to a bogus borrow checker error today!
### Approach
Coroutine interiors are well-formed by virtue of them being borrow-checked, as long as their callers are invoking their parent functions in a well-formed way, then substitutions should also be well-formed. Therefore, in our example above, we should be able to deduce the assumption that `C: '!s` holds from the well-formedness of the interior type `C::Connecting<'!s>`.
This PR introduces the notion of *coroutine assumptions*, which are the outlives assumptions that we can assume hold due to the well-formedness of a coroutine's interior types. These are computed alongside the coroutine types in the `CoroutineWitnessTypes` struct. When we instantiate the binder when proving an auto trait for a coroutine, we instantiate the `CoroutineWitnessTypes` and stash these newly instantiated assumptions in the region storage in the `InferCtxt`. Later on in lexical region resolution or MIR borrowck, we use these registered assumptions to discharge any placeholder outlives obligations that we would otherwise not be able to prove.
### How well does it work?
I've added a ton of tests of different reported situations that users have shared on issues like rust-lang/rust#110338, and an (anecdotally) large number of those examples end up working straight out of the box! Some limitations are described below.
### How badly does it not work?
The behavior today is quite rudimentary, since we currently discharge the placeholder assumptions pretty early in region resolution. This manifests itself as some limitations on the code that we accept.
For example, `tests/ui/async-await/higher-ranked-auto-trait-11.rs` continues to fail. In that test, we must prove that a placeholder is equal to a universal for a param-env candidate to hold when proving an auto trait, e.g. `'!1 = 'a` is required to prove `T: Trait<'!1>` in a param-env that has `T: Trait<'a>`. Unfortunately, at that point in the MIR body, we only know that the placeholder is equal to some body-local existential NLL var `'?2`, which only gets equated to the universal `'a` when being stored into the return local later on in MIR borrowck.
This could be fixed by integrating these assumptions into the type outlives machinery in a more first-class way, and delaying things to the end of MIR typeck when we know the full relationship between existential and universal NLL vars. Doing this integration today is quite difficult today.
`tests/ui/async-await/higher-ranked-auto-trait-11.rs` fails because we don't compute the full transitive outlives relations between placeholders. In that test, we have in our region assumptions that some `'!1 = '!2` and `'!2 = '!3`, but we must prove `'!1 = '!3`.
This can be fixed by computing the set of coroutine outlives assumptions in a more transitive way, or as I mentioned above, integrating these assumptions into the type outlives machinery in a more first-class way, since it's already responsible for the transitive outlives assumptions of universals.
### Moving forward
I'm still quite happy with this implementation, and I'd like to land it for testing. I may work on overhauling both the way we compute these coroutine assumptions and also how we deal with the assumptions during (lexical/nll) region checking. But for now, I'd like to give users a chance to try out this new `-Zhigher-ranked-assumptions` flag to uncover more shortcomings.
[^exist]: Instantiating this binder with infer regions would be incomplete, since we'd be asking for *some* instantiation of the interior types, not proving something for *all* instantiations of the interior types.
This stabilizes a subset of the `-Clink-self-contained` components on x64 linux:
the rust-lld opt-out.
The opt-in is not stabilized, as interactions with other stable flags require
more internal work, but are not needed for stabilizing using rust-lld by default.
Similarly, since we only switch to rust-lld on x64 linux, the opt-out is
only stabilized there. Other targets still require `-Zunstable-options`
to use it.
This stabilizes a subset of the `-Clinker-features` components on x64 linux:
the lld opt-out.
The opt-in is not stabilized, as interactions with other stable flags require
more internal work, but are not needed for stabilizing using rust-lld by default.
Similarly, since we only switch to rust-lld on x64 linux, the opt-out is
only stabilized there. Other targets still require `-Zunstable-options`
to use it.
Allow custom default address spaces and parse `p-` specifications in the datalayout string
Some targets, such as CHERI, use as default an address space different from the "normal" default address space `0` (in the case of CHERI, [200 is used](https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-877.pdf)). Currently, `rustc` does not allow to specify custom address spaces and does not take into consideration [`p-` specifications in the datalayout string](https://llvm.org/docs/LangRef.html#langref-datalayout).
This patch tries to mitigate these problems by allowing targets to define a custom default address space (while keeping the default value to address space `0`) and adding the code to parse the `p-` specifications in `rustc_abi`. The main changes are that `TargetDataLayout` now uses functions to refer to pointer-related informations, instead of having specific fields for the size and alignment of pointers in the default address space; furthermore, the two `pointer_size` and `pointer_align` fields in `TargetDataLayout` are replaced with an `FxHashMap` that holds info for all the possible address spaces, as parsed by the `p-` specifications.
The potential performance drawbacks of not having ad-hoc fields for the default address space will be tested in this PR's CI run.
r? workingjubilee
There is one comment at a call site and one comment in the function
definition that are mostly saying the same thing. Fold the call site
comment into the function definition comment to reduce duplication.
There are actually some inaccuracies in the comments but let's
deduplicate before we address the inaccuracies.
Make __rust_alloc_error_handler_should_panic a function
Fixesrust-lang/rust#143253
`__rust_alloc_error_handler_should_panic` is a static but was being exported as a function.
For most targets this doesn't matter, but Arm64EC Windows uses different decorations for exported variables vs functions, hence it fails to link when `-Z oom=abort` is enabled.
We've had issues in the past with statics like this (see rust-lang/rust#141061) but the tldr; is that Arm64EC needs symbols correctly exported as either a function or data, and data MUST and MUST ONLY be marked `dllimport` when the symbol is being imported from another binary, which is non-trivial to calculate for these compiler-generated statics.
So, instead, the easiest thing to do is to make `__rust_alloc_error_handler_should_panic` a function instead.
Since `__rust_alloc_error_handler_should_panic` isn't involved in any linking shenanigans, I've marked it as `AlwaysInline` with the hopes that the various backends will see that it is just returning a constant and perform the same optimizations as the previous implementation.
r? `@bjorn3`
Add PrintTAFn flag for targeted type analysis printing
## Summary
This PR adds a new `PrintTAFn` flag to the `-Z autodiff` option that allows printing type analysis information for a specific function, rather than all functions.
## Changes
### New Flag
- Added `PrintTAFn=<function_name>` option to `-Z autodiff`
- Usage: `-Z autodiff=Enable,PrintTAFn=my_function_name`
### Implementation Details
- **Rust side**: Added `PrintTAFn(String)` variant to `AutoDiff` enum
- **Parser**: Updated `parse_autodiff` to handle `PrintTAFn=<function_name>` syntax with proper error handling
- **FFI**: Added `set_print_type_fun` function to interface with Enzyme's `FunctionToAnalyze` command line option
- **Documentation**: Updated help text and documentation for the new flag
### Files Modified
- `compiler/rustc_session/src/config.rs`: Added `PrintTAFn(String)` variant
- `compiler/rustc_session/src/options.rs`: Updated parser and help text (now shows `PrintTAFn` in the list)
- `compiler/rustc_codegen_llvm/src/llvm/enzyme_ffi.rs`: Added FFI function and static variable
- `compiler/rustc_codegen_llvm/src/back/lto.rs`: Added handling for new flag
- `src/doc/rustc-dev-guide/src/autodiff/flags.md`: Updated documentation
- `src/doc/unstable-book/src/compiler-flags/autodiff.md`: Updated documentation
## Testing
The flag can be tested with:
```bash
rustc +enzyme -Z autodiff=Enable,PrintTAFn=square test.rs
```
This will print type analysis information only for the function named "square" instead of all functions.
## Error Handling
The parser includes proper error handling:
- Missing argument: `PrintTAFn` without `=<function_name>` will show an error
- Unknown options: Invalid autodiff options will be reported
r? ```@ZuseZ4```
Refactor Translator
My main motivation was to simplify the usage of `SilentEmitter` for users like rustfmt. A few refactoring opportunities arose along the way.
* Replace `Translate` trait with `Translator` struct
* Replace `Emitter: Translate` with `Emitter::translator`
* Split `SilentEmitter` into `FatalOnlyEmitter` and `SilentEmitter`
Rollup of 6 pull requests
Successful merges:
- rust-lang/rust#135656 (Add `-Z hint-mostly-unused` to tell rustc that most of a crate will go unused)
- rust-lang/rust#138237 (Get rid of `EscapeDebugInner`.)
- rust-lang/rust#141614 (lint direct use of rustc_type_ir )
- rust-lang/rust#142123 (Implement initial support for timing sections (`--json=timings`))
- rust-lang/rust#142377 (Try unremapping compiler sources)
- rust-lang/rust#142674 (remove duplicate crash test)
r? `@ghost`
`@rustbot` modify labels: rollup
Implement initial support for timing sections (`--json=timings`)
This PR implements initial support for emitting high-level compilation section timings. The idea is to provide a very lightweight way of emitting durations of various compilation sections (frontend, backend, linker, or on a more granular level macro expansion, typeck, borrowck, etc.). The ultimate goal is to stabilize this output (in some form), make Cargo pass `--json=timings` and then display this information in the HTML output of `cargo build --timings`, to make it easier to quickly profile "what takes so long" during the compilation of a Cargo project. I would personally also like if Cargo printed some of this information in the interactive `cargo build` output, but the `build --timings` use-case is the main one.
Now, this information is already available with several other sources, but I don't think that we can just use them as they are, which is why I proposed a new way of outputting this data (`--json=timings`):
- This data is available under `-Zself-profile`, but that is very expensive and forever unstable. It's just a too big of a hammer to tell us the duration it took to run the linker.
- It could also be extracted with `-Ztime-passes`. That is pretty much "for free" in terms of performance, and it can be emitted in a structured form to JSON via `-Ztime-passes-format=json`. I guess that one alternative might be to stabilize this flag in some form, but that form might just be `--json=timings`? I guess what we could do in theory is take the already emitted time passes and reuse them for `--json=timings`. Happy to hear suggestions!
I'm sending this PR mostly for a vibeck, to see if the way I implemented it is passable. There are some things to figure out:
- How do we represent the sections? Originally I wanted to output `{ section, duration }`, but then I realized that it might be more useful to actually emit `start` and `end` events. Both because it enables to see the output incrementally (in case compilation takes a long time and you read the outputs directly, or Cargo decides to show this data in `cargo build` some day in the future), and because it makes it simpler to represent hierarchy (see below). The timestamps currently emit microseconds elapsed from a predetermined point in time (~start of rustc), but otherwise they are fully opaque, and should be only ever used to calculate the duration using `end - start`. We could also precompute the duration for the user in the `end` event, but that would require doing more work in rustc, which I would ideally like to avoid :P
- Do we want to have some form of hierarchy? I think that it would be nice to show some more granular sections rather than just frontend/backend/linker (e.g. macro expansion, typeck and borrowck as a part of the frontend). But for that we would need some way of representing hierarchy. A simple way would be something like `{ parent: "frontend" }`, but I realized that with start/end timestamps we get the hierarchy "for free", only the client will need to reconstruct it from the order of start/end events (e.g. `start A`, `start B` means that `B` is a child of `A`).
- What exactly do we want to stabilize? This is probably a question for later. I think that we should definitely stabilize the format of the emitted JSON objects, and *maybe* some specific section names (but we should also make it clear that they can be missing, e.g. you don't link everytime you invoke `rustc`).
The PR be tested e.g. with `rustc +stage1 src/main.rs --json=timings --error-format=json -Zunstable-options` on a crate without dependencies (it is not easy to use `--json` with stock Cargo, because it also passes this flag to `rustc`, so this will later need Cargo integration to be usable with it).
Zulip discussions: [#t-compiler > Outputting time spent in various compiler sections](https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/Outputting.20time.20spent.20in.20various.20compiler.20sections/with/518850162)
MCP: https://github.com/rust-lang/compiler-team/issues/873
r? ``@nnethercote``
Add `-Z hint-mostly-unused` to tell rustc that most of a crate will go unused
This hint allows the compiler to optimize its operation based on this assumption, in order to compile faster. This is a hint, and does not guarantee any particular behavior.
This option can substantially speed up compilation if applied to a large dependency where the majority of the dependency does not get used. This flag may slow down compilation in other cases.
Currently, this option makes the compiler defer as much code generation as possible from functions in the crate, until later crates invoke those functions. Functions that never get invoked will never have code generated for them. For instance, if a crate provides thousands of functions, but only a few of them will get called, this flag will result in the compiler only doing code generation for the called functions. (This uses the same mechanisms as cross-crate inlining of functions.) This does not affect `extern` functions, or functions marked as `#[inline(never)]`.
This option has already existed in nightly as `-Zcross-crate-inline-threshold=always` for some time, and has gotten testing in that form. However, this option is still unstable, to give an opportunity for wider testing in this form.
Some performance numbers, based on a crate with many dependencies having just *one* large dependency set to `-Z hint-mostly-unused` (using Cargo's `profile-rustflags` option):
A release build went from 4m07s to 2m04s.
A non-release build went from 2m26s to 1m28s.
Move metadata object generation for dylibs to the linker code
This deduplicates some code between codegen backends and may in the future allow adding extra metadata that is only known at link time.
Prerequisite of https://github.com/rust-lang/rust/issues/96708.