As experimentation in 115242 has shown looks better than `coldcc`.
And *don't* use a different convention for cold on Windows, because that actually ends up making things worse.
cc tracking issue 97544
Similar to prior support added for the mips430, avr, and x86 targets
this change implements the rough equivalent of clang's
[`__attribute__((interrupt))`][clang-attr] for riscv targets, enabling
e.g.
```rust
static mut CNT: usize = 0;
pub extern "riscv-interrupt-m" fn isr_m() {
unsafe {
CNT += 1;
}
}
```
to produce highly effective assembly like:
```asm
pub extern "riscv-interrupt-m" fn isr_m() {
420003a0: 1141 addi sp,sp,-16
unsafe {
CNT += 1;
420003a2: c62a sw a0,12(sp)
420003a4: c42e sw a1,8(sp)
420003a6: 3fc80537 lui a0,0x3fc80
420003aa: 63c52583 lw a1,1596(a0) # 3fc8063c <_ZN12esp_riscv_rt3CNT17hcec3e3a214887d53E.0>
420003ae: 0585 addi a1,a1,1
420003b0: 62b52e23 sw a1,1596(a0)
}
}
420003b4: 4532 lw a0,12(sp)
420003b6: 45a2 lw a1,8(sp)
420003b8: 0141 addi sp,sp,16
420003ba: 30200073 mret
```
(disassembly via `riscv64-unknown-elf-objdump -C -S --disassemble ./esp32c3-hal/target/riscv32imc-unknown-none-elf/release/examples/gpio_interrupt`)
This outcome is superior to hand-coded interrupt routines which, lacking
visibility into any non-assembly body of the interrupt handler, have to
be very conservative and save the [entire CPU state to the stack
frame][full-frame-save]. By instead asking LLVM to only save the
registers that it uses, we defer the decision to the tool with the best
context: it can more accurately account for the cost of spills if it
knows that every additional register used is already at the cost of an
implicit spill.
At the LLVM level, this is apparently [implemented by] marking every
register as "[callee-save]," matching the semantics of an interrupt
handler nicely (it has to leave the CPU state just as it found it after
its `{m|s}ret`).
This approach is not suitable for every interrupt handler, as it makes
no attempt to e.g. save the state in a user-accessible stack frame. For
a full discussion of those challenges and tradeoffs, please refer to
[the interrupt calling conventions RFC][rfc].
Inside rustc, this implementation differs from prior art because LLVM
does not expose the "all-saved" function flavor as a calling convention
directly, instead preferring to use an attribute that allows for
differentiating between "machine-mode" and "superivsor-mode" interrupts.
Finally, some effort has been made to guide those who may not yet be
aware of the differences between machine-mode and supervisor-mode
interrupts as to why no `riscv-interrupt` calling convention is exposed
through rustc, and similarly for why `riscv-interrupt-u` makes no
appearance (as it would complicate future LLVM upgrades).
[clang-attr]: https://clang.llvm.org/docs/AttributeReference.html#interrupt-risc-v
[full-frame-save]: 9281af2ecf/src/lib.rs (L440-L469)
[implemented by]: b7fb2a3fec/llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp (L61-L67)
[callee-save]: 973f1fe7a8/llvm/lib/Target/RISCV/RISCVCallingConv.td (L30-L37)
[rfc]: https://github.com/rust-lang/rfcs/pull/3246
We've done measurements with Miri and have determined that `noalias` shouldn't
break code. The requirements that allow us to add dereferenceable and align
have been long documented in the standard library documentation.
LLVM can make use of the `noalias` parameter attribute on the parameter to
`drop_in_place` in areas like argument promotion. Because the Rust compiler
fully controls the code for `drop_in_place`, it can soundly deduce parameter
attributes on it. In the case of a value that has a programmer-defined Drop
implementation, we know that the first thing `drop_in_place` will do is pass a
pointer to the object to `Drop::drop`. `Drop::drop` takes `&mut`, so it must be
guaranteed that there are no pointers to the object upon entering that
function. Therefore, it should be safe to mark `noalias` there.
With this patch, we mark `noalias` only when the type is a value with a
programmer-defined Drop implementation. This is possibly overly conservative,
but I thought that proceeding cautiously was best in this instance.
(This is a large commit. The changes to
`compiler/rustc_middle/src/ty/context.rs` are the most important ones.)
The current naming scheme is a mess, with a mix of `_intern_`, `intern_`
and `mk_` prefixes, with little consistency. In particular, in many
cases it's easy to use an iterator interner when a (preferable) slice
interner is available.
The guiding principles of the new naming system:
- No `_intern_` prefixes.
- The `intern_` prefix is for internal operations.
- The `mk_` prefix is for external operations.
- For cases where there is a slice interner and an iterator interner,
the former is `mk_foo` and the latter is `mk_foo_from_iter`.
Also, `slice_interners!` and `direct_interners!` can now be `pub` or
non-`pub`, which helps enforce the internal/external operations
division.
It's not perfect, but I think it's a clear improvement.
The following lists show everything that was renamed.
slice_interners
- const_list
- mk_const_list -> mk_const_list_from_iter
- intern_const_list -> mk_const_list
- substs
- mk_substs -> mk_substs_from_iter
- intern_substs -> mk_substs
- check_substs -> check_and_mk_substs (this is a weird one)
- canonical_var_infos
- intern_canonical_var_infos -> mk_canonical_var_infos
- poly_existential_predicates
- mk_poly_existential_predicates -> mk_poly_existential_predicates_from_iter
- intern_poly_existential_predicates -> mk_poly_existential_predicates
- _intern_poly_existential_predicates -> intern_poly_existential_predicates
- predicates
- mk_predicates -> mk_predicates_from_iter
- intern_predicates -> mk_predicates
- _intern_predicates -> intern_predicates
- projs
- intern_projs -> mk_projs
- place_elems
- mk_place_elems -> mk_place_elems_from_iter
- intern_place_elems -> mk_place_elems
- bound_variable_kinds
- mk_bound_variable_kinds -> mk_bound_variable_kinds_from_iter
- intern_bound_variable_kinds -> mk_bound_variable_kinds
direct_interners
- region
- intern_region (unchanged)
- const
- mk_const_internal -> intern_const
- const_allocation
- intern_const_alloc -> mk_const_alloc
- layout
- intern_layout -> mk_layout
- adt_def
- intern_adt_def -> mk_adt_def_from_data (unusual case, hard to avoid)
- alloc_adt_def(!) -> mk_adt_def
- external_constraints
- intern_external_constraints -> mk_external_constraints
Other
- type_list
- mk_type_list -> mk_type_list_from_iter
- intern_type_list -> mk_type_list
- tup
- mk_tup -> mk_tup_from_iter
- intern_tup -> mk_tup
`InternIteratorElement` is a trait used to intern values produces by
iterators. There are three impls, corresponding to iterators that
produce different types:
- One for `T`, which operates straightforwardly.
- One for `Result<T, E>`, which is fallible, and will fail early with an
error result if any of the iterator elements are errors.
- One for `&'a T`, which clones the items as it iterates.
That last one is bad: it's extremely easy to use it without realizing
that it clones, which goes against Rust's normal "explicit is better"
approach to cloning.
So this commit just removes it. In practice, there weren't many use
sites. For all but one of them `into_iter()` could be used, which avoids
the need for cloning. And for the one remaining case `copied()` is
used.
Rollup of 7 pull requests
Successful merges:
- #106347 (More accurate spans for arg removal suggestion)
- #108057 (Prevent some attributes from being merged with others on reexports)
- #108090 (`if $c:expr { Some($r:expr) } else { None }` =>> `$c.then(|| $r)`)
- #108092 (note issue for feature(packed_bundled_libs))
- #108099 (use chars instead of strings where applicable)
- #108115 (Do not ICE on unmet trait alias bounds)
- #108125 (Add new people to the compiletest review rotation)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Much like there are specialized variants of `mk_ty`. This will enable
some optimization in the next commit.
Also rename the existing `re_error*` functions as `mk_re_error*`, for
consistency.
The code that consumes PointerKind (`adjust_for_rust_scalar` in rustc_ty_utils)
ended up using PointerKind variants to talk about Rust reference types (& and
&mut) anyway, making the old code structure quite confusing: one always had to
keep in mind which PointerKind corresponds to which type. So this changes
PointerKind to directly reflect the type.
This does not change behavior.
...and remove it from `PointeeInfo`, which isn't meant for this.
There are still various places (marked with FIXMEs) that assume all pointers
have the same size and alignment. Fixing this requires parsing non-default
address spaces in the data layout string, which will be done in a followup.
- Eliminates all the `get_context` calls that async lowering created.
- Replace all `Local` `ResumeTy` types with `&mut Context<'_>`.
The `Local`s that have their types replaced are:
- The `resume` argument itself.
- The argument to `get_context`.
- The yielded value of a `yield`.
The `ResumeTy` hides a `&mut Context<'_>` behind an unsafe raw pointer, and the
`get_context` function is being used to convert that back to a `&mut Context<'_>`.
Ideally the async lowering would not use the `ResumeTy`/`get_context` indirection,
but rather directly use `&mut Context<'_>`, however that would currently
lead to higher-kinded lifetime errors.
See <https://github.com/rust-lang/rust/issues/105501>.
The async lowering step and the type / lifetime inference / checking are
still using the `ResumeTy` indirection for the time being, and that indirection
is removed here. After this transform, the generator body only knows about `&mut Context<'_>`.
Previously, it was only put on scalars with range validity invariants
like bool, was uninit was obviously invalid for those.
Since then, we have normatively declared all uninit primitives to be
undefined behavior and can therefore put `noundef` on them.
The remaining concern was the `mem::uninitialized` function, which cause
quite a lot of UB in the older parts of the ecosystem. This function now
doesn't return uninit values anymore, making users of it safe from this
change.
The only real sources of UB where people could encounter uninit
primitives are `MaybeUninit::uninit().assume_init()`, which has always
be clear in the docs about being UB and from heap allocations (like
reading from the spare capacity of a vec. This is hopefully rare enough
to not break anything.
This change was missed when making async generators implement `Future` directly.
It did not cause any problems in codegen so far, as `GeneratorState<(), Output>`
happens to have the same ABI as `Poll<Output>`.
indirect immutable freeze by-value function parameters.
Right now, `rustc` only examines function signatures and the platform ABI when
determining the LLVM attributes to apply to parameters. This results in missed
optimizations, because there are some attributes that can be determined via
analysis of the MIR making up the function body. In particular, `readonly`
could be applied to most indirectly-passed by-value function arguments
(specifically, those that are freeze and are observed not to be mutated), but
it currently is not.
This patch introduces the machinery that allows `rustc` to determine those
attributes. It consists of a query, `deduced_param_attrs`, that, when
evaluated, analyzes the MIR of the function to determine supplementary
attributes. The results of this query for each function are written into the
crate metadata so that the deduced parameter attributes can be applied to
cross-crate functions. In this patch, we simply check the parameter for
mutations to determine whether the `readonly` attribute should be applied to
parameters that are indirect immutable freeze by-value. More attributes could
conceivably be deduced in the future: `nocapture` and `noalias` come to mind.
Adding `readonly` to indirect function parameters where applicable enables some
potential optimizations in LLVM that are discussed in [issue 103103] and [PR
103070] around avoiding stack-to-stack memory copies that appear in functions
like `core::fmt::Write::write_fmt` and `core::panicking::assert_failed`. These
functions pass a large structure unchanged by value to a subfunction that also
doesn't mutate it. Since the structure in this case is passed as an indirect
parameter, it's a pointer from LLVM's perspective. As a result, the
intermediate copy of the structure that our codegen emits could be optimized
away by LLVM's MemCpyOptimizer if it knew that the pointer is `readonly
nocapture noalias` in both the caller and callee. We already pass `nocapture
noalias`, but we're missing `readonly`, as we can't determine whether a
by-value parameter is mutated by examining the signature in Rust. I didn't have
much success with having LLVM infer the `readonly` attribute, even with fat
LTO; it seems that deducing it at the MIR level is necessary.
No large benefits should be expected from this optimization *now*; LLVM needs
some changes (discussed in [PR 103070]) to more aggressively use the `noalias
nocapture readonly` combination in its alias analysis. I have some LLVM patches
for these optimizations and have had them looked over. With all the patches
applied locally, I enabled LLVM to remove all the `memcpy`s from the following
code:
```rust
fn main() {
println!("Hello {}", 3);
}
```
which is a significant codegen improvement over the status quo. I expect that
if this optimization kicks in in multiple places even for such a simple
program, then it will apply to Rust code all over the place.
[issue 103103]: https://github.com/rust-lang/rust/issues/103103
[PR 103070]: https://github.com/rust-lang/rust/pull/103070