Implement `needs_async_drop` in rustc and optimize async drop glue
This PR expands on #121801 and implements `Ty::needs_async_drop` which works almost exactly the same as `Ty::needs_drop`, which is needed for #123948.
Also made compiler's async drop code to look more like compiler's regular drop code, which enabled me to write an optimization where types which do not use `AsyncDrop` can simply forward async drop glue to `drop_in_place`. This made size of the async block from the [async_drop test](67980dd6fb/tests/ui/async-await/async-drop.rs) to decrease by 12%.
It's a macro that just creates an enum with a `from_u32` method. It has
two arms. One is unused and the other has a single use.
This commit inlines that single use and removes the whole macro. This
increases readability because we don't have two different macros
interacting (`enum_from_u32` and `language_item_table`).
weak lang items are not allowed to be #[track_caller]
For instance the panic handler will be called via this import
```rust
extern "Rust" {
#[lang = "panic_impl"]
fn panic_impl(pi: &PanicInfo<'_>) -> !;
}
```
A `#[track_caller]` would add an extra argument and thus make this the wrong signature.
The 2nd commit is a consistency rename; based on the docs [here](https://doc.rust-lang.org/unstable-book/language-features/lang-items.html) and [here](https://rustc-dev-guide.rust-lang.org/lang-items.html) I figured "lang item" is more widely used. (In the compiler output, "lang item" and "language item" seem to be pretty even.)
Add `Ord::cmp` for primitives as a `BinOp` in MIR
Update: most of this OP was written months ago. See https://github.com/rust-lang/rust/pull/118310#issuecomment-2016940014 below for where we got to recently that made it ready for review.
---
There are dozens of reasonable ways to implement `Ord::cmp` for integers using comparison, bit-ops, and branches. Those differences are irrelevant at the rust level, however, so we can make things better by adding `BinOp::Cmp` at the MIR level:
1. Exactly how to implement it is left up to the backends, so LLVM can use whatever pattern its optimizer best recognizes and cranelift can use whichever pattern codegens the fastest.
2. By not inlining those details for every use of `cmp`, we drastically reduce the amount of MIR generated for `derive`d `PartialOrd`, while also making it more amenable to MIR-level optimizations.
Having extremely careful `if` ordering to μoptimize resource usage on broadwell (#63767) is great, but it really feels to me like libcore is the wrong place to put that logic. Similarly, using subtraction [tricks](https://graphics.stanford.edu/~seander/bithacks.html#CopyIntegerSign) (#105840) is arguably even nicer, but depends on the optimizer understanding it (https://github.com/llvm/llvm-project/issues/73417) to be practical. Or maybe [bitor is better than add](https://discourse.llvm.org/t/representing-in-ir/67369/2?u=scottmcm)? But maybe only on a future version that [has `or disjoint` support](https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036?u=scottmcm)? And just because one of those forms happens to be good for LLVM, there's no guarantee that it'd be the same form that GCC or Cranelift would rather see -- especially given their very different optimizers. Not to mention that if LLVM gets a spaceship intrinsic -- [which it should](https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Suboptimal.20inlining.20in.20std.20function.20.60binary_search.60/near/404250586) -- we'll need at least a rustc intrinsic to be able to call it.
As for simplifying it in Rust, we now regularly inline `{integer}::partial_cmp`, but it's quite a large amount of IR. The best way to see that is with 8811efa88b (diff-d134c32d028fbe2bf835fef2df9aca9d13332dd82284ff21ee7ebf717bfa4765R113) -- I added a new pre-codegen MIR test for a simple 3-tuple struct, and this PR change it from 36 locals and 26 basic blocks down to 24 locals and 8 basic blocks. Even better, as soon as the construct-`Some`-then-match-it-in-same-BB noise is cleaned up, this'll expose the `Cmp == 0` branches clearly in MIR, so that an InstCombine (#105808) can simplify that to just a `BinOp::Eq` and thus fix some of our generated code perf issues. (Tracking that through today's `if a < b { Less } else if a == b { Equal } else { Greater }` would be *much* harder.)
---
r? `@ghost`
But first I should check that perf is ok with this
~~...and my true nemesis, tidy.~~
Codegen const panic messages as function calls
This skips emitting extra arguments at every callsite (of which there
can be many). For a librustc_driver build with overflow checks enabled,
this cuts 0.7MB from the resulting shared library (see [perf]).
A sample improvement from nightly:
```
leaq str.0(%rip), %rdi
leaq .Lalloc_d6aeb8e2aa19de39a7f0e861c998af13(%rip), %rdx
movl $25, %esi
callq *_ZN4core9panicking5panic17h17cabb89c5bcc999E@GOTPCREL(%rip)
```
to this PR:
```
leaq .Lalloc_d6aeb8e2aa19de39a7f0e861c998af13(%rip), %rdi
callq *_RNvNtNtCsduqIKoij8JB_4core9panicking11panic_const23panic_const_div_by_zero@GOTPCREL(%rip)
```
[perf]: https://perf.rust-lang.org/compare.html?start=a7e4de13c1785819f4d61da41f6704ed69d5f203&end=64fbb4f0b2d621ff46d559d1e9f5ad89a8d7789b&stat=instructions:u
This skips emitting extra arguments at every callsite (of which there
can be many). For a librustc_driver build with overflow checks enabled,
this cuts 0.7MB from the resulting binary.
remove StructuralEq trait
The documentation given for the trait is outdated: *all* function pointers implement `PartialEq` and `Eq` these days. So the `StructuralEq` trait doesn't really seem to have any reason to exist any more.
One side-effect of this PR is that we allow matching on some consts that do not implement `Eq`. However, we already allowed matching on floats and consts containing floats, so this is not new, it is just allowed in more cases now. IMO it makes no sense at all to allow float matching but also sometimes require an `Eq` instance. If we want to require `Eq` we should adjust https://github.com/rust-lang/rust/pull/115893 to check for `Eq`, and rule out float matching for good.
Fixes https://github.com/rust-lang/rust/issues/115881
Add `AsyncFn` family of traits
I'm proposing to add a new family of `async`hronous `Fn`-like traits to the standard library for experimentation purposes.
## Why do we need new traits?
On the user side, it is useful to be able to express `AsyncFn` trait bounds natively via the parenthesized sugar syntax, i.e. `x: impl AsyncFn(&str) -> String` when experimenting with async-closure code.
This also does not preclude `AsyncFn` becoming something else like a trait alias if a more fundamental desugaring (which can take many[^1] different[^2] forms) comes around. I think we should be able to play around with `AsyncFn` well before that, though.
I'm also not proposing stabilization of these trait names any time soon (we may even want to instead express them via new syntax, like `async Fn() -> ..`), but I also don't think we need to introduce an obtuse bikeshedding name, since `AsyncFn` just makes sense.
## The lending problem: why not add a more fundamental primitive of `LendingFn`/`LendingFnMut`?
Firstly, for `async` closures to be as flexible as possible, they must be allowed to return futures which borrow from the async closure's captures. This can be done by introducing `LendingFn`/`LendingFnMut` traits, or (equivalently) by adding a new generic associated type to `FnMut` which allows the return type to capture lifetimes from the `&mut self` argument of the trait. This was proposed in one of [Niko's blog posts](https://smallcultfollowing.com/babysteps/blog/2023/05/09/giving-lending-and-async-closures/).
Upon further experimentation, for the purposes of closure type- and borrow-checking, I've come to the conclusion that it's significantly harder to teach the compiler how to handle *general* lending closures which may borrow from their captures. This is, because unlike `Fn`/`FnMut`, the `LendingFn`/`LendingFnMut` traits don't form a simple "inheritance" hierarchy whose top trait is `FnOnce`.
```mermaid
flowchart LR
Fn
FnMut
FnOnce
LendingFn
LendingFnMut
Fn -- isa --> FnMut
FnMut -- isa --> FnOnce
LendingFn -- isa --> LendingFnMut
Fn -- isa --> LendingFn
FnMut -- isa --> LendingFnMut
```
For example:
```
fn main() {
let s = String::from("hello, world");
let f = move || &s;
let x = f(); // This borrows `f` for some lifetime `'1` and returns `&'1 String`.
```
That trait hierarchy means that in general for "lending" closures, like `f` above, there's not really a meaningful return type for `<typeof(f) as FnOnce>::Output` -- it can't return `&'static str`, for example.
### Special-casing this problem:
By splitting out these traits manually, and making sure that each trait has its own associated future type, we side-step the issue of having to answer the questions of a general `LendingFn`/`LendingFnMut` implementation, since the compiler knows how to generate built-in implementations for first-class constructs like async closures, including the required future types for the (by-move) `AsyncFnOnce` and (by-ref) `AsyncFnMut`/`AsyncFn` trait implementations.
[^1]: For example, with trait transformers, we may eventually be able to write: `trait AsyncFn = async Fn;`
[^2]: For example, via the introduction of a more fundamental "`LendingFn`" trait, plus a [special desugaring with augmented trait aliases](https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Lending.20closures.20and.20Fn*.28.29.20-.3E.20impl.20Trait/near/408471480).
Remove `box_free` lang item
This PR removes the `box_free` lang item, replacing it with `Box`'s `Drop` impl. Box dropping is still slightly magic because the contained value is still dropped by the compiler.
Add cross-language LLVM CFI support to the Rust compiler
This PR adds cross-language LLVM Control Flow Integrity (CFI) support to the Rust compiler by adding the `-Zsanitizer-cfi-normalize-integers` option to be used with Clang `-fsanitize-cfi-icall-normalize-integers` for normalizing integer types (see https://reviews.llvm.org/D139395).
It provides forward-edge control flow protection for C or C++ and Rust -compiled code "mixed binaries" (i.e., for when C or C++ and Rust -compiled code share the same virtual address space). For more information about LLVM CFI and cross-language LLVM CFI support for the Rust compiler, see design document in the tracking issue #89653.
Cross-language LLVM CFI can be enabled with -Zsanitizer=cfi and -Zsanitizer-cfi-normalize-integers, and requires proper (i.e., non-rustc) LTO (i.e., -Clinker-plugin-lto).
Thank you again, ``@bjorn3,`` ``@nikic,`` ``@samitolvanen,`` and the Rust community for all the help!
This commit adds cross-language LLVM Control Flow Integrity (CFI)
support to the Rust compiler by adding the
`-Zsanitizer-cfi-normalize-integers` option to be used with Clang
`-fsanitize-cfi-icall-normalize-integers` for normalizing integer types
(see https://reviews.llvm.org/D139395).
It provides forward-edge control flow protection for C or C++ and Rust
-compiled code "mixed binaries" (i.e., for when C or C++ and Rust
-compiled code share the same virtual address space). For more
information about LLVM CFI and cross-language LLVM CFI support for the
Rust compiler, see design document in the tracking issue #89653.
Cross-language LLVM CFI can be enabled with -Zsanitizer=cfi and
-Zsanitizer-cfi-normalize-integers, and requires proper (i.e.,
non-rustc) LTO (i.e., -Clinker-plugin-lto).