FCW for repr(C) enums whose discriminant values do not fit into a c_int or c_uint
Context: https://github.com/rust-lang/rust/issues/124403
The current behavior of repr(C) enums is as follows:
- The discriminant values are interpreted as const expressions of type `isize`
- We compute the smallest size that can hold all discriminant values
- The target spec contains the smallest size for repr(C) enums
- We take the larger of these two sizes
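As a rough sketch of the four steps above (not the actual rustc implementation): the function names and the `target_min_bytes` parameter are made up for illustration, and discriminants are modeled as `i64`, i.e. `isize` on a 64bit target.

```rust
/// Smallest byte width (1, 2, 4 or 8) whose signed or unsigned integer type
/// can represent every value in `discrs`. Hypothetical helper for illustration.
fn smallest_fitting_size(discrs: &[i64]) -> u64 {
    let min = discrs.iter().copied().min().unwrap_or(0);
    let max = discrs.iter().copied().max().unwrap_or(0);
    for bytes in [1u32, 2, 4] {
        let bits = bytes * 8;
        // Range check for the signed type of this width.
        let signed_ok = min >= -(1i64 << (bits - 1)) && max <= (1i64 << (bits - 1)) - 1;
        // Range check for the unsigned type of this width.
        let unsigned_ok = min >= 0 && max <= (1i64 << bits) - 1;
        if signed_ok || unsigned_ok {
            return bytes as u64;
        }
    }
    // Nothing narrower fits, so the full 64-bit width is needed.
    8
}

/// Models the current rule: the larger of "smallest size that fits" and the
/// minimum size the target prescribes for repr(C) enums.
fn repr_c_enum_size(discrs: &[i64], target_min_bytes: u64) -> u64 {
    smallest_fitting_size(discrs).max(target_min_bytes)
}
```

With a target minimum of 4 bytes, `repr_c_enum_size(&[0, 1, 2], 4)` gives 4, while `repr_c_enum_size(&[i64::MAX], 4)` gives 8.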
Unfortunately, this doesn't always match what C compilers do. In particular, MSVC seems to *always* give enums a size of 4 bytes, whereas the algorithm above will give enums a size of up to 8 bytes on 64bit targets. Here's an example enum affected by this:
```
// We give this size 4 on 32bit targets (with a warning since the discriminant is wrapped to fit an isize)
// and size 8 on 64bit targets.
#[repr(C)]
enum OverflowingEnum {
A = 9223372036854775807, // i64::MAX
}
// MSVC always gives this size 4 (without any warning).
// GCC always gives it size 8 (without any warning).
// Godbolt: https://godbolt.org/z/P49MaYvMd
enum overflowing_enum {
OVERFLOWING_ENUM_A = 9223372036854775807,
};
```
If we look at the C standard, then up until C17, there was no official support for enums without an explicit underlying type and with discriminants that do not fit an `int`. With C23, this has changed: now enums have to grow automatically if there is an integer type that can hold all their discriminants. MSVC does not implement this part of C23.
Furthermore, Rust fundamentally cannot implement this (without major changes)! Enum discriminants work fundamentally differently in Rust and C:
- In Rust, every enum has a discriminant type entirely determined by its repr flags, and then the discriminant values must be const expressions of that type. For repr(C), that type is `isize`. So from the outset we interpret 9223372036854775807 as an isize literal and never give it a chance to be stored in a bigger type. If the discriminant is given as a literal without type annotation, it gets wrapped implicitly with a warning; otherwise the user has to write `as isize` explicitly and thus trigger the wrapping. Later, we can then decide to make the *tag* that stores the discriminant smaller than the discriminant type if all discriminant values fit into a smaller type, but those values have already all been made to fit an `isize` so nothing bigger than `isize` could ever come out of this. That makes the behavior of 32bit GCC impossible for us to match.
- In C, things flow the other way around: every discriminant value has a type determined entirely by its constant expression, and then the type for the enum is determined based on that. IOW, the expression can have *any type* a priori, different variants can even use a different type, and then the compiler is supposed to look at the resulting *values* (presumably as mathematical integers) and find a type that can hold them all. For the example above, 9223372036854775807 is a signed integer, so the compiler looks for the smallest signed type that can hold it, which is `long long`, and then uses that to compute the size of the enum (at least that's what C23 says should happen and GCC does this correctly).
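To make the Rust side of this concrete, here is a small sketch using an enum without any repr attribute (where the discriminant type is also `isize`). The names `Small`, `discriminant_of` and `tag_size` are made up for illustration, and the tag size reflects what current rustc does in practice, not a language guarantee.

```rust
use std::mem::size_of;

// The discriminant expressions are typed as `isize` up front;
// the stored tag may still end up narrower than `isize`.
#[allow(dead_code)]
enum Small {
    A = 1,
    B = 2u8 as isize, // other integer types need an explicit cast to `isize`
}

/// Reads a discriminant back as a value of the discriminant type.
fn discriminant_of(e: Small) -> isize {
    e as isize
}

/// Size of the enum, which here is just the size of its tag.
fn tag_size() -> usize {
    size_of::<Small>()
}
```

Writing `B = 2u8` without the cast is a type error, which shows that the expected type is fixed before the values are ever looked at.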
Realistically I think the best we can do is to not attempt to support C23 enums, and to require repr(C) enums to satisfy the C17 requirements: all discriminants must fit into a c_int. So that's what this PR implements, by adding a FCW for enums with discriminants that do not fit into `c_int`. As a slight extension, we do *not* lint enums where all discriminants fit into a `c_uint` (i.e. `unsigned int`): while C17 does (in my reading) not allow this, and C23 does not prescribe the size of such an enum, this seems to behave consistently across compilers (giving the enum the size of an `unsigned int`). IOW, the lint fires whenever our layout algorithm would make the enum larger than an `int`, irrespective of whether we pick a signed or unsigned discriminant. This extension was added because [crater found](https://github.com/rust-lang/rust/pull/147017#issuecomment-3357077199) multiple cases of such enums across the ecosystem.
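A minimal sketch of the condition described above, assuming the already-evaluated discriminants are available as mathematical integers (modeled as `i128`). The function name is hypothetical and this is not the code in the PR.

```rust
use std::convert::TryFrom;
use std::ffi::{c_int, c_uint};

/// Returns true if the enum would need more than an `int`: the discriminants
/// neither all fit into `c_int` nor all fit into `c_uint`.
fn exceeds_c_int_and_c_uint(discrs: &[i128]) -> bool {
    let all_fit_int = discrs.iter().all(|&d| c_int::try_from(d).is_ok());
    let all_fit_uint = discrs.iter().all(|&d| c_uint::try_from(d).is_ok());
    !(all_fit_int || all_fit_uint)
}
```

On a target with a 32-bit `int`, `[0, 4294967295]` is accepted (it fits `c_uint`), whereas `[-1, 4294967295]` is flagged because no single 32-bit type holds both values.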
Note that it is impossible to trigger this FCW on targets where isize and c_int are the same size (i.e., the typical 32bit target): since we interpret discriminant values as isize, by the time we look at them, they have already been wrapped. However, we have an existing lint (overflowing_literals) that should notify people when this kind of wrapping occurs implicitly. Also, 64bit targets are much more common. On the other hand, even on 64bit targets it is possible to fall into the same trap by writing a literal that is so big that it does not fit into isize, gets wrapped (triggering overflowing_literals), and the wrapped value fits into c_int. Furthermore, overflowing_literals is just a lint, so if it occurs in a dependency you won't notice. (Arguably there is also a more general problem here: for literals of type `usize`/`isize`, it is fairly easy to write code that only triggers `overflowing_literals` on 32bit targets, and to never see that lint if one develops on a 64bit target.)
Specifically, the above example triggers the FCW on 64bit targets, but on 32bit targets we get this err-by-default lint instead (which will be hidden if it occurs in a dependency):
```
error: literal out of range for `isize`
--> $DIR/repr-c-big-discriminant1.rs:16:9
|
LL | A = 9223372036854775807,
| ^^^^^^^^^^^^^^^^^^^
|
= note: the literal `9223372036854775807` does not fit into the type `isize` whose range is `-2147483648..=2147483647`
= note: `#[deny(overflowing_literals)]` on by default
```
Also see the tests added by this PR.
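The wrapping that hides the problem can be modeled with plain `as` casts. This is only an illustration, assuming a 32-bit `int`; the helper names are made up, and `i32`/`i64` stand in for `isize` on 32bit and 64bit targets respectively.

```rust
use std::ffi::c_int;

/// Models wrapping an out-of-range literal into `isize` on a 32bit target.
fn wrap_to_isize32(literal: u128) -> i32 {
    literal as i32
}

/// Models wrapping an out-of-range literal into `isize` on a 64bit target.
fn wrap_to_isize64(literal: u128) -> i64 {
    literal as i64
}

/// Whether an already-wrapped 64-bit value is in range for `c_int`.
fn fits_c_int(v: i64) -> bool {
    v >= c_int::MIN as i64 && v <= c_int::MAX as i64
}
```

Here `wrap_to_isize32(9223372036854775807)` is `-1`, so on a 32bit target the example enum looks perfectly small by the time the new check runs. Likewise `wrap_to_isize64(18446744073709551617)` (that is, 2^64 + 1) is `1`, which fits `c_int`, so the same trap exists on 64bit targets.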
This isn't perfect, but so far I don't think I have seen a better option. In https://github.com/rust-lang/rust/pull/146504 I tried adjusting our enum logic to make the size of the example enum above actually match what C compilers do, but that's a massive breaking change since we have to change the expected type of the discriminant expression from `isize` to `i64` or even `i128` -- so that seems like a no-go. To improve the lint we could analyze things on the HIR level and specifically catch "repr(C) enums with discriminants defined as literals that are too big", but that would have to be on top of the lint in this PR, I think, since we'd still want to also always check the actually evaluated value (which we can't always determine on the HIR level).
Cc `@workingjubilee` `@CAD97`
Extend attribute deduction to determine whether parameters using
indirect pass mode might have their address captured. Similarly to
the deduction of the `readonly` attribute, this information facilitates
memcpy optimizations.
deduced_param_attrs: check Freeze on monomorphic types.
`deduced_param_attrs` currently checks `Freeze` bound on polymorphic MIR. This pessimizes the deduction, as generic types are not `Freeze` by default.
This moves the check to the ABI adjustment.
Much of the compiler calls functions on Align projected from AbiAlign.
AbiAlign impls Deref to its inner Align, so we can simplify these away.
Also, it will minimize disruption when AbiAlign is removed.
For now, preserve usages that might resolve to PartialOrd or PartialEq,
as those have odd inference.
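A simplified sketch of the pattern, using stand-in types that only mirror the names above (the real definitions in the compiler have more to them):

```rust
use std::ops::Deref;

// Stand-in for the compiler's `Align`; stores log2 of the alignment.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
struct Align {
    pow2: u8,
}

impl Align {
    fn bytes(self) -> u64 {
        1u64 << self.pow2
    }
}

// Stand-in for `AbiAlign`, a thin wrapper around `Align`.
#[derive(Clone, Copy, Debug)]
struct AbiAlign {
    abi: Align,
}

impl Deref for AbiAlign {
    type Target = Align;
    fn deref(&self) -> &Align {
        &self.abi
    }
}

/// Before: explicit projection to the inner `Align`.
fn bytes_via_projection(a: AbiAlign) -> u64 {
    a.abi.bytes()
}

/// After: the method call auto-derefs to `Align`.
fn bytes_via_deref(a: AbiAlign) -> u64 {
    a.bytes()
}

/// Comparisons keep the explicit projection, since operators do not auto-deref.
fn is_less_aligned(a: AbiAlign, b: AbiAlign) -> bool {
    a.abi < b.abi
}
```

Both `bytes_via_projection` and `bytes_via_deref` return the same value; only method calls benefit from the `Deref` impl, which is why the comparison uses are left alone.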
Add an attribute to check the number of lanes in a SIMD vector after monomorphization
Allows std::simd to drop the `LaneCount<N>: SupportedLaneCount` trait and maintain good error messages.
Also, extends rust-lang/rust#145967 by including spans in layout errors for all ADTs.
r? `@RalfJung`
cc `@workingjubilee` `@programmerjake`
rename erase_regions to erase_and_anonymize_regions
I find it consistently confusing that `erase_regions` does more than replacing regions with `'erased`. It also makes some code look real goofy to be writing manual folders to erase regions with a comment saying "we can't use erase regions" :> or code that re-calls erase_regions on types with regions already erased just to anonymize all the bound regions.
r? lcnr
idk how i feel about the name being almost twice as long now
This was done in #145740 and #145947. It is causing problems for people
using r-a on anything that uses the rustc-dev rustup package, e.g. Miri,
clippy.
This repository has lots of submodules and subtrees and various
different projects are carved out of pieces of it. It seems like
`[workspace.dependencies]` will just be more trouble than it's worth.
`-Znext-solver`: support non-defining uses in closures
Cleaned up version of rust-lang/rust#139587, finishing the implementation of https://github.com/rust-lang/types-team/issues/129. This does not affect stable. The reasoning for why this is the case is subtle however.
## What does it do
We split `do_mir_borrowck` into `borrowck_collect_region_constraints` and `borrowck_check_region_constraints`, where `borrowck_collect_region_constraints` returns an enormous `CollectRegionConstraintsResult` struct which contains all the relevant data to actually handle opaque type uses and to check the region constraints later on.
`query mir_borrowck` now simply calls `BorrowCheckRootCtxt::do_mir_borrowck` which starts by iterating over all nested bodies of the current function - visiting nested bodies before their parents - and computing their `CollectRegionConstraintsResult`.
After we've collected all constraints it's time to actually compute the concrete types for the opaques defined by this function. With this PR we now compute the concrete types of opaques for each body before using them to check the non-defining uses of any of them.
After we've computed the concrete types by using all bodies, we use `apply_computed_concrete_opaque_types` for each body to constrain non-defining uses, before finally finishing with `borrowck_check_region_constraints`. We always visit nested bodies before their parents when doing this.
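The ordering described above can be modeled with a toy sketch. Nothing here is compiler code: `Body`, `collect_all`, `apply_and_check_all` and `borrowck_root` are made up, the phase names in the log just echo the functions mentioned above, and running apply and check back to back per body is an assumption of this sketch.

```rust
/// Toy stand-in for a MIR body with its nested bodies (closures etc.).
struct Body {
    name: &'static str,
    nested: Vec<Body>,
}

/// Phase 1: collect constraints, nested bodies before their parents.
fn collect_all(body: &Body, log: &mut Vec<String>) {
    for child in &body.nested {
        collect_all(child, log);
    }
    log.push(format!("collect_region_constraints({})", body.name));
}

/// Phase 3: constrain non-defining uses, then check, again nested first.
fn apply_and_check_all(body: &Body, log: &mut Vec<String>) {
    for child in &body.nested {
        apply_and_check_all(child, log);
    }
    log.push(format!("apply_computed_concrete_opaque_types({})", body.name));
    log.push(format!("check_region_constraints({})", body.name));
}

/// Records the order in which the phases run for a root function.
fn borrowck_root(root: &Body) -> Vec<String> {
    let mut log = Vec::new();
    collect_all(root, &mut log);
    // Phase 2: concrete opaque types are computed from all bodies together.
    log.push("compute_concrete_opaque_types".to_string());
    apply_and_check_all(root, &mut log);
    log
}
```

The point of the sketch is that no body is checked until the concrete types have been computed from every body, so a non-defining use in a closure can be checked against a type defined by its parent.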
## `ClosureRegionRequirements`
As we only call `borrowck_collect_region_constraints` for nested bodies before type checking the parent, we can't simply use the final `ClosureRegionRequirements` of the nested body during MIR type check. We instead track that we need to apply these requirements in `deferred_closure_requirements`.
We now manually apply the final closure requirements to each body after handling opaque types.
This works, except that we may need the region constraints of nested bodies to successfully define an opaque type in the parent. This is handled by using a new `fn compute_closure_requirements_modulo_opaques` which duplicates region checking - while ignoring any errors - before we've added the constraints from `apply_computed_concrete_opaque_types`. This is necessary for a lot of async tests, as pretty much the entire function is inside of an async block while the opaque type gets defined in the parent.
As a performance optimization, we only use `fn compute_closure_requirements_modulo_opaques` in case the nested body actually depends on any opaque types. Otherwise we eagerly call `borrowck_check_region_constraints` and apply the final closure region requirements right away.
## Impact on stable code
Handling the opaque type uses in the parent function now only uses the closure requirements *modulo opaques*, while it previously also considered member constraints from nested bodies. `External` regions are never valid choice regions. Also, member constraints will never constrain a member region if it is required to be outlived by an external region, as that fails the upper-bound check. 564ee21912/compiler/rustc_borrowck/src/region_infer/opaque_types/member_constraints.rs (L90-L96)
Member constraints therefore never add constraints for external regions :>
r? `@BoxyUwU`
`&Freeze` parameters are not only `readonly` within the function,
but any captures of the pointer can also only be used for reads.
This can now be encoded using the `captures(address, read_provenance)`
attribute.
This restricts the uses of the unadjusted ABI to LLVM intrinsics. The
Rust ABI works fine for the thread-local shim as it always returns
pointers directly like the backend expects.