Port `#[rustc_layout_scalar_valid_range_start/end]` to the new attrib…
Ports `rustc_layout_scalar_valid_range_start` and `rustc_layout_scalar_valid_range_end` to the new attribute parsing infrastructure for https://github.com/rust-lang/rust/issues/131229#issuecomment-2971353197
r? `@jdonszelmann`
inherit `#[align]` from trait method prototypes
````@workingjubilee```` this seems straightforward enough. Now that we're planning to make `-Cmin-function-alignment` a target modifier, I don't think there are any cross-crate complications here?
````@Jules-Bertholet```` is this the behavior you had in mind? In particular the inheritance of the attribute of a default impl is maybe a bit unintuitive at first? (but I think it's ok if that behavior is explicitly documented).
r? ghost
Introduce `ByteSymbol`
It's like `Symbol` but for byte strings. The interner is now used for both `Symbol` and `ByteSymbol`. E.g. if you intern `"dog"` and `b"dog"` you'll get a `Symbol` and a `ByteSymbol` with the same index and the characters will only be stored once.
The motivation for this is to eliminate the `Arc`s in `ast::LitKind`, to make `ast::LitKind` impl `Copy`, and to avoid the need to arena-allocate `ast::LitKind` in HIR. The latter change reduces peak memory by a non-trivial amount on literal-heavy benchmarks such as `deep-vector` and `tuple-stress`.
`Encoder`, `Decoder`, `SpanEncoder`, and `SpanDecoder` all get some changes so that they can handle normal strings and byte strings.
give Pointer::into_parts a more scary name and offer a safer alternative
`into_parts` is a bit too innocent of a name for a somewhat subtle operation.
r? `@oli-obk`
It's like `Symbol` but for byte strings. The interner is now used for
both `Symbol` and `ByteSymbol`. E.g. if you intern `"dog"` and `b"dog"`
you'll get a `Symbol` and a `ByteSymbol` with the same index and the
characters will only be stored once.
The motivation for this is to eliminate the `Arc`s in `ast::LitKind`, to
make `ast::LitKind` impl `Copy`, and to avoid the need to arena-allocate
`ast::LitKind` in HIR. The latter change reduces peak memory by a
non-trivial amount on literal-heavy benchmarks such as `deep-vector` and
`tuple-stress`.
`Encoder`, `Decoder`, `SpanEncoder`, and `SpanDecoder` all get some
changes so that they can handle normal strings and byte strings.
This change does slow down compilation of programs that use
`include_bytes!` on large files, because the contents of those files are
now interned (hashed). This makes `include_bytes!` more similar to
`include_str!`, though `include_bytes!` contents still aren't escaped,
and hashing is still much cheaper than escaping.
Remove unused feature gates
After finding some unused feature gates in rust-lang/rust#143155 , I wrote a small script to see if I can find any others.
And I did. Not a lot, but still a small win 😁
Contains a few instances of `iter_from_coroutine` that can be removed due to rust-lang/rust#142801 (I guess).
Improve documentation of `TagEncoding`
This PR is follow-up from the [discussion here](https://rust-lang.zulipchat.com/#narrow/channel/182449-t-compiler.2Fhelp/topic/.E2.9C.94.20VariantId.3DDiscriminant.20when.20tag.20is.20niche.20encoded.3F/with/524384295).
It aims at making the `TagEncoding` documentation less ambiguous and more detailed with references to relevant implementation sides. It especially clears up the ambiguous use of discriminant/variant index, which sparked the discussion referenced above.
PS: While working with layout data, I somehow ended up looking at the docs for `FakeBorrowKind` and noticed that the one example was not in a doc comment. I hope that this is minor enough of a fix for it to be okay in this otherwise unrelated PR.
Only compute recursive callees once.
Inlining MIR in a cyclic call graph may create query cycles, which are ICEs. The current implementation `mir_callgraph_reachable(inlining_candidate, being_optimized)` checks if calling `inlining_candidate` may cycle back to `being_optimized` that we are currently inlining into.
This PR replaces this device with query `mir_callgraph_cyclic(being_optimized)` which searches the call graph for all cycles going back to `being_optimized`, and returns the set of functions involved in those cycles.
This is a tradeoff:
- in the current implementation, we perform more walks, but shallower;
- in this new implementation, we perform fewer walks, but exhaust the graph.
I'd have liked to compute this using some kind of SCC, but generic parameters make resolution path-dependent, so usual graph algorithms do not apply.
Insert checks for enum discriminants when debug assertions are enabled
Similar to the existing null-pointer and alignment checks, this checks for valid enum discriminants on creation of enums through unsafe transmutes. Essentially this sanitizes patterns like the following:
```rust
let val: MyEnum = unsafe { std::mem::transmute<u32, MyEnum>(42) };
```
An extension of this check will be done in a follow-up that explicitly sanitizes for extern enum values that come into Rust from e.g. C/C++.
This check is similar to Miri's capabilities of checking for valid construction of enum values.
This PR is inspired by saethlin@'s PR
https://github.com/rust-lang/rust/pull/104862. Thank you so much for keeping this code up and the detailed comments!
I also pair-programmed large parts of this together with vabr-g@.
r? `@saethlin`
New const traits syntax
This PR only affects the AST and doesn't actually change anything semantically.
All occurrences of `~const` outside of libcore have been replaced by `[const]`. Within libcore we have to wait for rustfmt to be bumped in the bootstrap compiler. This will happen "automatically" (when rustfmt is run) during the bootstrap bump, as rustfmt converts `~const` into `[const]`. After this we can remove the `~const` support from the parser
Caveat discovered during impl: there is no legacy bare trait object recovery for `[const] Trait` as that snippet in type position goes down the slice /array parsing code and will error
r? ``@fee1-dead``
cc ``@nikomatsakis`` ``@traviscross`` ``@compiler-errors``
Similar to the existing nullpointer and alignment checks, this checks
for valid enum discriminants on creation of enums through unsafe
transmutes. Essentially this sanitizes patterns like the following:
```rust
let val: MyEnum = unsafe { std::mem::transmute<u32, MyEnum>(42) };
```
An extension of this check will be done in a follow-up that explicitly
sanitizes for extern enum values that come into Rust from e.g. C/C++.
This check is similar to Miri's capabilities of checking for valid
construction of enum values.
This PR is inspired by saethlin@'s PR
https://github.com/rust-lang/rust/pull/104862. Thank you so much for
keeping this code up and the detailed comments!
I also pair-programmed large parts of this together with vabr-g@.
This centralizes the placeholder type error reporting in one location, but it also exposes the granularity at which we convert things from hir to ty more. E.g. previously infer types in where bounds were errored together with the function signature, but now they are independent.
Add note to `find_const_ty_from_env`
Add a note to `find_const_ty_from_env` to explain why it has an `unwrap` which "often" causes ICEs.
Also, uplift it into the new trait solver. This avoids needing to go through the interner to call this method which is otherwise an inherent method in the compiler. I can remove this part if desired.
r? `@boxyuwu`
make `tidy-alphabetical` use a natural sort
The idea here is that these lines should be correctly sorted, even though a naive string comparison would say they are not:
```
foo2
foo10
```
This is the ["natural sort order"](https://en.wikipedia.org/wiki/Natural_sort_order).
There is more discussion in [#t-compiler/help > tidy natural sort](https://rust-lang.zulipchat.com/#narrow/channel/182449-t-compiler.2Fhelp/topic/tidy.20natural.20sort/with/519111079)
Unfortunately, no standard sorting tools are smart enough to to this automatically (casting some doubt on whether we should make this change). Here are some sort outputs:
```
> cat foo.txt | sort
foo
foo1
foo10
foo2
mp
mp1e2
np",
np1e2",
> cat foo.txt | sort -n
foo
foo1
foo10
foo2
mp
mp1e2
np",
np1e2",
> cat foo.txt | sort -V
foo
foo1
foo2
foo10
mp
mp1e2
np1e2",
np",
```
Disappointingly, "numeric" sort does not actually have the behavior we want. It only sorts by numeric value if the line starts with a number. The "version" sort looks promising, but does something very unintuitive if you look at the final 4 values. None of the other options seem to have the desired behavior in all cases:
```
-b, --ignore-leading-blanks ignore leading blanks
-d, --dictionary-order consider only blanks and alphanumeric characters
-f, --ignore-case fold lower case to upper case characters
-g, --general-numeric-sort compare according to general numerical value
-i, --ignore-nonprinting consider only printable characters
-M, --month-sort compare (unknown) < 'JAN' < ... < 'DEC'
-h, --human-numeric-sort compare human readable numbers (e.g., 2K 1G)
-n, --numeric-sort compare according to string numerical value
-R, --random-sort shuffle, but group identical keys. See shuf(1)
--random-source=FILE get random bytes from FILE
-r, --reverse reverse the result of comparisons
--sort=WORD sort according to WORD:
general-numeric -g, human-numeric -h, month -M,
numeric -n, random -R, version -V
-V, --version-sort natural sort of (version) numbers within text
```
r? ```@Noratrieb``` (it sounded like you know this code?)
Withdraw the claim `extern "C-cmse-nonsecure-*"` always matches `extern "C"`
We currently claim that `extern "C-cmse-nonsecure-*"` ABIs will always match `extern "C"`, but that seems... **optimistic** when one considers that `extern "C"` is ambiguous enough to be redefined in ways we may not want the Cortex M Security Extensions ABIs to mirror. If some configuration, feature, or other platform quirk that applied to Arm CPUs with CMSE would modify the `extern "C"` ABI, it does not seem like we should guarantee that also applies to the `extern "cmse-nonsecure-*"` ABIs. Anything involving target modifiers that might affect register availability or usage could make us liars if, for instance, clang decides those apply to normal C functions but not ones with the CMSE attributes, but we still want to have interop with the C compiler.
We simply do not control enough of the factors involved to both force these ABIs to match and still provide useful interop, so we shouldn't implicitly promise they do. We should leave this judgement call to the decisions of platform experts who can afford to keep up with the latest news from Cambridge, instead of enshrining today's hopeful guess forever in Rust's permitted ABIs.
It's a bit weird anyways.
- The attributes are `__attribute__((cmse_nonsecure_call))` and `__attribute__((cmse_nonsecure_entry))`, so the obvious choice is `extern "cmse-nonsecure-call"` and `extern "cmse-nonsecure-entry"`.
- We do not prefix any other ABI that reflects (or even *is*) a C ABI with "C-", with the exception of the Rust-defined `extern "C-unwind`", e.g. we do not have `extern "C-aapcs"` or `extern "C-sysv64"`.
Tracking issues:
- rust-lang/rust#75835
- rust-lang/rust#81391
Taint body on invalid call ABI
Fixes https://github.com/rust-lang/rust/issues/142969
I'm not certain if there are any other paths that should be tainted, but they would operate similarly. Perhaps pointer coercion.
Introduces `extern "rust-invalid"` for testing purposes.
r? ```@workingjubilee``` or ```@oli-obk``` (or anyone)
Add `#[loop_match]` for improved DFA codegen
tracking issue: https://github.com/rust-lang/rust/issues/132306
project goal: https://github.com/rust-lang/rust-project-goals/issues/258
This PR adds the `#[loop_match]` attribute, which aims to improve code generation for state machines. For some (very exciting) benchmarks, see https://github.com/rust-lang/rust-project-goals/issues/258#issuecomment-2732965199
Currently, a very restricted syntax pattern is accepted. We'd like to get feedback and merge this now before we go too far in a direction that others have concerns with.
## current state
We accept code that looks like this
```rust
#[loop_match]
loop {
state = 'blk: {
match state {
State::A => {
#[const_continue]
break 'blk State::B
}
State::B => { /* ... */ }
/* ... */
}
}
}
```
- a loop should have the same semantics with and without `#[loop_match]`: normal `continue` and `break` continue to work
- `#[const_continue]` is only allowed in loops annotated with `#[loop_match]`
- the loop body needs to have this particular shape (a single assignment to the match scrutinee, with the body a labelled block containing just a match)
## future work
- perform const evaluation on the `break` value
- support more state/scrutinee types
## maybe future work
- allow `continue 'label value` syntax, which `#[const_continue]` could then use.
- allow the match to be on an arbitrary expression (e.g. `State::Initial`)
- attempt to also optimize `break`/`continue` expressions that are not marked with `#[const_continue]`
r? ``@traviscross``
rustc_std_internal_symbol is meant to call functions from crates where
there is no direct dependency on said crate. As they either have to be
added to symbols.o or rustc has to introduce an implicit dependency on
them to avoid linker errors. The latter is done for some things like the
panic runtime, but adding these symbols to symbols.o allows removing
those implicit dependencies.
Merge unboxed trait object error suggestion into regular dyn incompat error
Another hir-walker removed from the well-formed queries. This error was always a duplicate of another, but it was able to provide more information because it could invoke `is_dyn_compatible` without worrying about cycle errors. That's also the reason we can't put the error directly into hir_ty_lowering when lowering a `dyn Trait` within an associated item signature. So instead I packed it into the error handling of wf obligation checking.