This recovers some instruction count regressions after changing most Pat
parsing functions return a non-boxed Pat. It seems to be beneficial to
break out a #[cold] parsing recovery function when the function includes
more parsing, presumably because this requires more stack space and/or
mem copies when LLVM optimizes the wrong path.
```
error: unexpected `...`
--> $DIR/varargs-in-closure-isnt-supported.rs:5:20
|
LL | let mut lol = |...| ();
| ^^^ not a valid pattern
|
= note: C-variadic type `...` is not allowed here
```
Include whitespace in "remove |" suggestion and make it hidden
Tweak error rendering of patterns with an extra `|` on either end.
Built on #137409. Only last commit is relevant.
? ``@compiler-errors``
By replacing them with `{Open,Close}{Param,Brace,Bracket,Invisible}`.
PR #137902 made `ast::TokenKind` more like `lexer::TokenKind` by
replacing the compound `BinOp{,Eq}(BinOpToken)` variants with fieldless
variants `Plus`, `Minus`, `Star`, etc. This commit does a similar thing
with delimiters. It also makes `ast::TokenKind` more similar to
`parser::TokenType`.
This requires a few new methods:
- `TokenKind::is_{,open_,close_}delim()` replace various kinds of
pattern matches.
- `Delimiter::as_{open,close}_token_kind` are used to convert
`Delimiter` values to `TokenKind`.
Despite these additions, it's a net reduction in lines of code. This is
because e.g. `token::OpenParen` is so much shorter than
`token::OpenDelim(Delimiter::Parenthesis)` that many multi-line forms
reduce to single line forms. And many places where the number of lines
doesn't change are still easier to read, just because the names are
shorter, e.g.:
```
- } else if self.token != token::CloseDelim(Delimiter::Brace) {
+ } else if self.token != token::CloseBrace {
```
Notes about tests:
- tests/ui/rfcs/rfc-2294-if-let-guard/feature-gate.rs: some messages are
now duplicated due to repeated parsing.
- tests/ui/rfcs/rfc-2497-if-let-chains/disallowed-positions.rs: ditto.
- `tests/ui/proc-macro/macro-rules-derive-cfg.rs`: the diff looks large
but the only difference is the insertion of a single
invisible-delimited group around a metavar.
- `tests/ui/attributes/nonterminal-expansion.rs`: a slight span
degradation, somehow related to the recent massive attr parsing
rewrite (#135726). I couldn't work out exactly what is going wrong,
but I don't think it's worth holding things up for a single slightly
suboptimal error message.
For consistency with `rustc_lexer::TokenKind::Bang`, and because other
`ast::TokenKind` variants generally have syntactic names instead of
semantic names (e.g. `Star` and `DotDot` instead of `Mul` and `Range`).
`BinOpToken` is badly named, because it only covers the assignable
binary ops and excludes comparisons and `&&`/`||`. Its use in
`ast::TokenKind` does allow a small amount of code sharing, but it's a
clumsy factoring.
This commit removes `ast::TokenKind::BinOp{,Eq}`, replacing each one
with 10 individual variants. This makes `ast::TokenKind` more similar to
`rustc_lexer::TokenKind`, which has individual variants for all
operators.
Although the number of lines of code increases, the number of chars
decreases due to the frequent use of shorter names like `token::Plus`
instead of `token::BinOp(BinOpToken::Plus)`.
The one notable test change is `tests/ui/macros/trace_faulty_macros.rs`.
This commit removes the complicated `Interpolated` handling in
`expected_expression_found` that results in a longer error message. But
I think the new, shorter message is actually an improvement.
The original complaint was in #71039, when the error message started
with "error: expected expression, found `1 + 1`". That was confusing
because `1 + 1` is an expression. Other than that, the reporter said
"the whole error message is not too bad if you ignore the first line".
Subsequently, extra complexity and wording was added to the error
message. But I don't think the extra wording actually helps all that
much. In particular, it still says of the `1+1` that "this is expected
to be expression". This repeats the problem from the original complaint!
This commit removes the extra complexity, reverting to a simpler error
message. This is primarily because the traversal is a pain without
`Interpolated` tokens. Nonetheless, I think the error message is
*improved*. It now starts with "expected expression, found `pat`
metavariable", which is much clearer and the real problem. It also
doesn't say anything specific about `1+1`, which is good, because the
`1+1` isn't really relevant to the error -- it's the `$e:pat` that's
important.
This removes support for attributes on struct field rest patterns (the `..`) from the parser.
Previously they were being parsed but dropped from the AST, so didn't work and were deleted by rustfmt.
The parser pushes a `TokenType` to `Parser::expected_token_types` on
every call to the various `check`/`eat` methods, and clears it on every
call to `bump`. Some of those `TokenType` values are full tokens that
require cloning and dropping. This is a *lot* of work for something
that is only used in error messages and it accounts for a significant
fraction of parsing execution time.
This commit overhauls `TokenType` so that `Parser::expected_token_types`
can be implemented as a bitset. This requires changing `TokenType` to a
C-style parameterless enum, and adding `TokenTypeSet` which uses a
`u128` for the bits. (The new `TokenType` has 105 variants.)
The new types `ExpTokenPair` and `ExpKeywordPair` are now arguments to
the `check`/`eat` methods. This is for maximum speed. The elements in
the pairs are always statically known; e.g. a
`token::BinOp(token::Star)` is always paired with a `TokenType::Star`.
So we now compute `TokenType`s in advance and pass them in to
`check`/`eat` rather than the current approach of constructing them on
insertion into `expected_token_types`.
Values of these pair types can be produced by the new `exp!` macro,
which is used at every `check`/`eat` call site. The macro is for
convenience, allowing any pair to be generated from a single identifier.
The ident/keyword filtering in `expected_one_of_not_found` is no longer
necessary. It was there to account for some sloppiness in
`TokenKind`/`TokenType` comparisons.
The existing `TokenType` is moved to a new file `token_type.rs`, and all
its new infrastructure is added to that file. There is more boilerplate
code than I would like, but I can't see how to make it shorter.
Use field init shorthand where possible
Field init shorthand allows writing initializers like `tcx: tcx` as
`tcx`. The compiler already uses it extensively. Fix the last few places
where it isn't yet used.
EDIT: this PR also updates `rustfmt.toml` to set
`use_field_init_shorthand = true`.
`rustc_span::symbol` defines some things that are re-exported from
`rustc_span`, such as `Symbol` and `sym`. But it doesn't re-export some
closely related things such as `Ident` and `kw`. So you can do `use
rustc_span::{Symbol, sym}` but you have to do `use
rustc_span::symbol::{Ident, kw}`, which is inconsistent for no good
reason.
This commit re-exports `Ident`, `kw`, and `MacroRulesNormalizedIdent`,
and changes many `rustc_span::symbol::` qualifiers in `compiler/` to
`rustc_span::`. This is a 200+ net line of code reduction, mostly
because many files with two `use rustc_span` items can be reduced to
one.
Field init shorthand allows writing initializers like `tcx: tcx` as
`tcx`. The compiler already uses it extensively. Fix the last few places
where it isn't yet used.
When we recover from a pattern parse error, or a pattern uses `..`, we keep track of that and affect resolution error for missing bindings that could have been provided by that pattern. We differentiate between `..` and parse recovery. We silence resolution errors likely caused by the pattern parse error.
```
error[E0425]: cannot find value `title` in this scope
--> $DIR/struct-pattern-with-missing-fields-resolve-error.rs:19:30
|
LL | println!("[{}]({})", title, url);
| ^^^^^ not found in this scope
|
note: `Website` has a field `title` which could have been included in this pattern, but it wasn't
--> $DIR/struct-pattern-with-missing-fields-resolve-error.rs:17:12
|
LL | / struct Website {
LL | | url: String,
LL | | title: Option<String> ,
| | ----- defined here
LL | | }
| |_-
...
LL | if let Website { url, .. } = website {
| ^^^^^^^^^^^^^^^^^^^ this pattern doesn't include `title`, which is available in `Website`
```
Fix#74863.
Inline ExprPrecedence::order into Expr::precedence
The representation of expression precedence in rustc_ast has been an obstacle to further improvements in the pretty-printer (continuing from #119105 and #119427).
Previously the operation of *"does this expression have lower precedence than that one"* (relevant for parenthesis insertion in macro-generated syntax trees) consisted of 3 steps:
1. Convert `Expr` to `ExprPrecedence` using `.precedence()`
2. Convert `ExprPrecedence` to `i8` using `.order()`
3. Compare using `<`
As far as I can guess, the reason for the separation between `precedence()` and `order()` was so that both `rustc_ast::Expr` and `rustc_hir::Expr` could convert as straightforwardly as possible to the same `ExprPrecedence` enum, and then the more finicky logic performed by `order` could be present just once.
The mapping between `Expr` and `ExprPrecedence` was intended to be as straightforward as possible:
```rust
match self.kind {
ExprKind::Closure(..) => ExprPrecedence::Closure,
...
}
```
although there were exceptions of both many-to-one, and one-to-many:
```rust
ExprKind::Underscore => ExprPrecedence::Path,
ExprKind::Path(..) => ExprPrecedence::Path,
...
ExprKind::Match(_, _, MatchKind::Prefix) => ExprPrecedence::Match,
ExprKind::Match(_, _, MatchKind::Postfix) => ExprPrecedence::PostfixMatch,
```
Where the nature of `ExprPrecedence` becomes problematic is when a single expression kind might be associated with multiple different precedence levels depending on context (outside the expression) and contents (inside the expression). For example consider what is the precedence of an ExprKind::Closure `$closure`. Well, on the left-hand side of a binary operator it would need parentheses in order to avoid the trailing binary operator being absorbed into the closure body: `($closure) + Rhs`, so the precedence is something lower than that of `+`. But on the right-hand side of a binary operator, a closure is just a straightforward prefix expression like a unary op, which is a relatively high precedence level, higher than binops but lower than method calls: `Lhs + $closure` is fine without parens but `($closure).method()` needs them. But as a third case, if the closure contains an explicit return type, then the precedence is an even higher level than that, never needing parenthesization even in a binop left-hand side or method call: `|| -> bool { false } + Rhs` or `|| -> bool { false }.method()`.
You can see that trying to capture all of this resolution about expressions into `ExprPrecedence` violates the intention of `ExprPrecedence` being a straightforward one-to-one correspondence from each AST and HIR `ExprKind` variant. It would be possible to attempt that by doing stuff like `ExprPrecedence::Closure(Side::Leading, ReturnType::No)`, but I don't foresee the original envisioned benefit of the `precedence()`/`order()` distinction being retained in this approach. Instead I want to move toward a model that Syn has been using successfully. In Syn, there is a Precedence enum but it differs from rustc in the following ways:
- There are [relatively few variants](https://github.com/dtolnay/syn/blob/2.0.87/src/precedence.rs#L11-L47) compared to rustc's `ExprPrecedence`. For example there is no distinction at the precedence level between returns and closures, or between loops and method calls.
- We distinguish between [leading](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L293) and [trailing](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L309) precedence, taking into account an expression's context such as what token follows it (for various syntactic bail-outs in Rust's grammar, like ambiguities around break-with-value) and how it relates to operators from the surrounding syntax tree.
- There are no hardcoded mysterious integer quantities like rustc's `PREC_CLOSURE = -40`. All precedence comparisons are performed via PartialOrd on a C-like enum.
This PR is just a first step in these changes. As you can tell from Syn, I definitely think there is value in having a dedicated type to represent precedence, instead of what `order()` is doing with `i8`. But that is a whole separate adventure because rustc_ast doesn't even agree consistently on `i8` being the type for precedence order; `AssocOp::precedence` instead uses `usize` and there are casts in both directions. It is likely that a type called `ExprPrecedence` will re-appear, but it will look substantially different from the one that existed before this PR.
Separate `RemoveLet` span into primary span for `let` and removal
suggestion span for `let `, so that primary span does not include
whitespace.
Fixes: #133031
Signed-off-by: Tyrone Wu <wudevelops@gmail.com>
This commit does the following.
- Renames `collect_tokens_trailing_token` as `collect_tokens`, because
(a) it's annoying long, and (b) the `_trailing_token` bit is less
accurate now that its types have changed.
- In `collect_tokens`, adds a `Option<CollectPos>` argument and a
`UsePreAttrPos` in the return type of `f`. These are used in
`parse_expr_force_collect` (for vanilla expressions) and in
`parse_stmt_without_recovery` (for two different cases of expression
statements). Together these ensure are enough to fix all the problems
with token collection and assoc expressions. The changes to the
`stringify.rs` test demonstrate some of these.
- Adds a new test. The code in this test was causing an assertion
failure prior to this commit, due to an invalid `NodeRange`.
The extra complexity is annoying, but necessary to fix the existing
problems.
This pre-existing type is suitable for use with the return value of the
`f` parameter in `collect_tokens_trailing_token`. The more descriptive
name will be useful because the next commit will add another boolean
value to the return value of `f`.
`parse_expr_assoc_with` has an awkward structure -- sometimes the lhs is
already parsed. This commit splits the post-lhs part into a new method
`parse_expr_assoc_rest_with`, which makes everything shorter and
simpler.