interpret/allocation: expose init + write_wildcards on a range
Part of https://github.com/rust-lang/miri/pull/4456, so that we can mark down when a foreign access to our memory happened. Should this also move `prepare_for_native_access()` itself into Miri, given that everything there can be implemented on Miri's side?
r? `````@RalfJung`````
Allow custom default address spaces and parse `p-` specifications in the datalayout string
Some targets, such as CHERI, use as default an address space different from the "normal" default address space `0` (in the case of CHERI, [200 is used](https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-877.pdf)). Currently, `rustc` does not allow to specify custom address spaces and does not take into consideration [`p-` specifications in the datalayout string](https://llvm.org/docs/LangRef.html#langref-datalayout).
This patch tries to mitigate these problems by allowing targets to define a custom default address space (while keeping the default value to address space `0`) and adding the code to parse the `p-` specifications in `rustc_abi`. The main changes are that `TargetDataLayout` now uses functions to refer to pointer-related informations, instead of having specific fields for the size and alignment of pointers in the default address space; furthermore, the two `pointer_size` and `pointer_align` fields in `TargetDataLayout` are replaced with an `FxHashMap` that holds info for all the possible address spaces, as parsed by the `p-` specifications.
The potential performance drawbacks of not having ad-hoc fields for the default address space will be tested in this PR's CI run.
r? workingjubilee
use `is_multiple_of` and `div_ceil`
In tricky logic, these functions are much more informative than the manual implementations. They also catch subtle bugs:
- the manual `is_multiple_of` often does not handle division by zero
- manual `div_ceil` often does not consider overflow
The transformation is free for `is_multiple_of` if the divisor is compile-time known to be non-zero. For `div_ceil` there is a small cost to considering overflow. Here is some assembly https://godbolt.org/z/5zP8KaE1d.
setup typos check in CI
This allows to check typos in CI, currently for compiler only (to reduce commit size with fixes). With current setup, exclude list is quite short, so it worth trying?
Also includes commits with actual typo fixes.
MCP: https://github.com/rust-lang/compiler-team/issues/817
typos check currently turned for:
* ./compiler
* ./library
* ./src/bootstrap
* ./src/librustdoc
After merging, PRs which enables checks for other crates (tools) can be implemented too.
Found typos will **not break** other jobs immediately: (tests, building compiler for perf run). Job will be marked as red on completion in ~ 20 secs, so you will not forget to fix it whenever you want, before merging pr.
Check typos: `python x.py test tidy --extra-checks=spellcheck`
Apply typo fixes: `python x.py test tidy --extra-checks=spellcheck:fix` (in case if there only 1 suggestion of each typo)
Current fail in this pr is expected and shows how typo errors emitted. Commit with error will be removed after r+.
miri: improve errors for type validity assertion failures
Miri has pretty nice errors for type validity violations, printing which field in the type the problem occurs at and so on.
However, we don't see these errors when using e.g. `mem::zeroed` as that uses `assert_zero_valid` to bail out before Miri can detect the UB.
Similar to what we did with `@saethlin's` UB checks, I think we should disable such language UB checks in Miri so that we can get better error messages. If we go for this we should probably say this in the intrinsic docs as well so that people don't think they can rely on these intrinsics catching anything.
Furthermore, I slightly changed `MaybeUninit::assume_init` so that the `.value` field does not show up in error messages any more.
`@rust-lang/miri` what do you think?
give Pointer::into_parts a more scary name and offer a safer alternative
`into_parts` is a bit too innocent of a name for a somewhat subtle operation.
r? `@oli-obk`
Do not include NUL-terminator in computed length
This PR contains just the first commit of rust-lang/rust#142579 which changes it so that the string length stored in the `Location` is the length of the `&str` rather than the length of the `&CStr`. Since most users will want the `&str` length, it seems better to optimize for that use-case.
There should be no visible changes in the behavior or API.
Insert checks for enum discriminants when debug assertions are enabled
Similar to the existing null-pointer and alignment checks, this checks for valid enum discriminants on creation of enums through unsafe transmutes. Essentially this sanitizes patterns like the following:
```rust
let val: MyEnum = unsafe { std::mem::transmute<u32, MyEnum>(42) };
```
An extension of this check will be done in a follow-up that explicitly sanitizes for extern enum values that come into Rust from e.g. C/C++.
This check is similar to Miri's capabilities of checking for valid construction of enum values.
This PR is inspired by saethlin@'s PR
https://github.com/rust-lang/rust/pull/104862. Thank you so much for keeping this code up and the detailed comments!
I also pair-programmed large parts of this together with vabr-g@.
r? `@saethlin`
const checks for lifetime-extended temporaries: avoid 'top-level scope' terminology
This error recently got changed in https://github.com/rust-lang/rust/pull/140942 to use the terminology of "top-level scope", but after further discussion in https://github.com/rust-lang/reference/pull/1865 it seems the reference will not be using that terminology after all. So let's also remove it from the compiler again, and let's focus on what actually happens with these temporaries: their lifetime is extended until the end of the program.
r? ``@oli-obk`` ``@traviscross``
New const traits syntax
This PR only affects the AST and doesn't actually change anything semantically.
All occurrences of `~const` outside of libcore have been replaced by `[const]`. Within libcore we have to wait for rustfmt to be bumped in the bootstrap compiler. This will happen "automatically" (when rustfmt is run) during the bootstrap bump, as rustfmt converts `~const` into `[const]`. After this we can remove the `~const` support from the parser
Caveat discovered during impl: there is no legacy bare trait object recovery for `[const] Trait` as that snippet in type position goes down the slice /array parsing code and will error
r? ``@fee1-dead``
cc ``@nikomatsakis`` ``@traviscross`` ``@compiler-errors``
const-eval: error when initializing a static writes to that static
Fixes https://github.com/rust-lang/rust/issues/142404 by also calling the relevant hook for writes, not just reads. To avoid erroring during the actual write of the initial value, we neuter the hook when popping the final stack frame.
Calling the hook during writes requires changing its signature since we cannot pass in the entire interpreter any more.
While doing this I also realized a gap in https://github.com/rust-lang/rust/pull/142575 for zero-sized copies on the read side, so I fixed that and added a test.
r? `@oli-obk`
Add tracing to `validate_operand`
This PR adds a tracing call to keep track of how much time is spent in `validate_operand` and `const_validate_operand`. Let me know if more fine-grained tracing is needed (e.g. adding tracing to `validate_operand_internal` too, which is just called from those two functions).
I also fixed the rustdoc of `validate_operand` and `const_validate_operand` since it was referencing an older name for the `val` parameter which was renamed in cbdcbf0d6a586792c5e0a0b8965a3179bac56120.
Here is some tracing output when Miri is run on `src/tools/miri/tests/pass/hello.rs`, visualizable in [ui.perfetto.dev](https://ui.perfetto.dev/): [trace-1750932222218210.json](https://github.com/user-attachments/files/20924000/trace-1750932222218210.json)
**Note: obtaining tracing output depends on https://github.com/rust-lang/miri/pull/4406, but this PR is standalone and can be merged without waiting for https://github.com/rust-lang/miri/pull/4406.**
r? `@RalfJung`
Similar to the existing nullpointer and alignment checks, this checks
for valid enum discriminants on creation of enums through unsafe
transmutes. Essentially this sanitizes patterns like the following:
```rust
let val: MyEnum = unsafe { std::mem::transmute<u32, MyEnum>(42) };
```
An extension of this check will be done in a follow-up that explicitly
sanitizes for extern enum values that come into Rust from e.g. C/C++.
This check is similar to Miri's capabilities of checking for valid
construction of enum values.
This PR is inspired by saethlin@'s PR
https://github.com/rust-lang/rust/pull/104862. Thank you so much for
keeping this code up and the detailed comments!
I also pair-programmed large parts of this together with vabr-g@.
make `tidy-alphabetical` use a natural sort
The idea here is that these lines should be correctly sorted, even though a naive string comparison would say they are not:
```
foo2
foo10
```
This is the ["natural sort order"](https://en.wikipedia.org/wiki/Natural_sort_order).
There is more discussion in [#t-compiler/help > tidy natural sort](https://rust-lang.zulipchat.com/#narrow/channel/182449-t-compiler.2Fhelp/topic/tidy.20natural.20sort/with/519111079)
Unfortunately, no standard sorting tools are smart enough to to this automatically (casting some doubt on whether we should make this change). Here are some sort outputs:
```
> cat foo.txt | sort
foo
foo1
foo10
foo2
mp
mp1e2
np",
np1e2",
> cat foo.txt | sort -n
foo
foo1
foo10
foo2
mp
mp1e2
np",
np1e2",
> cat foo.txt | sort -V
foo
foo1
foo2
foo10
mp
mp1e2
np1e2",
np",
```
Disappointingly, "numeric" sort does not actually have the behavior we want. It only sorts by numeric value if the line starts with a number. The "version" sort looks promising, but does something very unintuitive if you look at the final 4 values. None of the other options seem to have the desired behavior in all cases:
```
-b, --ignore-leading-blanks ignore leading blanks
-d, --dictionary-order consider only blanks and alphanumeric characters
-f, --ignore-case fold lower case to upper case characters
-g, --general-numeric-sort compare according to general numerical value
-i, --ignore-nonprinting consider only printable characters
-M, --month-sort compare (unknown) < 'JAN' < ... < 'DEC'
-h, --human-numeric-sort compare human readable numbers (e.g., 2K 1G)
-n, --numeric-sort compare according to string numerical value
-R, --random-sort shuffle, but group identical keys. See shuf(1)
--random-source=FILE get random bytes from FILE
-r, --reverse reverse the result of comparisons
--sort=WORD sort according to WORD:
general-numeric -g, human-numeric -h, month -M,
numeric -n, random -R, version -V
-V, --version-sort natural sort of (version) numbers within text
```
r? ```@Noratrieb``` (it sounded like you know this code?)