Allow custom default address spaces and parse `p-` specifications in the datalayout string
Some targets, such as CHERI, use as default an address space different from the "normal" default address space `0` (in the case of CHERI, [200 is used](https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-877.pdf)). Currently, `rustc` does not allow to specify custom address spaces and does not take into consideration [`p-` specifications in the datalayout string](https://llvm.org/docs/LangRef.html#langref-datalayout).
This patch tries to mitigate these problems by allowing targets to define a custom default address space (while keeping the default value to address space `0`) and adding the code to parse the `p-` specifications in `rustc_abi`. The main changes are that `TargetDataLayout` now uses functions to refer to pointer-related informations, instead of having specific fields for the size and alignment of pointers in the default address space; furthermore, the two `pointer_size` and `pointer_align` fields in `TargetDataLayout` are replaced with an `FxHashMap` that holds info for all the possible address spaces, as parsed by the `p-` specifications.
The potential performance drawbacks of not having ad-hoc fields for the default address space will be tested in this PR's CI run.
r? workingjubilee
Fix short linker error output
This PR does 2 things:
- It removes the braces when there's a single element. This is required since brace expansion (at least in bash and zsh) only triggers if there's at least 2 elements.
- It removes the extra `.rlib` suffixes of the elements. See https://github.com/rust-lang/rust/pull/135707#discussion_r2185212393 for context.
Running `cargo +stage1 build` on the following program:
```rust
unsafe extern "C" {
fn foo() -> libc::c_int;
}
fn main() {
let x = unsafe { foo() } as u32;
// println!("{}", data_encoding::BASE64.encode(&x.to_le_bytes()));
}
```
Gives the following diff before and after the PR:
```diff
-/tmp/foo/target/debug/deps/{liblibc-faf416f178830595.rlib}.rlib
+/tmp/foo/target/debug/deps/liblibc-faf416f178830595.rlib
```
Running on the same program with the additional dependency, we get the following diff:
```diff
-/tmp/foo/target/debug/deps/{liblibc-faf416f178830595.rlib,libdata_encoding-84bb5aadfa9e8839.rlib}.rlib
+/tmp/foo/target/debug/deps/{liblibc-faf416f178830595,libdata_encoding-84bb5aadfa9e8839}.rlib
```
This PR does 2 things:
- It removes the braces when there's a single element. This is required since brace expansion (at
least in bash and zsh) only triggers if there's at least 2 elements.
- It removes the extra `.rlib` suffixes of the elements. See
https://github.com/rust-lang/rust/pull/135707#discussion_r2185212393 for context.
Running `cargo +stage1 build` on the following program:
```rust
unsafe extern "C" {
fn foo() -> libc::c_int;
}
fn main() {
let x = unsafe { foo() } as u32;
// println!("{}", data_encoding::BASE64.encode(&x.to_le_bytes()));
}
```
Gives the following diff before and after the PR:
```diff
-/tmp/foo/target/debug/deps/{liblibc-faf416f178830595.rlib}.rlib
+/tmp/foo/target/debug/deps/liblibc-faf416f178830595.rlib
```
Running on the same program with the additional dependency, we get the following diff:
```diff
-/tmp/foo/target/debug/deps/{liblibc-faf416f178830595.rlib,libdata_encoding-84bb5aadfa9e8839.rlib}.rlib
+/tmp/foo/target/debug/deps/{liblibc-faf416f178830595,libdata_encoding-84bb5aadfa9e8839}.rlib
```
Do we want to add a UI test?
Most uses of it either contain a fat or thin lto module. Only
WorkItem::LTO could contain both, but splitting that enum variant
doesn't complicate things much.
setup typos check in CI
This allows to check typos in CI, currently for compiler only (to reduce commit size with fixes). With current setup, exclude list is quite short, so it worth trying?
Also includes commits with actual typo fixes.
MCP: https://github.com/rust-lang/compiler-team/issues/817
typos check currently turned for:
* ./compiler
* ./library
* ./src/bootstrap
* ./src/librustdoc
After merging, PRs which enables checks for other crates (tools) can be implemented too.
Found typos will **not break** other jobs immediately: (tests, building compiler for perf run). Job will be marked as red on completion in ~ 20 secs, so you will not forget to fix it whenever you want, before merging pr.
Check typos: `python x.py test tidy --extra-checks=spellcheck`
Apply typo fixes: `python x.py test tidy --extra-checks=spellcheck:fix` (in case if there only 1 suggestion of each typo)
Current fail in this pr is expected and shows how typo errors emitted. Commit with error will be removed after r+.
fix bitcast of single-element SIMD vectors
in effect this reverts https://github.com/rust-lang/rust/pull/142768 and adds additional tests. That PR relaxed the conditions on an early return in an incorrect way that would create broken LLVM IR.
https://godbolt.org/z/PaaGWTv5a
```rust
#![feature(repr_simd)]
#[repr(simd)]
#[derive(Clone, Copy)]
struct S([i64; 1]);
#[no_mangle]
pub extern "C" fn single_element_simd(b: S) -> i64 {
unsafe { std::mem::transmute(b) }
}
```
at the time of writing generates this LLVM IR, where the type of the return is different from the function's return type.
```llvm
define noundef i64 ``````@single_element_simd(<1`````` x i64> %b) unnamed_addr {
start:
ret <1 x i64> %b
}
```
The test output is actually the same for the existing tests, showing that the change didn't actually matter for any tested behavior. It is probably a bit faster to do the early return, but, well, it's incorrect in general.
zullip thread: [#t-compiler > Is transmuting a `T` to `Tx1` (one-element SIMD vector) UB?](https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/Is.20transmuting.20a.20.60T.60.20to.20.60Tx1.60.20.28one-element.20SIMD.20vector.29.20UB.3F/with/526262799)
cc ``````@sayantn``````
r? ``````@scottmcm``````
inherit `#[align]` from trait method prototypes
````@workingjubilee```` this seems straightforward enough. Now that we're planning to make `-Cmin-function-alignment` a target modifier, I don't think there are any cross-crate complications here?
````@Jules-Bertholet```` is this the behavior you had in mind? In particular the inheritance of the attribute of a default impl is maybe a bit unintuitive at first? (but I think it's ok if that behavior is explicitly documented).
r? ghost
Improve documentation of `TagEncoding`
This PR is follow-up from the [discussion here](https://rust-lang.zulipchat.com/#narrow/channel/182449-t-compiler.2Fhelp/topic/.E2.9C.94.20VariantId.3DDiscriminant.20when.20tag.20is.20niche.20encoded.3F/with/524384295).
It aims at making the `TagEncoding` documentation less ambiguous and more detailed with references to relevant implementation sides. It especially clears up the ambiguous use of discriminant/variant index, which sparked the discussion referenced above.
PS: While working with layout data, I somehow ended up looking at the docs for `FakeBorrowKind` and noticed that the one example was not in a doc comment. I hope that this is minor enough of a fix for it to be okay in this otherwise unrelated PR.
Insert checks for enum discriminants when debug assertions are enabled
Similar to the existing null-pointer and alignment checks, this checks for valid enum discriminants on creation of enums through unsafe transmutes. Essentially this sanitizes patterns like the following:
```rust
let val: MyEnum = unsafe { std::mem::transmute<u32, MyEnum>(42) };
```
An extension of this check will be done in a follow-up that explicitly sanitizes for extern enum values that come into Rust from e.g. C/C++.
This check is similar to Miri's capabilities of checking for valid construction of enum values.
This PR is inspired by saethlin@'s PR
https://github.com/rust-lang/rust/pull/104862. Thank you so much for keeping this code up and the detailed comments!
I also pair-programmed large parts of this together with vabr-g@.
r? `@saethlin`