393 Commits

Author SHA1 Message Date
Zalathar
81ed042c8c coverage: Remove all unstable support for MC/DC instrumentation 2025-08-06 22:38:52 +10:00
Stuart Cook
8628b78f24
Rollup merge of #144232 - xacrimon:explicit-tail-call, r=WaffleLapkin
Implement support for `become` and explicit tail call codegen for the LLVM backend

This PR implements codegen of explicit tail calls via `become` in `rustc_codegen_ssa` and support within the LLVM backend. Completes a task on (https://github.com/rust-lang/rust/issues/112788). This PR implements all the necessary bits to make explicit tail calls usable, other backends have received stubs for now and will ICE if you use `become` on them. I suspect there is some bikeshedding to be done on how we should go about implementing this for other backends, but it should be relatively straightforward for GCC after this is merged.

During development I also put together a POC bytecode VM based on tail call dispatch to test these changes out and analyze the codegen to make sure it generates expected assembly. That is available [here](https://github.com/xacrimon/tcvm).
2025-07-31 15:42:00 +10:00
Joel Wejdenstål
a448837045
Implement support for explicit tail calls in the MIR block builders and the LLVM codegen backend. 2025-07-26 01:02:29 +02:00
bjorn3
fe2eeabe27 Use the object crate rather than LLVM for extracting bitcode sections 2025-07-25 11:21:28 +00:00
许杰友 Jieyou Xu (Joe)
5e3eb25125
Rollup merge of #142097 - ZuseZ4:offload-host1, r=oli-obk
gpu offload host code generation

r? ghost

This will generate most of the host side code to use llvm's offload feature.
The first PR will only handle automatic mem-transfers to and from the device.
So if a user calls a kernel, we will copy inputs back and forth, but we won't do the actual kernel launch.
Before merging, we will use LLVM's Info infrastructure to verify that the memcopies match what openmp offloa generates in C++. `LIBOMPTARGET_INFO=-1 ./my_rust_binary` should print that a memcpy to and later from the device is happening.

A follow-up PR will generate the actual device-side kernel which will then do computations on the GPU.
A third PR will implement manual host2device and device2host functionality, but the goal is to minimize cases where a user has to overwrite our default handling due to performance issues.

I'm trying to get a full MVP out first, so this just recognizes GPU functions based on magic names. The final frontend will obviously move this over to use proper macros, like I'm already doing it for the autodiff work.
This work will also be compatible with std::autodiff, so one can differentiate GPU kernels.

Tracking:
- https://github.com/rust-lang/rust/issues/131513
2025-07-22 00:54:24 +08:00
Manuel Drehwald
5958ebe829 add various wrappers for gpu code generation 2025-07-18 16:24:12 -07:00
Nikita Popov
12b19be741 Pass wasm exception model to TargetOptions
This is no longer implied by -wasm-enable-eh.
2025-07-18 09:35:50 +02:00
Oli Scherer
e574fef728 Shrink some unsafe blocks in cg_llvm 2025-07-14 08:27:08 +00:00
Oli Scherer
d3d51b4fdb Avoid a bunch of unnecessary unsafe blocks in cg_llvm 2025-07-14 08:27:08 +00:00
bors
855e0fe46e Auto merge of #142911 - mejrs:unsized, r=compiler-errors
Remove support for dynamic allocas

Followup to rust-lang/rust#141811
2025-07-11 05:27:32 +00:00
Trevor Gross
6e3d017b2f
Rollup merge of #143722 - oli-obk:sound-llvm, r=dianqk
Make some "safe" llvm ops actually sound

Noticed while doing other refactorings

it may cause some extra unnecessary allocations, but the current use sites are rare ones anyway
2025-07-10 20:20:39 -04:00
Oli Scherer
84eeca2e2f Make some "safe" llvm ops actually sound 2025-07-10 07:27:41 +00:00
Dillon Amburgey
0455577974 fix: correct parameter names in LLVMRustBuildMinNum and LLVMRustBuildMaxNum FFI declarations 2025-07-08 06:24:19 -05:00
mejrs
49421d1fa3 Remove support for dynamic allocas 2025-07-07 23:04:06 +02:00
Yotam Ofek
3b48407f93 Remove unused allow attrs 2025-07-07 12:58:16 +00:00
klensy
c76d032f01 setup CI and tidy to use typos for spellchecking and fix few typos 2025-07-03 10:51:06 +03:00
Jana Dönszelmann
69b11c64eb
Rollup merge of #142809 - KMJ-007:ad-type-analysis-flag, r=ZuseZ4
Add PrintTAFn flag for targeted type analysis printing

## Summary
This PR adds a new `PrintTAFn` flag to the `-Z autodiff` option that allows printing type analysis information for a specific function, rather than all functions.

## Changes

### New Flag
- Added `PrintTAFn=<function_name>` option to `-Z autodiff`
- Usage: `-Z autodiff=Enable,PrintTAFn=my_function_name`

### Implementation Details
- **Rust side**: Added `PrintTAFn(String)` variant to `AutoDiff` enum
- **Parser**: Updated `parse_autodiff` to handle `PrintTAFn=<function_name>` syntax with proper error handling
- **FFI**: Added `set_print_type_fun` function to interface with Enzyme's `FunctionToAnalyze` command line option
- **Documentation**: Updated help text and documentation for the new flag

### Files Modified
- `compiler/rustc_session/src/config.rs`: Added `PrintTAFn(String)` variant
- `compiler/rustc_session/src/options.rs`: Updated parser and help text (now shows `PrintTAFn` in the list)
- `compiler/rustc_codegen_llvm/src/llvm/enzyme_ffi.rs`: Added FFI function and static variable
- `compiler/rustc_codegen_llvm/src/back/lto.rs`: Added handling for new flag
- `src/doc/rustc-dev-guide/src/autodiff/flags.md`: Updated documentation
- `src/doc/unstable-book/src/compiler-flags/autodiff.md`: Updated documentation

## Testing
The flag can be tested with:
```bash
rustc +enzyme -Z autodiff=Enable,PrintTAFn=square test.rs
```

This will print type analysis information only for the function named "square" instead of all functions.

## Error Handling
The parser includes proper error handling:
- Missing argument: `PrintTAFn` without `=<function_name>` will show an error
- Unknown options: Invalid autodiff options will be reported

r? ```@ZuseZ4```
2025-06-25 22:14:55 +02:00
Karan Janthe
7b1c89f2b5 added PrintTAFn flag for autodiff
Signed-off-by: Karan Janthe <karanjanthe@gmail.com>
2025-06-25 02:11:29 +00:00
sayantn
9415f3d8a6
Use LLVMIntrinsicGetDeclaration to completely remove the hardcoded intrinsics list 2025-06-15 22:15:16 +05:30
sayantn
d56fcd968d
Simplify implementation of Rust intrinsics by using type parameters in the cache 2025-06-12 00:32:42 +05:30
Ralf Jung
a387c86a92 get rid of rustc_codegen_ssa::common::AtomicOrdering 2025-05-28 22:57:55 +02:00
bors
1a7f290a9a Auto merge of #140914 - Zalathar:asm-bindings, r=compiler-errors
cg_llvm: Clean up some inline assembly bindings

This PR combines a few loosely-related cleanups to LLVM bindings related to inline assembly. These include:
- Replacing `LLVMRustInlineAsm` with LLVM-C's `LLVMGetInlineAsm`
- Adjusting FFI declarations to avoid the need for explicit `as_c_char_ptr` conversions
- Flattening control flow in `inline_asm_call`

There should be no functional changes.
2025-05-12 17:39:21 +00:00
Zalathar
dbdbde2a72 Rename OperandBundleOwned to OperandBundleBox
As with `DIBuilderBox`, the "Box" suffix does a better job of communicating
that this is an owning pointer to some borrowable resource.

This also renames the `raw` method to `as_ref`, which is what it would have
been named originally if the `Deref` problem had been known at the time.
2025-05-11 21:21:38 +10:00
Zalathar
b6300294a8 Make LLVMRustInlineAsmVerify take *const c_uchar
This avoids the need for an explicit `as_c_char_ptr` conversion.
2025-05-11 14:38:42 +10:00
Zalathar
b1094f6a0a Add a safe wrapper for LLVMAppendModuleInlineAsm
This patch also changes the Rust-side declaration to take `*const c_uchar`
instead of `*const c_char`, to avoid the need for `AsCCharPtr`.
2025-05-11 14:38:42 +10:00
Zalathar
d1bb310a7a Use LLVMGetInlineAsm
This LLVM-C binding replaces the existing `LLVMRustInlineAsm` function.
2025-05-11 14:37:54 +10:00
Zalathar
8764ecd0c1 Add a searchable tag PTR_LEN_STR to explain *const c_uchar bindings
This module comment describes why it's OK for LLVM bindings to declare a
parameter type of `*const c_uchar` for pointer/length strings, even though the
corresponding parameter on the C/C++ side uses `const char *`.

Adding a searchable term to each such parameter should make it easier for
future maintainers to understand why `*const c_uchar` is being used instead of
`*const c_char`.
2025-05-11 14:26:14 +10:00
Ralf Jung
79dfd0a472 remove 'unordered' atomic intrinsics 2025-05-09 17:39:52 +02:00
bit-aloo
7018392337
remove noinline attribute and add alwaysinline after AD pass 2025-04-28 21:10:32 +05:30
bit-aloo
f319dd909e
add llvm wrappers and corresponding methods in attribute 2025-04-25 11:09:52 +05:30
Manuel Drehwald
75f86e6e2e fix LooseTypes flag and PrintMod behaviour, add debug helper 2025-04-12 01:36:44 -04:00
Stuart Cook
c6bf3a01ef
Rollup merge of #137880 - EnzymeAD:autodiff-batching, r=oli-obk
Autodiff batching

Enzyme supports batching, which is especially known from the ML side when training neural networks.
There we would normally have a training loop, where in each iteration we would pass in some data (e.g. an image), and a target vector. Based on how close we are with our prediction we compute our loss, and then use backpropagation to compute the gradients and update our weights.
That's quite inefficient, so what you normally do is passing in a batch of 8/16/.. images and targets, and compute the gradients for those all at once, allowing better optimizations.

Enzyme supports batching in two ways, the first one (which I implemented here) just accepts a Batch size,
and then each Dual/Duplicated argument has not one, but N shadow arguments.  So instead of
```rs
for i in 0..100 {
   df(x[i], y[i], 1234);
}
```
You can now do
```rs
for i in 0..100.step_by(4) {
   df(x[i+0],x[i+1],x[i+2],x[i+3], y[i+0], y[i+1], y[i+2], y[i+3], 1234);
}
```
which will give the same results, but allows better compiler optimizations. See the testcase for details.

There is a second variant, where we can mark certain arguments and instead of having to pass in N shadow arguments, Enzyme assumes that the argument is N times longer. I.e. instead of accepting 4 slices with 12 floats each, we would accept one slice with 48 floats. I'll implement this over the next days.

I will also add more tests for both modes.

For any one preferring some more interactive explanation, here's a video of Tim's llvm dev talk, where he presents his work. https://www.youtube.com/watch?v=edvaLAL5RqU
I'll also add some other docs to the dev guide and user docs in another PR.

r? ghost

Tracking:

- https://github.com/rust-lang/rust/issues/124509
- https://github.com/rust-lang/rust/issues/135283
2025-04-05 13:18:13 +11:00
Manuel Drehwald
b7c63a973f add autodiff batching backend 2025-04-04 14:24:23 -04:00
Daniel Paoliello
79b9664091 Reduce visibility of most items in rustc_codegen_llvm 2025-03-25 16:36:47 +11:00
Zalathar
d07ef5b0e1 coverage: Add LLVM plumbing for expansion regions
This is currently unused, but paves the way for future work on expansion
regions without having to worry about the FFI parts.
2025-03-20 12:40:36 +11:00
Matthias Krüger
63c548d82c
Rollup merge of #137549 - oli-obk:llvm-ffi, r=davidtwco
Clean up various LLVM FFI things in codegen_llvm

cc ```@ZuseZ4``` I touched some autodiff parts

The major change of this PR is [bfd88ce](bfd88cead0) which makes `CodegenCx` generic just like `GenericBuilder`

The other commits mostly took advantage of the new feature of making extern functions safe, but also just used some wrappers that were already there and shrunk unsafe blocks.

best reviewed commit-by-commit
2025-03-07 19:15:34 +01:00
Michael Goulet
a59a8f9e75 Revert "Auto merge of #135335 - oli-obk:push-zxwssomxxtnq, r=saethlin"
This reverts commit a7a6c64a657f68113301c2ffe0745b49a16442d1, reversing
changes made to ebbe63891f1fae21734cb97f2f863b08b1d44bf8.
2025-03-02 18:52:48 +00:00
bors
0c72c0d11a Auto merge of #133250 - DianQK:embed-bitcode-pgo, r=nikic
The embedded bitcode should always be prepared for LTO/ThinLTO

Fixes #115344. Fixes #117220.

There are currently two methods for generating bitcode that used for LTO. One method involves using `-C linker-plugin-lto` to emit object files as bitcode, which is the typical setting used by cargo. The other method is through `-C embed-bitcode=yes`.

When using with `-C embed-bitcode=yes -C lto=no`, we run a complete non-LTO LLVM pipeline to obtain bitcode, then the bitcode is used for LTO. We run the Call Graph Profile Pass twice on the same module.

This PR is doing something similar to LLVM's `buildFatLTODefaultPipeline`, obtaining the bitcode for embedding after running `buildThinLTOPreLinkDefaultPipeline`.

r? nikic
2025-03-01 08:22:18 +00:00
许杰友 Jieyou Xu (Joe)
d65f568302
Rollup merge of #137713 - vayunbiyani:fix-enzyme-build-errors, r=oli-obk
Fix enzyme build errors

After [this PR](https://github.com/rust-lang/rust/pull/136428) was merged, I switched to master and attempted building `./x.py build --stage 1 library` with the config mentioned in the enzyme rustbook but it resulted in some errors tho the config.example.toml build succeeded
The errors were re:
### 1. Use of ref in match patterns
The errors were related to match ergonomics in Rust 2024, where ref is no longer needed when matching on references. Examples:
```
error: binding modifiers may only be written when the default binding mode is `move`
   --> compiler/rustc_builtin_macros/src/autodiff.rs:136:31
    |
136 |             Annotatable::Item(ref iitem) => {
    |                               ^^^ binding modifier not allowed under `ref` default binding mode
    |
    = note: for more information, see <https://doc.rust-lang.org/nightly/edition-guide/rust-2024/match-ergonomics.html>
note: matching on a reference type with a non-reference pattern changes the default binding mode
   --> compiler/rustc_builtin_macros/src/autodiff.rs:136:13
    |
136 |             Annotatable::Item(ref iitem) => {
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this matches on type `&_`
help: remove the unnecessary binding modifier
    |
136 -             Annotatable::Item(ref iitem) => {
136 +             Annotatable::Item(iitem) => {
    |

error: binding modifiers may only be written when the default binding mode is `move`
   --> compiler/rustc_builtin_macros/src/autodiff.rs:146:36
    |
146 |             Annotatable::AssocItem(ref assoc_item, _) => {
    |                                    ^^^ binding modifier not allowed under `ref` default binding mode
    |
    = note: for more information, see <https://doc.rust-lang.org/nightly/edition-guide/rust-2024/match-ergonomics.html>
note: matching on a reference type with a non-reference pattern changes the default binding mode
   --> compiler/rustc_builtin_macros/src/autodiff.rs:146:13
    |
146 |             Annotatable::AssocItem(ref assoc_item, _) => {
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this matches on type `&_`
help: remove the unnecessary binding modifier
    |
146 -             Annotatable::AssocItem(ref assoc_item, _) => {
146 +             Annotatable::AssocItem(assoc_item, _) => {
    |

error: binding modifiers may only be written when the default binding mode is `move`
   --> compiler/rustc_builtin_macros/src/autodiff.rs:174:31
    |
174 | ...       Annotatable::Item(ref iitem) => (iitem.vis.clone(), iitem.ide...
    |                             ^^^ binding modifier not allowed under `ref` default binding mode
    |
    = note: for more information, see <https://doc.rust-lang.org/nightly/edition-guide/rust-2024/match-ergonomics.html>
note: matching on a reference type with a non-reference pattern changes the default binding mode
   --> compiler/rustc_builtin_macros/src/autodiff.rs:174:13
    |
174 | ...   Annotatable::Item(ref iitem) => (iitem.vis.clone(), iitem.ident.c...
    |       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this matches on type `&_`
help: remove the unnecessary binding modifier
    |
174 -             Annotatable::Item(ref iitem) => (iitem.vis.clone(), iitem.ident.clone()),
174 +             Annotatable::Item(iitem) => (iitem.vis.clone(), iitem.ident.clone()),
    |

error: binding modifiers may only be written when the default binding mode is `move`
   --> compiler/rustc_builtin_macros/src/autodiff.rs:175:36
    |
175 |             Annotatable::AssocItem(ref assoc_item, _) => {
    |                                    ^^^ binding modifier not allowed under `ref` default binding mode
    |
    = note: for more information, see <https://doc.rust-lang.org/nightly/edition-guide/rust-2024/match-ergonomics.html>
note: matching on a reference type with a non-reference pattern changes the default binding mode
   --> compiler/rustc_builtin_macros/src/autodiff.rs:175:13
    |
175 |             Annotatable::AssocItem(ref assoc_item, _) => {
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ this matches on type `&_`
help: remove the unnecessary binding modifier
    |
175 -             Annotatable::AssocItem(ref assoc_item, _) => {
175 +             Annotatable::AssocItem(assoc_item, _) => {
    |

error: could not compile `rustc_builtin_macros` (lib) due to 4 previous errors
warning: build failed, waiting for other jobs to finish...
Build completed unsuccessfully in 0:19:39
```
### 2. the use of external C blocks without unsafe in compiler/rustc_codegen_llvm/src/llvm/enzyme_ffi.rs (I don't have the error message handy)

The first commit fixes the errors above

---
## Additional Improvement:
`@ZuseZ4` suggested we consolidate the variants under `#[cfg(llvm_enzyme)]` and `#[cfg(not(llvm_enzyme))]` by conditionally checking for `cfg!(llvm_enzyme)` instead. This way,  the autodiff code is compiled but not executed avoiding such regressions

r? `@ZuseZ4`
cc: `@oli-obk`
2025-02-28 21:42:01 +08:00
Vayun Biyani
cb53e97870 Fix enzyme build errors 2025-02-25 17:25:50 +05:30
Oli Scherer
553828c6f4 Mark more LLVM FFI as safe 2025-02-24 15:11:29 +00:00
Oli Scherer
3565603d25 Use a safe wrapper around an LLVM FFI function 2025-02-24 15:11:29 +00:00
Oli Scherer
396baa750e Make allocator shim creation mostly use safe code 2025-02-24 15:11:29 +00:00
Oli Scherer
a54bfcf52b Use safe FFI for various functions in codegen_llvm 2025-02-24 15:05:56 +00:00
David Wood
a5615d3c62
codegen_llvm: avoid Deref impls w/ extern type
`rustc_codegen_llvm` relied on `Deref` impls where `Deref::Target` was
or contained an extern type - in my experimental implementation of
rust-lang/rfcs#3729, this isn't possible as the `Target` associated
type's `?Sized` bound cannot be relaxed backwards compatibly (unless we
come up with some way of doing this).

In later pull requests with the rust-lang/rfcs#3729 implementation,
breakage like this could only occur for nightly users relying on the
`extern_types` feature.

Upstreaming this to avoid needing to keep carrying this patch locally,
and I think it'll necessarily need to change eventually.
2025-02-24 08:08:55 +00:00
bors
e0be1a0262 Auto merge of #137271 - nikic:gep-nuw-2, r=scottmcm
Emit getelementptr inbounds nuw for pointer::add()

Lower pointer::add (via intrinsic::offset with unsigned offset) to getelementptr inbounds nuw on LLVM versions that support it. This lets LLVM make use of the pre-condition that the offset addition does not wrap in an unsigned sense. Together with inbounds, this also implies that the offset is non-negative.

Fixes https://github.com/rust-lang/rust/issues/137217.
2025-02-24 03:06:16 +00:00
DianQK
1a99ca8da9
The embedded bitcode should always be prepared for LTO/ThinLTO 2025-02-23 21:23:36 +08:00
bors
15469f8f8a Auto merge of #137420 - matthiaskrgr:rollup-rr0q37f, r=matthiaskrgr
Rollup of 9 pull requests

Successful merges:

 - #136910 (Implement feature `isolate_most_least_significant_one` for integer types)
 - #137183 (Prune dead regionck code)
 - #137333 (Use `edition = "2024"` in the compiler (redux))
 - #137356 (Ferris 🦀 Identifier naming conventions)
 - #137362 (Add build step log for `run-make-support`)
 - #137377 (Always allow reusing cratenum in CrateLoader::load)
 - #137388 (Fix(lib/fs/tests): Disable rename POSIX semantics FS tests under Windows 7)
 - #137410 (Use StableHasher + Hash64 for dep_tracking_hash)
 - #137413 (jubilee cleared out the review queue)

r? `@ghost`
`@rustbot` modify labels: rollup
2025-02-22 13:32:44 +00:00
Manuel Drehwald
e2d250c3f6 update autodiff flags 2025-02-21 21:51:20 -05:00
Michael Goulet
e1819a889a Fix overcapturing, unsafe extern blocks, and new unsafe ops 2025-02-22 00:01:48 +00:00