258 Commits

Author SHA1 Message Date
Stuart Cook
cd6f32a4eb
Rollup merge of #147134 - workingjubilee:remove-explicit-abialign-deref, r=Zalathar
remove explicit deref of AbiAlign for most methods

Much of the compiler calls functions on Align projected from AbiAlign. AbiAlign impls Deref to its inner Align, so we can simplify these away. Also, it will minimize disruption when AbiAlign is removed.

For now, preserve usages that might resolve to PartialOrd or PartialEq, as those have odd inference.
2025-09-29 15:44:55 +10:00
Stuart Cook
6c40c16d83
Rollup merge of #147116 - workingjubilee:remove-tdl-abialign, r=Zalathar
compiler: remove AbiAlign inside TargetDataLayout

AbiAlign is a thin wrapper around Align, extant mostly because we used to track a separate quasi-notion of alignment that was never a real notion of alignment and removing all of it at once was too churny. This PR maintains AbiAlign usage in public API and most of the compiler, but direct access of these fields for TargetDataLayout is now in terms of Align only.
2025-09-29 15:44:54 +10:00
Jubilee Young
0c9d0dfe04 remove explicit deref of AbiAlign for most methods
Much of the compiler calls functions on Align projected from AbiAlign.
AbiAlign impls Deref to its inner Align, so we can simplify these away.
Also, it will minimize disruption when AbiAlign is removed.

For now, preserve usages that might resolve to PartialOrd or PartialEq,
as those have odd inference.
2025-09-28 15:02:14 -07:00
Matthias Krüger
c29fb2e57e
Rollup merge of #144197 - KMJ-007:type-tree, r=ZuseZ4
TypeTree support in autodiff

# TypeTrees for Autodiff

## What are TypeTrees?
Memory layout descriptors for Enzyme. Tell Enzyme exactly how types are structured in memory so it can compute derivatives efficiently.

## Structure
```rust
TypeTree(Vec<Type>)

Type {
    offset: isize,  // byte offset (-1 = everywhere)
    size: usize,    // size in bytes
    kind: Kind,     // Float, Integer, Pointer, etc.
    child: TypeTree // nested structure
}
```

## Example: `fn compute(x: &f32, data: &[f32]) -> f32`

**Input 0: `x: &f32`**
```rust
TypeTree(vec![Type {
    offset: -1, size: 8, kind: Pointer,
    child: TypeTree(vec![Type {
        offset: -1, size: 4, kind: Float,
        child: TypeTree::new()
    }])
}])
```

**Input 1: `data: &[f32]`**
```rust
TypeTree(vec![Type {
    offset: -1, size: 8, kind: Pointer,
    child: TypeTree(vec![Type {
        offset: -1, size: 4, kind: Float,  // -1 = all elements
        child: TypeTree::new()
    }])
}])
```

**Output: `f32`**
```rust
TypeTree(vec![Type {
    offset: -1, size: 4, kind: Float,
    child: TypeTree::new()
}])
```

## Why Needed?
- Enzyme can't deduce complex type layouts from LLVM IR
- Prevents slow memory pattern analysis
- Enables correct derivative computation for nested structures
- Tells Enzyme which bytes are differentiable vs metadata

## What Enzyme Does With This Information:

Without TypeTrees (current state):
```llvm
; Enzyme sees generic LLVM IR:
define float ``@distance(ptr*`` %p1, ptr* %p2) {
; Has to guess what these pointers point to
; Slow analysis of all memory operations
; May miss optimization opportunities
}
```

With TypeTrees (our implementation):
```llvm
define "enzyme_type"="{[]:Float@float}" float ``@distance(``
    ptr "enzyme_type"="{[]:Pointer}" %p1,
    ptr "enzyme_type"="{[]:Pointer}" %p2
) {
; Enzyme knows exact type layout
; Can generate efficient derivative code directly
}
```

# TypeTrees - Offset and -1 Explained

## Type Structure

```rust
Type {
    offset: isize, // WHERE this type starts
    size: usize,   // HOW BIG this type is
    kind: Kind,    // WHAT KIND of data (Float, Int, Pointer)
    child: TypeTree // WHAT'S INSIDE (for pointers/containers)
}
```

## Offset Values

### Regular Offset (0, 4, 8, etc.)
**Specific byte position within a structure**

```rust
struct Point {
    x: f32, // offset 0, size 4
    y: f32, // offset 4, size 4
    id: i32, // offset 8, size 4
}
```

TypeTree for `&Point` (internal representation):
```rust
TypeTree(vec![
    Type { offset: 0, size: 4, kind: Float },   // x at byte 0
    Type { offset: 4, size: 4, kind: Float },   // y at byte 4
    Type { offset: 8, size: 4, kind: Integer }  // id at byte 8
])
```

Generates LLVM:
```llvm
"enzyme_type"="{[]:Float@float}"
```

### Offset -1 (Special: "Everywhere")
**Means "this pattern repeats for ALL elements"**

#### Example 1: Array `[f32; 100]`
```rust
TypeTree(vec![Type {
    offset: -1, // ALL positions
    size: 4,    // each f32 is 4 bytes
    kind: Float, // every element is float
}])
```

Instead of listing 100 separate Types with offsets `0,4,8,12...396`

#### Example 2: Slice `&[i32]`
```rust
// Pointer to slice data
TypeTree(vec![Type {
    offset: -1, size: 8, kind: Pointer,
    child: TypeTree(vec![Type {
        offset: -1, // ALL slice elements
        size: 4,    // each i32 is 4 bytes
        kind: Integer
    }])
}])
```

#### Example 3: Mixed Structure
```rust
struct Container {
    header: i64,        // offset 0
    data: [f32; 1000],  // offset 8, but elements use -1
}
```

```rust
TypeTree(vec![
    Type { offset: 0, size: 8, kind: Integer }, // header
    Type { offset: 8, size: 4000, kind: Pointer,
        child: TypeTree(vec![Type {
            offset: -1, size: 4, kind: Float // ALL array elements
        }])
    }
])
```
2025-09-28 18:13:11 +02:00
Jubilee Young
b3f3e36c72 compiler: remove AbiAlign inside TargetDataLayout
This maintains AbiAlign usage in public API and most of the compiler,
but direct access of these fields is now in terms of Align only.
2025-09-27 22:13:53 -07:00
Ben Kimock
888679013d Add panic=immediate-abort 2025-09-21 13:12:18 -04:00
Karan Janthe
375e14ef49 Add TypeTree metadata attachment for autodiff
- Add F128 support to TypeTree Kind enum
  - Implement TypeTree FFI bindings and conversion functions
  - Add typetree.rs module for metadata attachment to LLVM functions
  - Integrate TypeTree generation with autodiff intrinsic pipeline
  - Support scalar types: f32, f64, integers, f16, f128
  - Attach enzyme_type attributes as LLVM string metadata for Enzyme

Signed-off-by: Karan Janthe <karanjanthe@gmail.com>
2025-09-19 04:02:19 +00:00
bors
97a987f14c Auto merge of #142544 - Sa4dUs:prevent-abi-changes, r=ZuseZ4
Prevent ABI changes affect EnzymeAD

This PR handles ABI changes for autodiff input arguments to improve Enzyme compatibility. Fundamentally this adjusts activities when a function argument is lowered as an `ScalarPair`, so there's no mismatch between diff activities and args. Also removes activities corresponding to ZSTs.

fixes: https://github.com/rust-lang/rust/issues/144025

r? `@ZuseZ4`
2025-09-18 07:32:49 +00:00
Marcelo Domínguez
e04567c363 Check ZST via PassMode 2025-09-17 13:58:17 +00:00
sayantn
62b4347e80
Add funnel_sh{l,r} functions and intrinsics
- Add a fallback implementation for the intrinsics
 - Add LLVM backend support for funnel shifts

Co-Authored-By: folkertdev <folkert@folkertdev.nl>
2025-09-03 14:13:24 +05:30
Folkert de Vries
b32f4d5792
remove an as cast in prefetch codegen 2025-08-21 11:28:10 +02:00
Folkert de Vries
d25910eaeb
make prefetch intrinsics safe 2025-08-20 00:35:42 +02:00
Marcelo Domínguez
250d77e5d7 Complete functionality and general cleanup 2025-08-14 16:30:15 +00:00
Marcelo Domínguez
5c631041aa Basic implementation of autodiff intrinsic 2025-08-14 16:29:58 +00:00
Tobias Decking
948c7952b8
Unify LLVM ctlz/cttz intrinsic generation 2025-07-25 17:56:10 +02:00
Edoardo Marangoni
93f1201c06
compiler: Parse p- specs in datalayout string, allow definition of custom default data address space 2025-07-07 09:04:53 +02:00
Urgau
51857ade80 Always use the pure Rust fallback instead of llvm.{maximum,minimum} 2025-07-03 21:04:18 +02:00
klensy
c76d032f01 setup CI and tidy to use typos for spellchecking and fix few typos 2025-07-03 10:51:06 +03:00
Guillaume Gomez
66ad1f2abf
Rollup merge of #142078 - sayantn:more-intrinsics, r=workingjubilee
Add SIMD funnel shift and round-to-even intrinsics

This PR adds 3 new SIMD intrinsics

 - `simd_funnel_shl` - funnel shift left
 - `simd_funnel_shr` - funnel shift right
 - `simd_round_ties_even` (vector version of `round_ties_even_fN`)

TODO (future PR): implement `simd_fsh{l,r}` in miri, cg_gcc and cg_clif (it is surprisingly hard to implement without branches, the common tricks that rotate uses doesn't work because we have 2 elements now. e.g, the `-n&31` trick used by cg_gcc to implement rotate doesn't work with this because then `fshl(a, b, 0)` will be `a | b`)

[#t-compiler > More SIMD intrinsics](https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/More.20SIMD.20intrinsics/with/522130286)

`@rustbot` label T-compiler T-libs A-intrinsics F-core_intrinsics
r? `@workingjubilee`
2025-06-29 12:29:53 +02:00
sayantn
a9500d6b0b
Correctly account for different address spaces in LLVM intrinsic invocations 2025-06-15 22:45:26 +05:30
sayantn
9415f3d8a6
Use LLVMIntrinsicGetDeclaration to completely remove the hardcoded intrinsics list 2025-06-15 22:15:16 +05:30
sayantn
2038405ff7
Add simd_funnel_sh{l,r} and simd_round_ties_even 2025-06-15 04:33:41 +05:30
sayantn
d56fcd968d
Simplify implementation of Rust intrinsics by using type parameters in the cache 2025-06-12 00:32:42 +05:30
bjorn3
2e8401ae5f Remove type_test from IntrinsicCallBuilderMethods
It is only used within cg_llvm.
2025-06-03 10:00:56 +00:00
bjorn3
284bec5428 Directly use from_immediate for handling bool 2025-05-30 10:12:57 +00:00
bjorn3
0fcea3db28 Avoid computing function type for intrinsic instances 2025-05-30 10:12:18 +00:00
bjorn3
38a6daeb23 Use layout field of OperandRef in generic_simd_intrinsic 2025-05-30 10:12:18 +00:00
bjorn3
1f717ae778 Use layout field of OperandRef and PlaceRef in codegen_intrinsic_call
This avoids having to get the function signature.
2025-05-30 10:12:16 +00:00
bjorn3
0a14e1b2e7 Remove usage of FnAbi in codegen_intrinsic_call 2025-05-26 10:13:03 +00:00
bjorn3
6016f84e71 Pass PlaceRef rather than Bx::Value to codegen_intrinsic_call 2025-05-26 10:13:03 +00:00
Urgau
7f0ae5e3ad Use the fallback body for {minimum,maximum}f128 on LLVM as well. 2025-05-10 17:34:54 +02:00
Urgau
e7247df590 Use intrinsics for {f16,f32,f64,f128}::{minimum,maximum} operations 2025-05-09 17:11:23 +02:00
Michael Goulet
833c212b81 Rename Instance::new to Instance::new_raw and add a note that it is raw 2025-05-05 13:17:35 +00:00
Chris Denton
d15c603173
Rollup merge of #137953 - RalfJung:simd-intrinsic-masks, r=WaffleLapkin
simd intrinsics with mask: accept unsigned integer masks, and fix some of the errors

It's not clear at all why the mask would have to be signed, it is anyway interpreted bitwise. The backend should just make sure that works no matter the surface-level type; our LLVM backend already does this correctly. The note of "the mask may be widened, which only has the correct behavior for signed integers" explains... nothing? Why can't the code do the widening correctly? If necessary, just cast to the signed type first...

Also while we are at it, fix the errors. For simd_masked_load/store, the errors talked about the "third argument" but they meant the first argument (the mask is the first argument there). They also used the wrong type for `expected_element`.

I have extremely low confidence in the GCC part of this PR.

See [discussion on Zulip](https://rust-lang.zulipchat.com/#narrow/channel/257879-project-portable-simd/topic/On.20the.20sign.20of.20masks)
2025-04-20 13:02:48 +00:00
Ralf Jung
566dfd1a0d simd intrinsics with mask: accept unsigned integer masks 2025-04-20 12:25:27 +02:00
Stuart Cook
45ebc4060b
Rollup merge of #137447 - folkertdev:simd-extract-insert-dyn, r=scottmcm
add `core::intrinsics::simd::{simd_extract_dyn, simd_insert_dyn}`

fixes https://github.com/rust-lang/rust/issues/137372

adds `core::intrinsics::simd::{simd_extract_dyn, simd_insert_dyn}`, which contrary to their non-dyn counterparts allow a non-const index. Many platforms (but notably not x86_64 or aarch64) have dedicated instructions for this operation, which stdarch can emit with this change.

Future work is to also make the `Index` operation on the `Simd` type emit this operation, but the intrinsic can't be used directly. We'll need some MIR shenanigans for that.

r? `@ghost`
2025-04-11 13:31:43 +10:00
Folkert de Vries
59c55339af
add simd_insert_dyn and simd_extract_dyn 2025-04-10 21:22:07 +02:00
bjorn3
b754ef727c Remove implicit #[no_mangle] for #[rustc_std_internal_symbol] 2025-03-17 14:08:09 +00:00
Matthias Krüger
63c548d82c
Rollup merge of #137549 - oli-obk:llvm-ffi, r=davidtwco
Clean up various LLVM FFI things in codegen_llvm

cc ```@ZuseZ4``` I touched some autodiff parts

The major change of this PR is [bfd88ce](bfd88cead0) which makes `CodegenCx` generic just like `GenericBuilder`

The other commits mostly took advantage of the new feature of making extern functions safe, but also just used some wrappers that were already there and shrunk unsafe blocks.

best reviewed commit-by-commit
2025-03-07 19:15:34 +01:00
Ralf Jung
aac65f562b rename BackendRepr::Vector → SimdVector 2025-02-28 17:17:45 +01:00
León Orell Valerian Liehr
1511ccd6f8
Rollup merge of #137595 - folkertdev:remove-simd-pow-powi, r=RalfJung
remove `simd_fpow` and `simd_fpowi`

Discussed in https://github.com/rust-lang/rust/issues/137555

These functions are not exposed from `std::intrinsics::simd`, and not used anywhere outside of the compiler. They also don't lower to particularly good code at least on the major ISAs (I checked x86_64, aarch64, s390x, powerpc), where the vector is just spilled to the stack and scalar functions are used for the actual logic.

r? `@RalfJung`
2025-02-25 13:07:40 +01:00
Folkert de Vries
60a268998c
remove simd_fpow and simd_fpowi 2025-02-25 09:20:10 +01:00
Ralf Jung
0362775fb5 rename simd_shuffle_generic → simd_shuffle_const_generic 2025-02-24 19:13:23 +01:00
Oli Scherer
3565603d25 Use a safe wrapper around an LLVM FFI function 2025-02-24 15:11:29 +00:00
Trevor Gross
a2bb4d748d
Rollup merge of #136543 - RalfJung:round-ties-even, r=tgross35
intrinsics: unify rint, roundeven, nearbyint in a single round_ties_even intrinsic

LLVM has three intrinsics here that all do the same thing (when used in the default FP environment). There's no reason Rust needs to copy that historically-grown mess -- let's just have one intrinsic and leave it up to the LLVM backend to decide how to lower that.

Suggested by `@hanna-kruppe` in https://github.com/rust-lang/rust/issues/136459; Cc `@tgross35`

try-job: test-various
2025-02-23 14:30:25 -05:00
Zachary S
7ba3d7b54e Remove BackendRepr::Uninhabited, replaced with an uninhabited: bool field in LayoutData.
Also update comments that refered to BackendRepr::Uninhabited.
2025-02-20 13:27:32 -06:00
Jubilee Young
2d2de18166 compiler: Stop reexporting stuff in cg_llvm::abi
The reexports confuse tooling like rustdoc into thinking cg_llvm is
the source of key types that originate in rustc_target.
2025-02-18 00:31:29 -08:00
Oli Scherer
dcf1e4d72b Document some safety constraints and use more safe wrappers 2025-02-11 09:47:13 +00:00
bjorn3
1fcae03369 Rustfmt 2025-02-08 22:12:13 +00:00
Ralf Jung
04e7a10af6 intrinsics: unify rint, roundeven, nearbyint in a single round_ties_even intrinsic 2025-02-04 16:27:29 +01:00