Various refactors to the LTO handling code
In particular, this reduces the sharing of code paths between fat and thin LTO and makes the fat LTO implementation more self-contained. It also moves some autodiff handling out of cg_ssa into cg_llvm, given that Enzyme only works with LLVM anyway and an implementation for another backend may do things entirely differently. This will also make it a bit easier to split LTO handling out of the coordinator thread's main loop into a separate loop, which should reduce the complexity of the coordinator thread.
Most uses of it contain either a fat or a thin LTO module. Only
`WorkItem::LTO` could contain both, but splitting that enum variant
doesn't complicate things much.
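To illustrate the shape of that split (the variant and payload names below are simplified placeholders, not the actual cg_ssa definitions):

```rs
// Before: a single variant has to carry either kind of LTO input.
enum WorkItemBefore {
    Optimize,
    LTO(LtoInput), // could wrap a fat *or* a thin module
}

// After: one variant per LTO flavor, so each code path only ever sees
// the kind of module it actually handles.
enum WorkItemAfter {
    Optimize,
    FatLto(FatLtoInput),
    ThinLto(ThinLtoInput),
}

// Placeholder payload types standing in for the real module data.
struct LtoInput;
struct FatLtoInput;
struct ThinLtoInput;
```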
There is no safety contract and I don't think any of them can actually
cause UB in more ways than passing malicious source code to rustc can.
While the docs on `LtoModuleCodegen::optimize` say that the returned
`ModuleCodegen` points into the LTO module, the LTO module has already
been dropped by the time this function returns, so if the returned
`ModuleCodegen` did indeed point into the LTO module, we would have
seen crashes on every LTO compilation, which we don't. As such, the
comment is outdated.
Autodiff flags
Interestingly, it seems that some other projects have conflicts with exactly the same LLVM optimization passes as autodiff.
At least `LLVMRustOptimize` has exactly the flags that we need to disable problematic opt passes.
This PR enables us to compile code where users differentiate two identical functions in the same module. This has been especially common in test cases, but it's not impossible to encounter in the wild.
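For illustration, a case of the kind that previously failed might look roughly like this. The attribute spelling and import are my assumption of the nightly `std::autodiff` syntax at the time, not taken from this PR:

```rs
#![feature(autodiff)]
use std::autodiff::autodiff;

// Two functions with identical bodies, both differentiated in the same
// module; this kind of duplication previously broke compilation.
// (Attribute syntax assumed, see the note above.)
#[autodiff(d_square1, Reverse, Duplicated, Active)]
fn square1(x: &f64) -> f64 {
    x * x
}

#[autodiff(d_square2, Reverse, Duplicated, Active)]
fn square2(x: &f64) -> f64 {
    x * x
}
```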
It also enables two new flags for testing/debugging. I'm considering writing an MCP to promote PrintPasses to a standalone -Z flag, since it is *not* the same as `-Z print-llvm-passes`, which IMHO gives less useful output. A discussion can be found here: [#t-compiler/llvm > Print llvm passes. @ 💬](https://rust-lang.zulipchat.com/#narrow/channel/187780-t-compiler.2Fllvm/topic/Print.20llvm.20passes.2E/near/511533038)
Finally, it improves `PrintModBefore` and `PrintModAfter`. They used to work reliably, but now we schedule Enzyme as part of an existing ModulePassManager (MPM), and since Enzyme is last in the MPM scheduling, `PrintModBefore` became very inaccurate. It used to print the input module that we gave to Enzyme, which was great for creating llvm-ir reproducers. Lately, however, the MPM would run the whole `default<O3>` pipeline, which heavily modifies the llvm module, before we pass it to Enzyme. That made it impossible to use the flag to create llvm-ir reproducers for Enzyme bugs. We now schedule a PrintModule pass just before Enzyme, solving this problem.
Based on the PrintPass output, it also _seems_ like changing `registerEnzymeAndPassPipeline(PB, true);` to `registerEnzymeAndPassPipeline(PB, false);` has no effect. In theory, the bool should tell Enzyme to schedule some helpful passes in the PassBuilder. However, since it doesn't do anything and I'm not 100% sure anymore on whether we really need it, I'll just disable it for now and postpone investigations.
r? ``@oli-obk``
closes #139471
Tracking:
- https://github.com/rust-lang/rust/issues/124509
Autodiff batching
Enzyme supports batching, which is especially known from the ML side when training neural networks.
There we would normally have a training loop where, in each iteration, we pass in some data (e.g. an image) and a target vector. Based on how close our prediction is, we compute our loss, and then use backpropagation to compute the gradients and update our weights.
That's quite inefficient, so what you normally do instead is pass in a batch of 8/16/... images and targets, and compute the gradients for all of them at once, which allows better optimizations.
Enzyme supports batching in two ways. The first one (which I implemented here) just accepts a batch size N,
and then each Dual/Duplicated argument has not one, but N shadow arguments. So instead of
```rs
for i in 0..100 {
    df(x[i], y[i], 1234);
}
```
You can now do
```rs
for i in (0..100).step_by(4) {
    df(x[i+0], x[i+1], x[i+2], x[i+3], y[i+0], y[i+1], y[i+2], y[i+3], 1234);
}
```
which will give the same results, but allows better compiler optimizations. See the testcase for details.
There is a second variant, where we can mark certain arguments and instead of having to pass in N shadow arguments, Enzyme assumes that the argument is N times longer. I.e. instead of accepting 4 slices with 12 floats each, we would accept one slice with 48 floats. I'll implement this over the next days.
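A rough sketch of the data-layout difference between the two variants (plain Rust, no Enzyme calls; the function names are made up, and the 4x12 vs 1x48 sizes are just the example numbers from above):

```rs
// First variant: one shadow argument per batch element, so the caller
// passes N separate slices of the original length.
fn shadows_per_element(d_x0: &[f32; 12], d_x1: &[f32; 12],
                       d_x2: &[f32; 12], d_x3: &[f32; 12]) {
    // ... each of the 4 shadow slices holds 12 floats
    let _ = (d_x0, d_x1, d_x2, d_x3);
}

// Second variant: a single argument that is N times longer, i.e. the
// 4 batches of 12 floats live back-to-back in one slice of 48.
fn one_wide_argument(d_x: &[f32; 48]) {
    let _ = d_x;
}
```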
I will also add more tests for both modes.
For anyone preferring a more interactive explanation, here's a video of Tim's llvm dev talk, where he presents his work. https://www.youtube.com/watch?v=edvaLAL5RqU
I'll also add some other docs to the dev guide and user docs in another PR.
r? ghost
Tracking:
- https://github.com/rust-lang/rust/issues/124509
- https://github.com/rust-lang/rust/issues/135283
Rename `is_like_osx` to `is_like_darwin`
Replace `is_like_osx` with `is_like_darwin`, which more closely describes reality (OS X is the pre-2016 name for macOS, and is by now quite outdated; Darwin is the overall name for the OS underlying Apple's macOS, iOS, etc.).
``@rustbot`` label O-apple
r? compiler
The embedded bitcode should always be prepared for LTO/ThinLTO
Fixes #115344. Fixes #117220.
There are currently two methods for generating the bitcode that is used for LTO. One method involves using `-C linker-plugin-lto` to emit object files as bitcode, which is the typical setting used by cargo. The other method is through `-C embed-bitcode=yes`.
When using `-C embed-bitcode=yes -C lto=no`, we run a complete non-LTO LLVM pipeline to obtain the bitcode, and that bitcode is then used for LTO. As a result, we run the Call Graph Profile Pass twice on the same module.
This PR is doing something similar to LLVM's `buildFatLTODefaultPipeline`, obtaining the bitcode for embedding after running `buildThinLTOPreLinkDefaultPipeline`.
r? nikic
`rustc_codegen_llvm` relied on `Deref` impls where `Deref::Target` was
or contained an extern type - in my experimental implementation of
rust-lang/rfcs#3729, this isn't possible as the `Target` associated
type's `?Sized` bound cannot be relaxed backwards compatibly (unless we
come up with some way of doing this).
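For context, the pattern that cg_llvm relied on looks roughly like this (a simplified sketch; `Value` and `BuilderValue` are stand-ins, not the actual cg_llvm definitions):

```rs
#![feature(extern_types)]
use std::ops::Deref;

unsafe extern "C" {
    // Opaque, unsized type mirroring an LLVM handle.
    type Value;
}

struct BuilderValue(*mut Value);

impl Deref for BuilderValue {
    // `Deref::Target` is `?Sized` today, so an extern type is accepted;
    // under the experimental rust-lang/rfcs#3729 implementation described
    // above, `Target` can no longer be (or contain) an extern type.
    type Target = Value;

    fn deref(&self) -> &Value {
        unsafe { &*self.0 }
    }
}
```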
In later pull requests with the rust-lang/rfcs#3729 implementation,
breakage like this could only occur for nightly users relying on the
`extern_types` feature.
Upstreaming this to avoid needing to keep carrying this patch locally,
and I think it'll necessarily need to change eventually.
Document some safety constraints and use more safe wrappers
Lots of unsafe `codegen_llvm` code has safe wrappers already, so I used some of them and added some where applicable. I stopped here because this diff is large enough and should probably be reviewed independently of other changes.
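The pattern being applied is the usual one: keep the raw binding unsafe, and state the invariant once in a safe wrapper instead of at every call site. A minimal sketch (the binding and types below are hypothetical placeholders, not actual cg_llvm items):

```rs
use std::ffi::c_void;

unsafe extern "C" {
    // Hypothetical raw binding; stands in for an LLVMRust* FFI function.
    fn LLVMRustExampleSetAlignment(value: *mut c_void, align: u32);
}

/// Opaque handle owned elsewhere; a `&ValueHandle` guarantees the wrapped
/// pointer is valid for the duration of the call.
struct ValueHandle(*mut c_void);

/// Safe wrapper: the safety argument lives here once, so call sites don't
/// need their own `unsafe` blocks.
fn set_alignment(value: &ValueHandle, align: u32) {
    unsafe { LLVMRustExampleSetAlignment(value.0, align) }
}
```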
cg_llvm: Reduce visibility of some items outside the `llvm` module
Next piece of #135502
This reduces the visibility of items (other than those in the `llvm` module) so that dead code analysis will correctly identify unused items.
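The idea, roughly: in a library crate, `pub` items count as part of the crate's external API, so the `dead_code` lint stays silent even when nothing in the crate uses them; narrowing them lets the lint fire. A tiny sketch with made-up names:

```rs
// In a library crate:
pub fn externally_visible_helper() {}   // never flagged, even if unused

pub(crate) fn crate_local_helper() {}   // unused => warning: function is never used
```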
See llvm/llvm-project#121851
For LLVM 20+, this function (`renameModuleForThinLTO`) has no return
value. For prior versions of LLVM, it never failed, but it had a
signature that allowed returning an error value, which callers were
handling.