### What does this PR try to resolve?
This PR adds Cargo integration for the new unstable `-Zembed-metadata`
rustc flag, which was implemented in
https://github.com/rust-lang/rust/pull/137535 ([tracking
issue](https://github.com/rust-lang/rust/issues/139165)). The new
behavior has to be enabled explicitly using a new unstable CLI flag
`-Zno-embed-metadata`.
The `-Zembed-metadata=no` rustc flag can reduce disk usage of compiled
artifacts, and also the size of Rust dynamic library artifacts shipped
to users. However, it is not enough to just pass this flag through
`RUSTFLAGS`; it needs to be integrated within Cargo, because it
interacts with how the `--emit` flag is passed to rustc, and also how
`--extern` args are passed to the final linked artifact build by Cargo.
Furthermore, using the flag for all crates in a crate graph compiled by
Cargo would be suboptimal (this will all be described below).
When you pass `-Zembed-metadata=no` to rustc, it will not store Rust
metadata into the compiled artifact. This is important when compiling
libs/rlibs/dylibs, since it reduces their size on disk. However, this
also means that every time we use this flag, we have to make sure that we
also:
- Include `metadata` in the `--emit` flag to generate a `.rmeta` file,
otherwise no metadata would be generated whatsoever, which would mean
that the artifact wouldn't be usable as a dependency.
- Also pass `--extern <dep>=<path>.rmeta` when compiling the final
linkable artifact. Before, Cargo would only pass `--extern
<dep>=<path>.[rlib|so|dll]`. Since with `-Zembed-metadata=no`, the
metadata is only in the `.rmeta` file and not in the rlib/dylib, this is
needed to help rustc find out where the metadata lies.
- Note: this essentially doubles the cmdline length when compiling the
final linked artifact. Not sure if that is a concern.
The two points above are what this PR implements, and why this rustc flag
needs Cargo integration.
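As a rough illustration of the two points above, here is a minimal sketch of how the `--emit` and `--extern` arguments change. The `CrateType` enum and the `splits_metadata`, `emit_args` and `extern_args` helpers are hypothetical names made up for this example, not Cargo's actual code, and the relative order of the two `--extern` arguments is an assumption.

```rust
/// Hypothetical, simplified crate-type model (Cargo's real types are richer).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum CrateType {
    Lib,
    Rlib,
    Dylib,
    Bin,
}

/// Only libs, rlibs and dylibs get `-Zembed-metadata=no`.
fn splits_metadata(crate_type: CrateType) -> bool {
    matches!(crate_type, CrateType::Lib | CrateType::Rlib | CrateType::Dylib)
}

/// Builds the `--emit` (and, if enabled, `-Zembed-metadata=no`) arguments.
fn emit_args(crate_type: CrateType, no_embed_metadata: bool) -> Vec<String> {
    let split = no_embed_metadata && splits_metadata(crate_type);
    // Rlibs already emit `.rmeta` for pipelining; other types need it only when split.
    let wants_rmeta = split || matches!(crate_type, CrateType::Lib | CrateType::Rlib);
    let emit = if wants_rmeta {
        "--emit=dep-info,metadata,link"
    } else {
        "--emit=dep-info,link"
    };
    let mut args = vec![emit.to_string()];
    if split {
        args.push("-Zembed-metadata=no".to_string());
    }
    args
}

/// Builds the `--extern` arguments for one dependency of the final linked artifact.
/// With split metadata, the `.rmeta` path is passed in addition to the rlib/dylib path.
fn extern_args(name: &str, artifact: &str, rmeta: &str, split: bool) -> Vec<String> {
    let mut args = vec!["--extern".to_string(), format!("{name}={artifact}")];
    if split {
        args.push("--extern".to_string());
        args.push(format!("{name}={rmeta}"));
    }
    args
}
```

For example, `extern_args("foo", "libfoo.rlib", "libfoo.rmeta", true)` yields two `--extern` pairs instead of one, which is where the doubled command-line length mentioned above comes from.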
The `-Zembed-metadata` flag is only passed to libs, rlibs and dylibs. It
does not seem to make sense for other crate types. The one situation
where it might make sense is proc macros, but according to @bjorn3 (who
initially came up with the idea for `-Zembed-metadata`), it isn't really
worth it.
Here is a table that summarizes the changes in passed flags and
generated files on disk for rlibs and dylibs:
| **Crate type** | **Flags** | **Generated files** | **Disk usage** |
|--|--|--|--|
| Rlib/Lib (before) | `--emit=dep-info,metadata,link` | `.rlib` (with
metadata), `.rmeta` (for pipelining) | - |
| Rlib/Lib (after) | `--emit=dep-info,metadata,link -Zembed-metadata=no`
| `.rlib` (without metadata), `.rmeta` (for metadata/pipelining) |
Reduced (metadata no longer duplicated) |
| Dylib (before) | `--emit=dep-info,link` | `[.so\|.dll]` (with
metadata) | - |
| Dylib (after) | `--emit=dep-info,metadata,link -Zembed-metadata=no` |
`[.so\|.dll]` (without metadata), `.rmeta` | Unchanged, but split
between two files |
Behavior for other target kinds/crate types should be unchanged.
From the table above, we can see two benefits of using
`-Zembed-metadata=no`:
- For rlibs/dylibs, we no longer store their metadata twice in the
target directory, thus reducing target directory size.
- For dylibs, we store essentially the same amount of data on disk, but
the benefit is that the metadata is now in a separate `.rmeta` file. This
means that you can ship the dylib (`.so`/`.dll`) to users without also
shipping the metadata. This would slightly reduce e.g. the
[size](https://github.com/rust-lang/rust/pull/120855#issuecomment-1937018169)
of the shipped rustc toolchains (note that the size reduction here is
after the toolchain has been already heavily compressed).
Note that if this behavior ever becomes the default, it should be
possible to simplify the code quite a bit, and essentially merge the
`requires_upstream_objects` and `benefits_from_split_metadata`
functions.
I did a very simple initial benchmark to evaluate the space savings on
cargo itself and
[hyperqueue](https://github.com/It4innovations/hyperqueue) (a mid-size
crate from my work) using `cargo build` and `cargo build --release` with
and without `-Zembed-metadata=no`:

For debug/incremental builds, the effect is smaller, as the artifact
disk usage is dwarfed by incremental artifacts and debuginfo. But for
(non-incremental) release builds, disk usage (and the number of I/O
operations performed) is significantly reduced.
### How should we test and review this PR?
I wrote two basic tests. The second one tests a situation where a crate
depends on a dylib dependency, which is quite rare, but the behavior of
this has actually changed in this PR (see comparison table above).
Testing this on various real-world projects (or even trying to enable it
by default across the whole Cargo suite?) might be beneficial.
## Unresolved questions
### Is this a breaking change?
With this new behavior, dylibs and rlibs will no longer contain
metadata. If they are compiled with Cargo, that shouldn't matter, but
other build systems might have to adapt.
### Should this become the default?
I think that in terms of disk size usage and performed I/O operations,
it is a pure win. It should either generate less disk data (for rlibs)
or the ~same amount of data for dylibs (the data will be a bit larger,
because the dylib will still contain a metadata stub header, but that's
like 50 bytes and doesn't scale with the size of the dylib, so it's
negligible).
So I think that eventually, we should just do this by default in Cargo,
unless some concerns are found. I suppose that before stabilizing we
should also benchmark the effect on compilation performance.
### What does this PR try to resolve?
- Added an error message when version in `CRATE[@<VER>]` or `--version
<VER>` starts with 'v' for `install`, `add`, `yank` and `update
--precise <VER>`
- Check if version is valid in `cargo yank`
Fixes #12331
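A minimal sketch of the kind of check described above. The `check_version` function and the wording of the message are hypothetical and only illustrate the idea; they are not the code or text added by this PR.

```rust
/// Rejects version strings such as `v1.2.3` and suggests the form without
/// the leading `v`. Hypothetical helper; the message text is illustrative only.
fn check_version(raw: &str) -> Result<&str, String> {
    if let Some(stripped) = raw.strip_prefix('v') {
        // Only complain when the remainder looks like the start of a version number.
        if stripped.starts_with(|c: char| c.is_ascii_digit()) {
            return Err(format!(
                "the version provided, `{raw}`, is not a valid SemVer version\n\n\
                 help: try changing the version to `{stripped}`"
            ));
        }
    }
    Ok(raw)
}
```

Here `check_version("v1.2.3")` returns an error that mentions `1.2.3`, while `check_version("1.2.3")` passes the input through unchanged.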
### How should we test and review this PR?
Added tests for each subcommand
### What does this PR try to resolve?
Close https://github.com/rust-lang/cargo/issues/13527
As we discussed, `cargo fix` should use the same default target selection
as `cargo check`.
In this PR, I modified `cargo fix` to no longer use all targets by
default. For `cargo fix --edition` and `cargo fix --edition-idioms`, it
will retain the old behavior and select all targets.
### How should we test and review this PR?
Unit tests
### Additional information
### What does this PR try to resolve?
Fixes #15436
### How should we test and review this PR?
There are 3 tests for each test case:
- there are no feature suggestions
- there's only one feature suggestion (most common)
- there are several feature suggestions
Suggest that the user use a crate name with an `@` inserted before the
first invalid package name character.
Fixes #15318
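The suggestion can be sketched roughly as follows. The `suggest_with_at` helper is hypothetical, and the set of valid package name characters used here (alphanumerics, `-` and `_`) is a simplifying assumption rather than Cargo's exact validation rule.

```rust
/// Returns the spec with `@` inserted before the first character that cannot
/// be part of a package name, e.g. `serde=1.0` becomes `serde@=1.0`.
/// Returns `None` when there is nothing to suggest.
fn suggest_with_at(spec: &str) -> Option<String> {
    // Assumption: package names consist of alphanumerics, `-` and `_`.
    let idx = spec.find(|c: char| !(c.is_alphanumeric() || c == '-' || c == '_'))?;
    // No name before the invalid character, or the separator is already `@`.
    if idx == 0 || spec[idx..].starts_with('@') {
        return None;
    }
    Some(format!("{}@{}", &spec[..idx], &spec[idx..]))
}
```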
This proposes to stabilize automatic garbage collection of Cargo's
global cache data in the cargo home directory.
### What is being stabilized?
This PR stabilizes automatic garbage collection, which is triggered at
most once per day by default. This automatic gc will delete old, unused
files in cargo's home directory.
It will delete files that need to be downloaded from the network after 3
months, and files that can be generated without network access after 1
month. These thresholds are intended to balance the intent of reducing
cargo's disk usage versus deleting too often forcing cargo to do extra
work when files are missing.
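The age thresholds can be illustrated with a small sketch. `CacheKind`, `max_age` and `should_delete` are hypothetical names, and treating a month as 30 days is an assumption made for this example.

```rust
use std::time::Duration;

/// Hypothetical classification of cache entries.
#[derive(Clone, Copy)]
enum CacheKind {
    /// Must be re-downloaded from the network if deleted.
    Downloaded,
    /// Can be regenerated locally without network access.
    Regenerable,
}

const DAY: Duration = Duration::from_secs(24 * 60 * 60);

/// Maximum time since last use before an entry becomes eligible for deletion.
/// Assumption: one month is approximated as 30 days.
fn max_age(kind: CacheKind) -> Duration {
    match kind {
        CacheKind::Downloaded => DAY * 90,
        CacheKind::Regenerable => DAY * 30,
    }
}

/// Decides whether an entry that was last used `since_last_use` ago should be deleted.
fn should_delete(kind: CacheKind, since_last_use: Duration) -> bool {
    since_last_use > max_age(kind)
}
```

An entry last used 31 days ago would be deleted if it can be regenerated locally, but kept if it would have to be downloaded again.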
Tracking of the last-use data is stored in a sqlite database in the
cargo home directory. Cargo updates timestamps in that database whenever
it accesses a file in the cache. This part is already stabilized.
This PR also stabilizes the `gc.auto.frequency` configuration option.
The primary use case is setting it to "never" to disable gc, should the
need arise.
When gc is initiated, and there are files to delete, there will be a
progress bar while it is deleting them. The progress bar will disappear
when it finishes. If the user runs with `-v` verbose option, then cargo
will also display which files it deletes.
If there is an error while cleaning, cargo will only display a warning,
and otherwise continue.
### What is not being stabilized?
The manual garbage collection option (via `cargo clean gc`) is not
proposed to be stabilized at this time. That still needs some design
work. This is tracked in
https://github.com/rust-lang/cargo/issues/13060.
Additionally, there are several low-level config options currently
implemented which define the thresholds for when it will delete files. I
think these options are probably too low-level and specific. This is
tracked in https://github.com/rust-lang/cargo/issues/13061.
Garbage collection of build artifacts is not yet implemented, and
tracked in https://github.com/rust-lang/cargo/issues/13136.
### Background
This feature is tracked in
https://github.com/rust-lang/cargo/issues/12633 and was implemented in a
variety of PRs, primarily https://github.com/rust-lang/cargo/pull/12634.
The tests for this feature are located in
https://github.com/rust-lang/cargo/blob/master/tests/testsuite/global_cache_tracker.rs.
Cargo started tracking the last-use data on stable via
https://github.com/rust-lang/cargo/pull/13492 in 1.78 which was released
2024-05-02. This PR is proposing to stabilize automatic deletion in 1.82
which will be released on 2024-10-17.
### Risks
Users who frequently use versions of Rust older than 1.78 will not have
the last-use data tracking updated. If they infrequently use 1.78 or
newer, and use the same cache files, then the last-use tracking will
only be updated by the newer versions. If that time frame is more than 1
month (or 3 months for downloaded data), then cargo will delete files
that the older versions are still using. This means the next time they
run the older version, it will have to re-download or re-extract the
files.
The effects of deleting cache data in environments where cargo's cache
is modified by external tools are not fully known. For example, CI
caching systems may save and restore cargo's cache. Similarly, things
like Docker images that try to save the cache in a layer, or mount the
cache in a read-only filesystem may have undesirable interactions.
The once-a-day performance hit might be noticeable to some people. I've
been using this for several months, and almost never notice it. However,
slower systems, or situations where there is a lot of data to delete
might take a while (on the order of seconds hopefully).
This updates the flags used for doctest xcompile to match the upstream
changes in https://github.com/rust-lang/rust/pull/137096 which renamed
and stabilized the flags.
This cannot be merged until after nightly is published tonight.
### What does this PR try to resolve?
I recently depended on a package with `preserve-order` rather than
`preserve_order` and the error message didn't help me with the problem
so I figured I'd fix that. I also found other improvements along the way:
- Suggest an alternative feature when a feature includes a missing
feature
- Suggest an alternative feature when a dependency includes a missing
feature
- Lower case error messages
- Re-frame prescriptive information as help
- Change plural "features" error messages to singular "feature" as they
can only ever have one (knowing that the `MissingFeature` string only has
one feature in it was important for doing a `closest` match on the
feature).
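To show what a `closest` match on a single feature name amounts to, here is a self-contained sketch based on edit distance. Both functions and the distance threshold of 3 are assumptions made for this example and do not reproduce Cargo's own helper.

```rust
/// Levenshtein edit distance between two strings, computed row by row.
fn edit_distance(a: &str, b: &str) -> usize {
    let b_chars: Vec<char> = b.chars().collect();
    // `prev[j]` is the distance between the processed prefix of `a` and `b[..j]`.
    let mut prev: Vec<usize> = (0..=b_chars.len()).collect();
    for (i, ca) in a.chars().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b_chars.iter().enumerate() {
            let substitute = prev[j] + usize::from(ca != *cb);
            let delete = prev[j + 1] + 1;
            let insert = cur[j] + 1;
            cur.push(substitute.min(delete).min(insert));
        }
        prev = cur;
    }
    prev[b_chars.len()]
}

/// Returns the known feature closest to `missing`, if any is close enough.
/// Assumption: a candidate more than 3 edits away is not worth suggesting.
fn closest<'a>(missing: &str, known: &[&'a str]) -> Option<&'a str> {
    known
        .iter()
        .copied()
        .map(|k| (edit_distance(missing, k), k))
        .filter(|(d, _)| *d <= 3)
        .min_by_key(|(d, _)| *d)
        .map(|(_, k)| k)
}
```

With this sketch, `closest("preserve-order", &["std", "preserve_order"])` picks `preserve_order`, which is the case that motivated the change.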
### How should we test and review this PR?
### Additional information
Changes summary:
- change the `pkg_id` field of `struct SerializedUnit<'a>` to be `PackageIdSpec` instead of `PackageId`
- change the unit-graph testcases to match the changes
(cleaning previous commits so every commit passes CI checks, as required)
If a library exists both in an added folder inside OUT_DIR and in the
OS, prefer to use the one within OUT_DIR. Folders within OUT_DIR and
folders outside OUT_DIR do not change their relative order between
themselves.
This is accomplished by sorting by whether we think the path is inside
the search path or outside.
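The ordering described above can be sketched with a stable sort on a boolean key. The `prefer_out_dir` function is a hypothetical illustration, not the code in this PR.

```rust
use std::path::{Path, PathBuf};

/// Moves search paths inside `out_dir` ahead of all others.
/// `sort_by_key` is a stable sort, so paths within each group keep their
/// original relative order (`false` sorts before `true`).
fn prefer_out_dir(paths: &mut [PathBuf], out_dir: &Path) {
    paths.sort_by_key(|p| !p.starts_with(out_dir));
}
```

Because the sort is stable, `["/usr/lib", "/t/out/a", "/opt/lib", "/t/out/b"]` with an `out_dir` of `/t/out` becomes `["/t/out/a", "/t/out/b", "/usr/lib", "/opt/lib"]`.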
### What does this PR try to resolve?
Fixes #15220. If a Rust crate builds a dynamic library & that same
dynamic library is installed in the host OS, the result of the build's
success & consistent behavior of executed tools depends on whether or
not the user has the conflicting dynamic library in the external search
path. If they do, then the host OS library will always be used which is
unexpected - updates to your Rust dependency will still have you linking
& running against an old host OS library (i.e. someone who doesn't have
that library has a different silent behavior).
### How should we test and review this PR?
This is what I did to verify my issue got resolved but I'm sure there's
a simpler example one could construct.
* Make sure Alsa and libllama.so are installed (on Arch I installed
alsa-lib and llama.cpp-cuda).
* Clone llama-cpp-2 & init llama.cpp submodule & update the submodule to
point to https://github.com/ggml-org/llama.cpp/pull/11997 instead.
* Add plumbing to expose the new method within llama-cpp-2 as a public
facing function on the LlamaModel struct (it's basically the same code
as for n_head, just calling n_head_kv from llama.cpp).
* Add cpal as a dependency in crate "foo"
* Add llama-cpp-2 via path as a dependency in crate "foo" and enable the
`dynamic-link` feature.
* Add code using the newly exposed n_head_kv method in crate "foo" in
main.rs. NOTE: Code just needs to compile & be exported, doesn't have to
be correct (fn main is probably easiest).
* Add some basic code that tries to initialize cpal in crate "foo" in fn
main.
* Try to build / run crate "foo"
Before my change, it fails with a linker error saying it can't find
`llama_model_n_head_kv` because /usr/lib appears in the search path
before the directory that contains the libllama.so that was built
internally by the crate. This is because cpal depends on alsa-sys which
uses pkg-config which adds /usr/lib to the search path before the
llama-cpp-sys-2 build.rs is run.
### Additional information
I'm not sure how to add tests so open to some help on that. I wanted to
make sure that this approach is even correct. I coded this to change
Cargo minimally and defensively since I don't know the internals of
Cargo very well (e.g. I don't know if I have to compare against both
`script_out_dir` / `script_out_dir_when_generated` since I don't know
the difference & there's not really any explanation on what they are).
It's possible this over-complicates the implementation so open to any
feedback. Additionally, the sort that happens prior to each build up of
the rustc environment is not where I'd ideally place it. I think it
would be more efficient to have the list of search paths be
free-floating and not tied to a BuildOutput so that they could be kept
updated live & resorted only on insertion (since it's changed less
frequently than rustc is invoked). Additionally, the generalized sort is
correct but pessimistic - maintaining the list sorted could be done
efficiently with some minor book keeping (i.e. you'd only need to sort
the new paths & then could quickly inject into the middle of a
VecDeque).
And of course in terms of correctness, I didn't do a thorough job
testing across all possible platforms. From first principles this seems
directionally correct but it's always possible this breaks someone
else's workflow. I'm also uneasy that the relative position of `-L` /
`-l` arguments changes in this PR & I'm not sure if that's observable
behavior or not (i.e. it used to be `-L` for a crate followed by `-l` for
a crate, but now it's `-L` for all crates, still grouped by crate
internally, followed by `-l` by crate).
### What does this PR try to resolve?
This reverts commit 71ea2e5c5fa285e8e0336d51fd03ba4a427154bf.
`Repository::discover` and `Repository::status_file` are too expensive
to run inside a loop. And `cargo package` is doing a lot of duplicate
work for checking submodule VCS status.
Alternative fixes might look like
* Let the `status_submodules` function return a path entry set, so
Cargo can check whether a source file is dirty based on that.
* When listing files in `PathSource`, attach the VCS status of the
associated path entry. Then subsequent operations can skip the
status check entirely.
However, the above solutions are not trivial, and the dirtiness check is
informational only based on T-cargo conclusion, so we should be
good just reverting the change now.
Again, the caveat of this is that we can't really detect
dirty symlinks that link into a Git submodule.
### How should we test and review this PR?
Should be good to merge. We still got #15384 fixed via
d760263afb02c747a246bb0471a4f51e09075246
### Additional information
See
<https://github.com/rust-lang/cargo/issues/15384#issuecomment-2797064033>.
This reverts commit 71ea2e5c5fa285e8e0336d51fd03ba4a427154bf.
`Repository::discover` and `Repository::status_file` are too expensive
to run inside a loop. And `cargo package` is doing a lot of duplicate
work for checking submodule VCS status.
The possible fix might look like
* Let the `status_submodules` function return a path entry set, so
Cargo can check whether a source file is dirty based on that.
* When listing files in `PathSource`, attach the VCS status of the
associated path entry. Then subsequent operations can skip the
status check entirely.
The above solutions are not trivial, and the dirtiness check is
informational only based on T-cargo conclusion, so we should be
good just reverting the change now.
Again, the caveat of this is that we can't really detect
dirty symlinks that link into a Git submodule.
Regardless of crate search paths emitted, always prefer searching search
paths pointing into the artifacts directory to those pointing outside.
This way libraries built by Cargo are preferred even if the same library
name exists in the system & a crate earlier in the build process emitted
a system library path for searching.