Currently, when reading a file from disk, we include several pieces of
data from the on-disk file, including the user and group names and IDs,
the device major and minor, the mode, and the timestamp. This means
that our archives differ between systems, sometimes in unhelpful ways.
In addition, most users probably did not intend to share information
about their user and group settings, operating system and disk type, and
umask. While these aren't huge privacy leaks, cargo doesn't use them
when extracting archives, so there's no value to including them.
Since using consistent data means that our archives are reproducible and
don't leak user data, both of which are desirable features, let's
canonicalize the header to strip out identifying information.
We set the user and group information to 0 and root, since that's the
only user that's typically consistent among Unix systems. Setting
these values doesn't create a security risk since tar can't change the
ownership of files when it's running as a normal unprivileged user.
Similarly, we set the device major and minor to 0. There is no useful
value here that's portable across systems, and it does not affect
extraction in any way.
We also set the timestamp to the same one that we use for generated
files. This is probably the biggest loss of relevant data, but
considering that cargo doesn't otherwise use it and honoring it makes
the archives unreproducible, we canonicalize it as well.
Finally, we canonicalize the mode of an item we're storing by looking at
the executable bit and using mode 755 if it's set and mode 644 if it's
not. We already use 644 as the default for generated files, and this is
the same algorithm that Git uses to determine whether a file should be
considered executable. The tests don't test this case because there's
no portable way to create executable files on Windows.
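As a rough illustration of the mode rule described above, here is a minimal sketch. The function name `canonical_mode` is hypothetical, and the choice to look only at the owner-executable bit (0o100) is an assumption made for this example rather than a statement about the actual implementation.

```rust
/// Hypothetical helper: collapse an on-disk Unix mode to one of two
/// canonical values, so that umask differences do not reach the archive.
fn canonical_mode(mode: u32) -> u32 {
    // Assumption: only the owner-executable bit (0o100) decides the result.
    if mode & 0o100 != 0 {
        0o755
    } else {
        0o644
    }
}
```

With this sketch, a file stored as 0o775 or 0o700 would be recorded as 755, and one stored as 0o664 or 0o600 would be recorded as 644.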
For projects supporting reproducible builds, it's possible to set the
timestamp used in artifacts by setting SOURCE_DATE_EPOCH to a decimal
Unix timestamp. This is helpful because it allows users to produce the
exact same artifact, regardless of when the project was built, and it
also means that services which generate crates from source can generate
a consistent crate without having to store previously built artifacts.
For all these reasons, let's honor the SOURCE_DATE_EPOCH environment
variable if it's set and use the current timestamp if it's not.
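A minimal sketch of that lookup might look like the following. The function names are hypothetical, and falling back to the current time when the variable is set but not a valid decimal integer is an assumption of this example; a real implementation could just as well report an error in that case.

```rust
use std::env;
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical helper: choose the timestamp from an optional
/// SOURCE_DATE_EPOCH value, falling back to `now` when the value is
/// absent or (by assumption) not a decimal integer.
fn resolve_timestamp(source_date_epoch: Option<&str>, now: u64) -> u64 {
    source_date_epoch
        .and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(now)
}

/// Hypothetical wrapper that reads the real environment and clock.
fn archive_timestamp() -> u64 {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0);
    let var = env::var("SOURCE_DATE_EPOCH").ok();
    resolve_timestamp(var.as_deref(), now)
}
```

Keeping the parsing separate from the environment and clock access makes the fallback behavior easy to test without touching process state.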
For each entry in the tar archive, we generate a new timestamp.
Normally cargo will be fast enough that we get a consistent timestamp,
but that need not be the case. There's very little reason to produce
different timestamps for different files and it's slightly more
efficient not to need to make multiple queries, so let's instead
generate a single timestamp for all entries that we generate.
Fix publishing with optional dependencies.
In #8799, I neglected to update the `publish` code to use the correct features when generating the JSON to upload to the registry. The `Cargo.toml` file was correctly updated, but the JSON was not. This caused Cargo to send the implicit `dep:` feature syntax in the JSON blob, which crates.io rejects. The solution here is to use the original feature map before the implicit features have been added.
Minor typo in features.md
Just a very minor typo
(that said: these docs don't actually explain what a Feature group is before using the term, but that's for another issue)
Check if rust-src contains a vendor dir, and patch it in
This is the cargo side of https://github.com/rust-lang/wg-cargo-std-aware/issues/23
Note that this design naively assumes there is only one version of each package. It does not robustly verify this, and will presumably just cryptically fail to resolve dependencies.
See https://github.com/rust-lang/rust/pull/78790 for the other half of this change.
This prevents a deadlock where the message queue is filled with output
messages but not emptied as the job producing the messages runs on the
same thread as the message processing.
Use u32/64::to/from_le_bytes instead of bit fiddling
What it says on the tin; it's just something I've spotted when browsing through the code and decided to search for other occurrences of casting to/from bytes.
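For illustration, the kind of replacement involved looks roughly like this. The function names are made up for the example and are not taken from the code that was changed.

```rust
/// Manual bit fiddling: assemble a little-endian u32 by hand.
fn read_u32_le_manual(b: [u8; 4]) -> u32 {
    (b[0] as u32) | ((b[1] as u32) << 8) | ((b[2] as u32) << 16) | ((b[3] as u32) << 24)
}

/// The same conversion using the standard library.
fn read_u32_le(b: [u8; 4]) -> u32 {
    u32::from_le_bytes(b)
}

/// The reverse direction for a u64, again via the standard library.
fn write_u64_le(v: u64) -> [u8; 8] {
    v.to_le_bytes()
}
```

Both readers return the same value for the same input; the standard-library version states the byte order in its name and leaves no shifts to get wrong.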
Avoid constructing an anyhow::Error when not necessary
`anyhow::Error` always captures a backtrace when created, which is expensive.
Split out of #8837
Skip extracting .cargo-ok files from packages
This is for #8816
I'll look into adding a unit test tomorrow, I'm still familiarising myself with the project.
This fixes the case where a package contained an empty .cargo-ok file
and was mounted in a read-only file system. This led to attempting to
download the package again, which failed due to write permissions.
Implement weak dependency features.
This adds the feature syntax `dep_name?/feat_name` with a `?` to only enable `feat_name` if the optional dependency `dep_name` is enabled through some other means. See `unstable.md` for documentation.
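To make the syntax concrete, here is a small sketch of how a feature value string could be split into its parts. The `ParsedFeature` type and `parse_feature_value` function are hypothetical names for this example and are not meant to mirror Cargo's internal types.

```rust
/// Hypothetical representation of one entry in a feature's value list.
#[derive(Debug, PartialEq)]
enum ParsedFeature<'a> {
    /// A plain feature name such as `std`.
    Feature(&'a str),
    /// `dep_name/feat_name`, or `dep_name?/feat_name` when `weak` is true.
    DepFeature { dep: &'a str, feature: &'a str, weak: bool },
}

fn parse_feature_value(s: &str) -> ParsedFeature<'_> {
    match s.split_once('/') {
        Some((dep, feature)) => match dep.strip_suffix('?') {
            // A trailing `?` on the dependency name marks the weak form.
            Some(dep) => ParsedFeature::DepFeature { dep, feature, weak: true },
            None => ParsedFeature::DepFeature { dep, feature, weak: false },
        },
        None => ParsedFeature::Feature(s),
    }
}
```

Under this sketch, `serde?/derive` parses to a weak dependency feature on `serde`, meaning `derive` would only be enabled if `serde` is enabled some other way.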
This only works with the new feature resolver. I don't think I understand the dependency resolver well enough to implement it there. It would require teaching it to defer activating a feature, but due to the backtracking nature, I don't really know how to accomplish that. I don't think it matters, the main drawback is that the dependency resolver will be slightly more constrained, but in practice I doubt it will ever matter.
Closes #3494
**Question**
* An alternate syntax I considered was `dep_name?feat_name` (without the slash), what do people think? For some reason the `?/` seems kinda awkward to me.
This consistently puts for_host next to PackageId, since the pair
PackageId/for_host is used everywhere together. Somehow it seems better
to me to consistently keep them close together.
Avoid some extra downloads with new feature resolver.
There are some edge cases with the new feature resolver where it can erroneously trigger a download of a package that is not needed. This is due to the call to `is_proc_macro`, which has to download the manifest to check whether the package is a proc-macro. The main change here is to defer calling `is_proc_macro` until after dependencies have been filtered. It also avoids calling `is_proc_macro` if the new feature resolver is enabled but `decouple_host_deps` and `ignore_inactive_targets` are disabled (such as with `-Z weak-dep-features`), in which case it doesn't matter whether it is a proc-macro or not.
Fixes #8776
Normalize SourceID in `cargo metadata`.
The SourceID in `cargo metadata` can have different values, but they can be equivalent in Cargo. This results in different serialized forms, which prevents comparing the ID strings. In this particular case, `SourceKind::Git(GitReference::Branch("master"))` is equivalent to `SourceKind::Git(GitReference::DefaultBranch)`, but they serialize differently.
The reason these end up differently is because the `SourceId` for a `Package` is created from the `Dependency` declaration. But the `SourceId` in `Cargo.lock` comes from the deserialized file. If you have an explicit `branch = "master"` in the dependency, then versions prior to 1.47 would *not* include `?branch=master` in `Cargo.lock`. However, since 1.47, internally Cargo will use `GitReference::Branch("master")`.
Conversely, if you have a new `Cargo.lock` (with `?branch=master`), and then *remove* the explicit `branch="master"` from `Cargo.toml`, you'll end up with another mismatch in `cargo metadata`.
The solution here is to use the variant from the `Package` when serializing the resolver in `cargo metadata`. I chose this since the `Package` variant is displayed on other JSON messages (like artifact messages), and I think this is the only place that the resolver variants are exposed (other than `Cargo.lock` itself).
I'm not convinced that this was entirely intended, since there is [code to avoid this](6a38927551/src/cargo/core/resolver/encode.rs (L688-L695)), and at the time #8522 landed, I did not realize this would change the V2 lock format. However, it's probably too late to try to reverse that, and I don't think there are any other problems other than this `cargo metadata` inconsistency.
Fixes #8756
vendor: correct the path to cargo config
When running `cargo vendor`, users are prompted to add the configuration to their cargo config. Unfortunately, the path named is not correct, as it's lacking the correct suffix.
Make host_root return host.root(), not host.dest()
Also create host_dest function to let other callsites retain their old functionality. Fixes #8817, verified it works on the original problem reported in the `rust-gpu` repo.
I did two things here:
1) Rename `host_root` (which returns `self.host.dest()`) to be `host_dest`. This has three callsites. I did this to make it more clear that it returns dest, not root.
2) For the callsite that's relevant in #8817, I created a "new" `host_root` function (that returns `self.host.root()`). This means that the callsite that this PR is actually fixing doesn't show up in this diff :/ - but I thought it was more clear this way.
(Also copied the example path docs over from `layout.rs` to hopefully avoid this mistake again in the future)
I tried to look into if the other two callsites should actually be calling `host.root()` instead of `dest`, because I imagine the same mistake could have been made again, but it quickly grew out of my understanding (this is my first time in the cargo codebase). Feel free to let me know if they should also call `host.root()` too, and I can update them.
Thanks! (oh gosh I have no idea what I'm doing, I hope this is right)
r? `@alexcrichton`
List available packages if providing `--package` with an empty value
May resolve #8591
## How
First we need to take over the responsibility of checking command-line arguments from clap. I've examined all 10 build commands, and all of them call [`ArgMatchesExt::compile_options`](2f115a76e5/src/cargo/util/command_prelude.rs (L389-L395)) directly or indirectly. `compile_options` in turn [calls `check_optional_opts`](2f115a76e5/src/cargo/util/command_prelude.rs (L499-L501)) to check whether target selection options were given an empty value, so we can apply the same logic there.
I've also added an error message for an edge case, though it would never trigger at the moment.
Add a future-compatibility warning on allowed feature name characters.
This adds a restriction on the valid syntax of a feature name. A warning is issued if a feature does not match the new validation, with the intent that it will be an error in the future.
The new restriction is:
* The first character must be a [Unicode XID start character](https://unicode.org/reports/tr31/) (most letters), a digit, or `_`.
* Subsequent characters must be a [Unicode XID continue character](https://unicode.org/reports/tr31/) (a digit, `_`, or most letters), `-`, or `+`.
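The two rules above can be sketched with only the standard library as follows. This is an approximation: `char::is_alphanumeric` stands in for the Unicode XID start and XID continue classes, which it does not match exactly, and the function name and the rejection of empty names are assumptions of this example.

```rust
/// Hypothetical, approximate check of the feature-name rules.
/// Assumption: std's `is_alphanumeric` is used in place of the real
/// XID_Start / XID_Continue properties, so a few characters will differ.
fn is_valid_feature_name(name: &str) -> bool {
    let mut chars = name.chars();
    // First character: a letter, a digit, or `_`.
    match chars.next() {
        Some(c) if c.is_alphanumeric() || c == '_' => {}
        // Assumption: an empty name is treated as invalid.
        _ => return false,
    }
    // Remaining characters: letters, digits, `_`, `-`, or `+`.
    chars.all(|c| c.is_alphanumeric() || c == '_' || c == '-' || c == '+')
}
```

Names such as `c++` or `foo-bar` pass this check, while names starting with `-` or containing `/` do not.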
The changes around passing in `config` to `Summary` can mostly be reverted when this is changed to an error.
I'm a little concerned that we don't have a mechanism to silence the warning. Should we add one?
New namespaced features implementation.
This is a new implementation for namespaced features (#5565). See the `unstable.md` docs for a description of the new behavior. This is intended to fix several issues with the existing design, and to make it backwards compatible so that it does not require an opt-in flag.
This also includes tangentially-related changes to the new feature resolver. The changes are:
* `crate_name/feat_name` syntax will now always enable the feature `crate_name`, even if it is an inactive optional dependency (such as a dependency for another platform). The intent here is to have a certain amount of consistency whereby "features" are always activated, but individual crates will still not be activated.
* `--all-features` will now enable features for inactive optional dependencies. This is more consistent with `--features foo` enabling the `foo` feature, even when the `foo` dep is not activated.
I'm still very conflicted on how that should work, but I think it is better from a simplicity/consistency perspective. I still think it may be confusing: if someone has a `cfg(some_dep)` in their code and `some_dep` isn't built, it will error. The intent is that `cfg(accessible(some_dep))` will be the preferred method in the future, or using a platform `cfg` expression like `cfg(windows)` to match whatever is in Cargo.toml.
Closes #8044
Closes #8046
Closes #8047
Closes #8316
## Questions
- For various reasons, I changed the way dependency conflict errors are collected. One slightly negative consequence is that it will raise an error for the first problem it detects (like a "missing feature"). Previously it would collect a list of all missing features and display all of them in the error message. Let me know if I should retain the old behavior. I think it will make the code more complicated and brittle, but it shouldn't be too hard (essentially `Requirements` will need to collect a list of errors, and then `resolve_features` would need to check if the list is non-empty, and then aggregate the errors).
- Should `cargo metadata` show the implicit features in the "features" table? Currently it does not, and I think that is probably best (it mirrors what is in `Cargo.toml`), but I could potentially see an argument to show how cargo sees the implicit features.