Fix potential deadlock in CacheState::lock (#15698)

This PR fixes a potential source of deadlock in the `CacheState::lock`
function
([here](84709f0850/src/cargo/util/cache_lock.rs (L411))),
as explained below.

I ran into this deadlock while testing Dylint. For example, in [this
GitHub
run](https://github.com/trailofbits/dylint/actions/runs/15570922048),
two jobs were killed after running for six hours. This fix seems to
resolve the deadlock (e.g., see [this
run](https://github.com/trailofbits/dylint/actions/runs/15822570315),
which [uses the
fix](https://github.com/trailofbits/dylint/actions/runs/15822570315/workflow#L119-L132)).

Until this fix (or a similar one) appears in `rustup`-installable Cargo,
is there an easy workaround?

---

A `CacheState` struct holds [two recursive
locks](84709f0850/src/cargo/util/cache_lock.rs (L347-L357)):
`mutate_lock` and `cache_lock`.

When `MutateExclusive` is passed to `CacheState::lock`, it tries to
acquire both locks. First, it tries to acquire `mutate_lock`, then it
tries to acquire `cache_lock`.

The problematic case is when it acquires the first, but not the second.

Note that if the second cannot be acquired because of an error, the
`mutate_lock` recursive lock is decremented:
84709f0850/src/cargo/util/cache_lock.rs (L412-L415)

However, if the second would simply block, `LockingResult::WouldBlock`
is returned.

`CacheState::lock` is called from two places. One of those locations is
in `CacheLocker::try_lock`:[^1]
84709f0850/src/cargo/util/cache_lock.rs (L502-L506)

Note that `CacheLocker::try_lock` creates a
[`CacheLock`](84709f0850/src/cargo/util/cache_lock.rs (L427-L430))
if and only if `LockingResult::LockAcquired` is returned.

Furthermore, when a `CacheLock` is dropped, it decrements both
`mutate_lock` and `cache_lock`:
84709f0850/src/cargo/util/cache_lock.rs (L443-L446)

A scan of `cache_lock.rs` shows that there are only three places[^2]
where `mutate_lock.decrement()` is called: the error location in
`CacheState::lock` (referenced above), and two places in
`CacheLock::drop`.

Thus, if `LockingResult::WouldBlock` is returned from
`CacheState::lock`, `mutate_lock` is never decremented.

[^1]: The other location is in `CacheLocker::lock`, which calls
`CacheState::lock` with `BlockingMode::Blocking`. For that reason,
`CacheLocker::lock` should not return `WouldBlock` when called from this
location.
[^2]:
84709f0850/src/cargo/util/cache_lock.rs (L413),
84709f0850/src/cargo/util/cache_lock.rs (L438),
and
84709f0850/src/cargo/util/cache_lock.rs (L445)
This commit is contained in:
Ed Page 2025-06-23 15:55:04 +00:00 committed by GitHub
commit 409fed7dc1
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194

View File

@ -408,7 +408,10 @@ impl CacheState {
.lock_exclusive(gctx, DOWNLOAD_EXCLUSIVE_DESCR, blocking)
{
Ok(LockAcquired) => {}
Ok(WouldBlock) => return Ok(WouldBlock),
Ok(WouldBlock) => {
self.mutate_lock.decrement();
return Ok(WouldBlock);
}
Err(e) => {
self.mutate_lock.decrement();
return Err(e);