This patch fixes a bug where nesting `block_in_place` with a `block_on`
between could lead to a panic. This happened because the nested
`block_in_place` would try to acquire a core on return when it should
not attempt to do so. The `block_on` between the two nested
`block_in_place` calls altered the thread-local state, leading to the
incorrect behavior.
The fix is for each call to `block_in_place` to track whether it needs
to try to steal a core back.
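For illustration, a minimal sketch of the nesting that could previously
panic (not the exact reproduction from the issue):

```rust
use tokio::runtime::Handle;
use tokio::task::block_in_place;

#[tokio::main(flavor = "multi_thread")]
async fn main() {
    block_in_place(|| {
        // The intermediate `block_on` alters the thread-local state.
        Handle::current().block_on(async {
            block_in_place(|| {
                // On return, this inner call must not try to steal a core
                // back; that responsibility belongs to the outer call.
            });
        });
    });
}
```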
Fixes #5239
This patch removes the custom slab in favor of regular allocations and
`Arc`. Originally, the slab made it possible to pass indexes as tokens
to the I/O driver when registering I/O resources. However, this
has the downside of having a more expensive token lookup path. It also
pins a `ScheduledIo` to a specific I/O driver. Additionally, the slab is
approaching custom allocator territory.
We plan to explore migrating I/O resources between I/O drivers. As a
step towards that, we need to decouple `ScheduledIo` from the I/O
driver. To do this, the patch uses plain-old allocation to allocate the
`ScheduledIo` and we use the pointer as the token. To use the token, we
need to be very careful about releasing the `ScheduledIo`. We need to
make sure that the associated I/O handle is deregistered from the I/O
driver **and** that no polls are in flight. The strategy in this PR is to
let the
I/O driver do the final release between polls, but I expect this
strategy to evolve over time.
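As a rough illustration of the pointer-as-token idea (a simplified
stand-in, not the driver's actual code):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

// Stand-in for the real `ScheduledIo`.
struct ScheduledIo {
    readiness: AtomicUsize,
}

// Registering: derive the token from the allocation's address, keeping a
// strong reference alive so the address stays valid until the driver
// performs the final release.
fn token_for(io: &Arc<ScheduledIo>) -> usize {
    Arc::as_ptr(io) as usize
}

// Dispatching an event: turn the token back into a reference. This is only
// sound if the `ScheduledIo` is still alive, which is why the release must
// happen after deregistration and with no polls in flight.
unsafe fn io_from_token<'a>(token: usize) -> &'a ScheduledIo {
    &*(token as *const ScheduledIo)
}

fn main() {
    let io = Arc::new(ScheduledIo { readiness: AtomicUsize::new(0) });
    let token = token_for(&io);
    unsafe { io_from_token(token) }.readiness.store(1, Ordering::SeqCst);
    assert_eq!(io.readiness.load(Ordering::SeqCst), 1);
}
```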
The `drain_filter` method on the internal `LinkedList` type passes a
`&mut` reference to the node type. However, the `LinkedList` is intended
to be used with nodes that are shared in other ways. For example,
`task::Header` is accessible concurrently from multiple threads.
Currently, the only usage of `drain_filter` is in a case where `&mut`
access is safe, so this change is to help prevent future bugs and
tighten up the safety of internal utilities.
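A hypothetical, non-intrusive sketch of the signature change (the real
internal `LinkedList` is intrusive and differs in detail):

```rust
// Stand-in list; the real type is an intrusive linked list.
struct LinkedList<T> {
    items: Vec<T>,
}

impl<T> LinkedList<T> {
    // The filter now sees `&T` rather than `&mut T`, so it cannot form an
    // exclusive reference to a node (e.g. `task::Header`) that may also be
    // observed concurrently from other threads.
    fn drain_filter(&mut self, mut filter: impl FnMut(&T) -> bool) -> Vec<T> {
        let mut drained = Vec::new();
        let mut i = 0;
        while i < self.items.len() {
            if filter(&self.items[i]) {
                drained.push(self.items.remove(i));
            } else {
                i += 1;
            }
        }
        drained
    }
}

fn main() {
    let mut list = LinkedList { items: vec![1, 2, 3, 4] };
    let even = list.drain_filter(|n| n % 2 == 0);
    assert_eq!(even, vec![2, 4]);
    assert_eq!(list.items, vec![1, 3]);
}
```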
If there is an ongoing operation on a file, wait for that to complete
before performing the clone in `File::try_clone`. This avoids a race
between the ongoing operation and any subsequent operations performed on
the clone.
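For reference, a minimal usage sketch (the file name is illustrative):

```rust
use tokio::fs::File;
use tokio::io::AsyncWriteExt;

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let mut file = File::create("example.txt").await?;
    file.write_all(b"hello").await?;

    // `try_clone` now waits for any in-flight operation on `file` to
    // complete before duplicating the handle.
    let mut clone = file.try_clone().await?;
    clone.write_all(b", world").await?;
    Ok(())
}
```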
Fixes: #5759
Calling `Handle::enter()` returns an `EnterGuard` value, which resets the
thread-local context on drop. The drop implementation assumes that
guards from nested `enter()` calls are dropped in reverse order.
However, there is no static enforcement of this requirement.
This patch checks that the guards are dropped in reverse order and
panics otherwise. A future PR will deprecate `Handle::enter()` in favor
of a method that takes a closure, ensuring the guard is dropped
appropriately.
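A minimal sketch of the requirement being checked:

```rust
use tokio::runtime::Runtime;

fn main() {
    let rt1 = Runtime::new().unwrap();
    let rt2 = Runtime::new().unwrap();

    let outer = rt1.handle().enter();
    let inner = rt2.handle().enter();

    // Correct: nested guards are dropped in reverse order of creation.
    drop(inner);
    drop(outer);

    // Dropping `outer` before `inner` would now panic instead of silently
    // leaving the thread-local context pointing at the wrong runtime.
}
```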
The tuning test relies on a predictable execution environment. It
assumes that spawning a new task can complete reasonably fast. When
running tests with ASAN, the tuning test will spuriously fail. After
investigating, I believe that running tests with ASAN enabled and
without `release` in a low-resource environment (CI) results in an
execution environment that is too slow for the tuning test to succeed.
This PR restructures `runtime::context` into multiple files by component and feature flag. The goal is to reduce code defined in macros and make each context component more manageable.
There should be no behavior changes except tweaking how the RNG seed is set. Instead of putting it in `set_current`, we set it when entering the runtime. This aligns better with the feature's original intent, enabling users to make a runtime's RNG deterministic. The seed should not be changed by `Handle::enter()`, so there is no need to have the code in `context::set_current`.
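For context, a sketch of the feature in question, using the unstable
`rng_seed` builder option (requires `--cfg tokio_unstable`):

```rust
use tokio::runtime::{Builder, RngSeed};

fn main() {
    // The seed makes the runtime's RNG deterministic. With this change it
    // is applied when the runtime is entered, not in `context::set_current`,
    // so `Handle::enter()` no longer affects it.
    let seed = RngSeed::from_bytes(b"deterministic seed for tests");
    let rt = Builder::new_current_thread()
        .rng_seed(seed)
        .build()
        .unwrap();

    rt.block_on(async {});
}
```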
Removes `Send` from `EnterGuard` (returned by `Handle::enter()`). The
guard type changes a thread-local variable on drop. If the guard is
moved to a different thread, it would modify the wrong thread-local.
This is a **breaking change** but it fixes a bug and prevents incorrect
user behavior. If user code breaks because of this, it is because they
(most likely) have a bug in their code.
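A minimal sketch of the pattern this rejects:

```rust
use tokio::runtime::Runtime;

fn main() {
    let rt = Runtime::new().unwrap();
    let guard = rt.handle().enter();

    // With `Send` removed from `EnterGuard`, moving the guard to another
    // thread no longer compiles, e.g.:
    //
    //     std::thread::spawn(move || drop(guard));
    //
    // Dropping it on a different thread would have reset that thread's
    // context rather than the one `enter()` modified.
    drop(guard);
}
```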
If the `Scoped` type is `Sync`, then you can call `set` from two threads in parallel. Since it accesses `inner` without synchronization, this is a data race.
This is a soundness issue for the `Scoped` type, but since this is an internal API and we don't use it incorrectly anywhere, no harm is done.
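A simplified stand-in (not the actual internal type) showing why such a
type must not be `Sync`:

```rust
use std::cell::Cell;

// `set` writes `inner` with no synchronization, so the type must not be
// `Sync`: if it were, two threads could call `set` on the same value in
// parallel, racing on `inner`.
struct Scoped<T: Copy> {
    inner: Cell<T>,
}

impl<T: Copy> Scoped<T> {
    fn set(&self, value: T) {
        self.inner.set(value);
    }
}

fn main() {
    // `Cell<T>` is `!Sync`, so this stand-in cannot be shared across
    // threads, which is exactly the property the real type needs.
    let scoped = Scoped { inner: Cell::new(0u32) };
    scoped.set(1);
    assert_eq!(scoped.inner.get(), 1);
}
```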
This commit is a step towards the ongoing effort to unify the mutex in
the multi-threaded scheduler. The Inject queue is split into two
structs. `Shared` holds fields that are concurrently accessed, and
`Synced` holds fields that must be locked to access. The multi-threaded
scheduler is responsible for locking `Synced` and passing it in when
needed.
The commit also splits `inject` into multiple files to help reduce the
amount of code defined in macros.
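Schematically, the split looks something like this (names and fields are
illustrative, not the real definitions):

```rust
use std::sync::atomic::AtomicUsize;
use std::sync::Mutex;

// Fields that are accessed concurrently without holding the lock.
struct Shared {
    len: AtomicUsize,
}

// Fields that must only be touched while holding the scheduler's lock.
struct Synced {
    head: Option<usize>,
    tail: Option<usize>,
}

// The multi-threaded scheduler owns the lock and passes `&mut Synced`
// into the inject queue's methods when it is needed.
struct SchedulerShared {
    inject: Shared,
    synced: Mutex<Synced>,
}

fn main() {
    let _shared = SchedulerShared {
        inject: Shared { len: AtomicUsize::new(0) },
        synced: Mutex::new(Synced { head: None, tail: None }),
    };
}
```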
PR #5720 introduced runtime self-tuning. It included a test that
attempts to verify self-tuning logic. The test is heavily reliant on
timing details. This patch attempts to make the test a bit more reliable
by not assuming tuning will converge within a set amount of time.
Previously, `Inject` was defined in `runtime::task`. This was because it
used some internal fns as part of the intrusive linked-list
implementation.
In the future, we want to remove the mutex from Inject and move it to
the scheduler proper (to reduce mutex ops). To set this up, this commit
moves `Inject` to `runtime::scheduler`. To make this work, we have to
`pub(crate)` `task::RawTask` and use that as the interface to access the
next / previous pointers.
Previously, the deferred task list (list of tasks that yielded and are
waiting to be woken) was stored on the global runtime context. Because
the scheduler is responsible for waking these tasks, performing the wake
operation required additional TLS reads.
Instead, this commit moves the list of deferred tasks into the scheduler
context. This makes it easily accessible from the scheduler itself.
In order to reduce the number of mutex operations in the multi-threaded
scheduler hot path, we need to unify the various mutexes into a single
one. To start this work, this commit splits up `Idle` into `Idle` and
`Synced`. The `Synced` component is stored separately in the scheduler's
`Shared` structure.
Each multi-threaded runtime worker prioritizes pulling tasks off of its
local queue. Every so often, it checks the injection (global) queue for
work submitted there. Previously, "every so often" was a constant
"number of tasks polled" value. Tokio sets a default of 61, but allows
users to configure this value.
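For reference, the interval can be set on the runtime builder:

```rust
use tokio::runtime::Builder;

fn main() {
    // Check the injection (global) queue every 31 locally polled tasks
    // instead of the default of 61.
    let rt = Builder::new_multi_thread()
        .global_queue_interval(31)
        .build()
        .unwrap();

    rt.block_on(async {});
}
```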
If workers are under load with tasks that are slow to poll, the
injection queue can be starved. To prevent starvation in this case, this
commit implements some basic self-tuning. The multi-threaded scheduler
tracks the mean task poll time using an exponentially-weighted moving
average. It then uses this value to pick an interval at which to check
the injection queue.
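A schematic sketch of the idea (the constants and names here are
illustrative, not the scheduler's actual values):

```rust
// Target amount of wall-clock time between injection-queue checks
// (illustrative value).
const TARGET_INTERVAL_MICROS: f64 = 200.0;
// Smoothing factor for the exponentially-weighted moving average
// (illustrative value).
const EWMA_ALPHA: f64 = 0.1;

struct Tuner {
    mean_poll_micros: f64,
}

impl Tuner {
    // Fold one observed task poll duration into the moving average.
    fn record_poll(&mut self, poll_micros: f64) {
        self.mean_poll_micros =
            EWMA_ALPHA * poll_micros + (1.0 - EWMA_ALPHA) * self.mean_poll_micros;
    }

    // Pick how many local polls to perform between injection-queue checks:
    // the slower tasks are to poll, the fewer polls per check, so the
    // global queue is not starved.
    fn global_queue_interval(&self) -> u32 {
        let polls = TARGET_INTERVAL_MICROS / self.mean_poll_micros.max(1.0);
        polls.clamp(2.0, 127.0) as u32
    }
}

fn main() {
    let mut tuner = Tuner { mean_poll_micros: 10.0 };
    tuner.record_poll(50.0);
    println!("check the injection queue every {} polls", tuner.global_queue_interval());
}
```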
This commit is a first pass at adding self-tuning to the scheduler.
There are other values in the scheduler that could benefit from
self-tuning (e.g. the maintenance interval). Additionally, the
current-thread scheduler could also benefit from self-tuning. However, we
have reached the point where we should start investigating ways to unify
logic in both schedulers. Adding self-tuning to the current-thread
scheduler will be punted until after this unification.