Since MSRV is bumped to 1.63, `Mutex::new` is now usable in const context.
Also use `assert!` in const functions to ensure correctness instead of
silently truncating the value, and remove the `tokio_no_const_mutex_new`
cfg.
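For illustration, here is a minimal sketch of both points; the `QUEUE`
static and the `to_u8` helper are hypothetical, not tokio code:

```rust
use std::sync::Mutex;

// `Mutex::new` is const since Rust 1.63, so a static no longer needs
// lazy initialization or a once-cell workaround.
static QUEUE: Mutex<Vec<u32>> = Mutex::new(Vec::new());

// `assert!` in a const fn turns an out-of-range input into a
// compile-time error instead of silently truncating the value.
const fn to_u8(v: usize) -> u8 {
    assert!(v <= u8::MAX as usize, "value would be truncated");
    v as u8
}

const SMALL: u8 = to_u8(200);

fn main() {
    QUEUE.lock().unwrap().push(SMALL as u32);
}
```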
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
This patch includes an initial implementation of a new multi-threaded
runtime. The new runtime aims to increase the scheduler throughput by
speeding up how it dispatches work to peer worker threads. This
implementation improves most benchmarks by roughly 10% when the number
of threads is below 16. As the thread count increases, mutex contention
degrades performance.
Because the new scheduler is not yet ready to replace the old one, the
patch introduces it as an unstable runtime flavor with a warning that it
isn't production ready. Work to improve the scalability of the runtime
will most likely require more intrusive changes across Tokio, so I am
opting to merge with master to avoid larger conflicts.
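As a rough sketch of opting in; the `new_multi_thread_alt` constructor
name and the `--cfg tokio_unstable` requirement are my assumptions about
how the unstable flavor is exposed:

```rust
// Sketch: selecting the unstable scheduler flavor. Assumes the crate
// is compiled with RUSTFLAGS="--cfg tokio_unstable" and that the
// flavor is exposed as `Builder::new_multi_thread_alt`.
fn main() {
    let rt = tokio::runtime::Builder::new_multi_thread_alt()
        .worker_threads(4)
        .enable_all()
        .build()
        .unwrap();

    rt.block_on(async {
        println!("running on the alternative multi-threaded runtime");
    });
}
```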
There are a number of cases in which being able to identify a runtime is
useful.
When instrumenting an application, this is particularly true. For
example, we would like to be able to add traces for runtimes so that
tasks can be differentiated (#5792). It would also provide a way to
differentiate runtimes whose tasks are dumped.
Outside of instrumentation, it may be useful to check whether two runtime
handles are pointing to the same runtime.
This change adds an opaque `runtime::Id` struct which serves this
purpose, initially behind the `tokio_unstable` cfg flag.
The inner value of the ID is taken from the `OwnedTasks` or
`LocalOwnedTasks` struct which every runtime and local set already
has. This will mean that any use of the ID will align with the task
dump feature.
The ID is added within the scope of working towards closing #5545.
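A sketch of the intended usage, assuming the ID is exposed via
`Handle::id()` under `tokio_unstable` and that `runtime::Id` is
comparable:

```rust
use tokio::runtime::Runtime;

fn main() {
    let rt = Runtime::new().unwrap();
    let other = Runtime::new().unwrap();

    // Two handles to the same runtime report the same opaque ID...
    assert_eq!(rt.handle().id(), rt.handle().clone().id());
    // ...while a different runtime reports a different one.
    assert_ne!(rt.handle().id(), other.handle().id());
}
```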
The new `mio_unsupported_force_poll_poll` behaviour works the same way as
on Windows (using level-triggered APIs to mimic edge-triggered ones) and
depends on intercepting an `EAGAIN` result to start polling the fd again.
We switch to using a `NonZeroU64` for the `id` field for `OwnedTasks`
and `LocalOwnedTasks` lists. This allows the task header to contain an
`Option<NonZeroU64>` instead of a `u64` with a special meaning for 0.
The size in memory will be the same thanks to Rust's niche optimization,
but this solution is clearer in its intent.
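The niche optimization being relied on here is easy to check in
isolation:

```rust
use std::num::NonZeroU64;

fn main() {
    // 0 is the forbidden value of `NonZeroU64`, so Rust reuses it to
    // encode `None`: the `Option` costs no extra space.
    assert_eq!(
        std::mem::size_of::<Option<NonZeroU64>>(),
        std::mem::size_of::<u64>()
    );
}
```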
Co-authored-by: Alice Ryhl <aliceryhl@google.com>
- Pass `--no-deps` to `cargo-clippy`
- Use `dtolnay/rust-toolchain@stable` instead of
  `dtolnay/rust-toolchain@master`
- Use `dtolnay/rust-toolchain` instead of `rustup` directly
- Use `cargo-nextest` in job test to speed up testing
- Use `cargo-nextest` in job test-unstable to speed up testing
- Use `cargo-nextest` in job test-unstable-taskdump to speed up testing
- Use `cargo-nextest` in job no-atomic-u64 to speed up testing
- Use `cargo-nextest` in job check-unstable-mt-counters
- Run `cargo check --benches` for `benches/` in job test,
  since the benchmarks are not run
- Run `cargo-check` instead of `cargo-build` in job test-parking_lot,
  since no tests are run
- Run `cargo-check` instead of `cargo-build` in job no-atomic-u64
- Run `Swatinem/rust-cache@v2` after `taiki-e/install-action@v2` to
avoid caching pre-built artifacts downloaded by it.
- Use `Swatinem/rust-cache@v2` in job no-atomic-u64
- Add concurrency group to cancel outdated CI runs
- Use `taiki-e/setup-cross-toolchain-action@v1` in job cross-test
instead of `cross`, so that we can use `cargo-nextest` to run tests in
parallel.
Also use `Swatinem/rust-cache@v2` to cache artifacts.
- Use `Swatinem/rust-cache@v2` in job cross-check to speed up CI.
- Fix job `cross-test`: use `armv5te-unknown-linux-gnueabi` for
  no-atomic-u64 testing instead of `arm-unknown-linux-gnueabihf`,
  which actually has atomic-u64 support
- Remove use of `cross` in job `cross-check`.
  Since it does not run any tests, it does not need the `cross-rs`
  toolchain, as tokio does not use any external C/C++ libraries that
  require `gcc`/`clang` to compile.
- Add more recognizable names for steps in job cross-test
- Split job `test` into `test-{tokio-full, workspace-all-features,
integration-tests-per-feature}`
- Split job `no-atomic-u64` into `no-atomic-u64-{test, check}`
- Parallelize job `features` by using matrix
- Split `cross-test` into `cross-test-{with, without}-parking_lot`
- Speed up jobs `cross-test-*` and `no-atomic-u64-test` by running
  `cargo-test` with `-- --test-threads 1`, since `qemu` userspace
  emulation has problems running binaries with many threads.
- Speed up workflow `stress-test.yml` and job `valgrind` in workflow
  `ci.yml` by passing `--fair-sched=yes` to `valgrind`.
- Speed up job `test-hyper`: cache `./hyper/target`
  instead of `./target`, which is non-existent.
- Set `RUST_TEST_THREADS=1` to make sure `libtest` only uses one thread
  so that qemu will be happy with the tests.
  This is applied to `cross-test-with(out)-parking_lot` and
  `no-atomic-u64-test`.
- Apply suggestions from code review
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
This patch fixes a bug where nesting `block_in_place` with a `block_on`
between could lead to a panic. This happened because the nested
`block_in_place` would try to acquire a core on return when it should
not attempt to do so. The `block_on` between the two nested
`block_in_place` altered the thread-local state to lead to the incorrect
behavior.
The fix is for each call to `block_in_place` to track if it needs to try
to steal a core back.
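A minimal sketch of the shape of the problematic nesting (simplified;
not the exact reproduction from the issue):

```rust
use tokio::runtime::Handle;
use tokio::task::block_in_place;

#[tokio::main(flavor = "multi_thread")]
async fn main() {
    block_in_place(|| {
        // The `block_on` in between alters the thread-local state...
        Handle::current().block_on(async {
            // ...so the inner `block_in_place` must not try to steal
            // a core back when it returns.
            block_in_place(|| {});
        });
    });
}
```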
Fixes #5239
This patch removes the custom slab in favor of regular allocations and
`Arc`. Originally, the slab was used to be able to pass indexes as
tokens to the I/O driver when registering I/O resources. However, this
has the downside of having a more expensive token lookup path. It also
pins a `ScheduledIo` to a specific I/O driver. Additionally, the slab is
approaching custom allocator territory.
We plan to explore migrating I/O resources between I/O drivers. As a
step towards that, we need to decouple `ScheduledIo` from the I/O
driver. To do this, the patch uses plain-old allocation to allocate the
`ScheduledIo` and we use the pointer as the token. To use the token, we
need to be very careful about releasing the `ScheduledIo`. We need to
make sure that the associated I/O handle is deregistered from the I/O
driver **and** there are no outstanding polls. The strategy in this PR is
to let the
I/O driver do the final release between polls, but I expect this
strategy to evolve over time.
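To illustrate the pointer-as-token idea, here is a simplified sketch
(not tokio's actual code; the `ScheduledIo` body is elided):

```rust
use std::sync::Arc;

struct ScheduledIo { /* readiness state, wakers, ... */ }

fn main() {
    let io = Arc::new(ScheduledIo {});

    // Registration: hand the driver the allocation's address as its token.
    let token = Arc::into_raw(io) as usize;

    // Wakeup path: recover the allocation from the token. This is only
    // sound while the driver still holds the reference count handed out
    // above, which is why the final release must happen in the driver,
    // between polls.
    let io = unsafe { Arc::from_raw(token as *const ScheduledIo) };
    drop(io);
}
```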
The `drain_filter` method on the internal `LinkedList` type passes a
`&mut` reference to the node type. However, the `LinkedList` is intended
to be used with nodes that are shared in other ways. For example
`task::Header` is accessible concurrently from multiple threads.
Currently, the only usage of `drain_filter` is in a case where `&mut`
access is safe, so this change helps prevent future bugs and tightens
up the safety of internal utilities.