75 Commits

Author SHA1 Message Date
Taiki Endo
68b4ca9f55
ci: pin compiler version in miri tests (#2614) 2020-06-12 18:37:06 +09:00
xliiv
f2f30d4cf6
docs: replace method links with intra-links (#2540) 2020-05-30 20:18:01 +02:00
Lucio Franco
66fef4a9bc
Remove tokio-tls from master (#2497) 2020-05-06 17:30:01 -04:00
Carl Lerche
a78b1c65cc
rt: cleanup and simplify scheduler (scheduler v2.5) (#2273)
A refactor of the scheduler internals focusing on simplifying and
reducing unsafety. There are no fundamental logic changes.

* The state transitions of the core task component are refined and
reduced.
* `basic_scheduler` has most unsafety removed.
* `local_set` has most unsafety removed.
* `threaded_scheduler` limits most unsafety to its queue implementation.
2020-03-05 10:31:37 -08:00
Carl Lerche
71c47fabf4
chore: bump nightly version used in CI (#2178)
This requires fixing a few warnings.
2020-01-26 21:54:14 -08:00
Carl Lerche
38bff0adda
macros: fix #[tokio::main] without rt-core (#2139)
The Tokio runtime provides a "shell" runtime when `rt-core` is not
available. This shell runtime is enough to support `#[tokio::main]` and
`#[tokio::test]`.

A previous change disabled these two attr macros when `rt-core` was not
selected. This patch fixes this by re-enabling the `main` and `test`
attr macros without `rt-core` and adds some integration tests to prevent
future regressions.
2020-01-21 10:46:32 -08:00
Carl Lerche
e1b1e216c5
ci: bring back build tests (#1813)
This directory was deleted when `cargo hack` was introduced, however
there were some tests that were still useful (macro failure output).

Also, additional build tests will be added over time.
2019-11-22 14:38:49 -08:00
Taiki Endo
7cd63fb946 ci: use -Z avoid-dev-deps in features check instead of --no-dev-deps (#1812) 2019-11-22 14:13:18 -08:00
Carl Lerche
bf741fec35
ci: generate docs (#1810)
Check docs as part of CI. This should catch link errors.
2019-11-22 11:55:57 -08:00
Taiki Endo
66cbed3ce3 tls: enable test on CI (#1779) 2019-11-16 22:24:58 -08:00
Taiki Endo
c15e01a09b
chore: remove rust-toolchain and add minimum supported version check (#1748)
* remove rust-toolchain

* add minimum supported version check
2019-11-08 13:26:08 +09:00
Taiki Endo
64f2bf0072
chore: update CI config to test on stable (#1747) 2019-11-08 00:32:04 +09:00
Taiki Endo
02f7264008 chore: check each feature works properly (#1695)
Maintaining the feature list manually is hard, so use cargo-hack's
`--each-feature` flag. cargo-hack also provides a workaround (the
`--no-dev-deps` flag) for dev-dependencies leaking into normal builds,
so our own CI tool is removed.

Also, compared to running tests on all features, there is not much
advantage in running tests on each feature, so only the default features
and all features are tested.
If the behavior changes depending on the feature, we need to test it as
another job in CI.
2019-10-31 21:09:32 -07:00
Carl Lerche
2b909d6805
sync: move into tokio crate (#1705)
A step towards collapsing Tokio sub crates into a single `tokio`
crate (#1318).

The sync implementation is now provided by the main `tokio` crate.
Functionality can be opted out of by using the various net related
feature flags.
2019-10-29 15:11:31 -07:00
Carl Lerche
c62ef2d232
executor: move into tokio crate (#1702)
A step towards collapsing Tokio sub crates into a single `tokio`
crate (#1318).

The executor implementation is now provided by the main `tokio` crate.
Functionality can be opted out of by using the various net related
feature flags.
2019-10-28 21:40:29 -07:00
Eliza Weisman
7eb264a0d0
net: replace RwLock<Slab> with a lock free slab (#1625)
## Motivation

The `tokio_net::driver` module currently stores the state associated
with scheduled IO resources in a `Slab` implementation from the `slab`
crate. Because inserting items into and removing items from `slab::Slab`
requires mutable access, the slab must be placed within a `RwLock`. This
has the potential to be a performance bottleneck especially in the context of
the work-stealing scheduler where tasks and the reactor are often located on
the same thread.

`tokio-net` currently reimplements the `ShardedRwLock` type from
`crossbeam` on top of `parking_lot`'s `RwLock` in an attempt to squeeze
as much performance as possible out of the read-write lock around the
slab. This introduces several dependencies that are not used elsewhere.

## Solution

This branch replaces the `RwLock<Slab>` with a lock-free sharded slab
implementation. 

The sharded slab is based on the concept of _free list sharding_
described by Leijen, Zorn, and de Moura in [_Mimalloc: Free List
Sharding in Action_][mimalloc], which describes the implementation of a
concurrent memory allocator. In this approach, the slab is sharded so
that each thread has its own thread-local list of slab _pages_. Objects
are always inserted into the local slab of the thread where the
insertion is performed. Therefore, the insert operation need not be
synchronized.

However, since objects can be _removed_ from the slab by threads other
than the one on which they were inserted, removal operations can still
occur concurrently. Therefore, Leijen et al. introduce a concept of
_local_ and _global_ free lists. When an object is removed on the same
thread it was originally inserted on, it is placed on the local free
list; if it is removed on another thread, it goes on the global free
list for the heap of the thread from which it originated. To find a free
slot to insert into, the local free list is used first; if it is empty,
the entire global free list is popped onto the local free list. Since
the local free list is only ever accessed by the thread it belongs to,
it does not require synchronization at all, and because the global free
list is popped from infrequently, the cost of synchronization has a
reduced impact. A majority of insertions can occur without any
synchronization at all; and removals only require synchronization when
an object has left its parent thread.
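
The local/global free list split described above can be sketched roughly as follows. This is a minimal illustration and not the vendored implementation: it tracks only slot indices rather than stored values, and the names `Shard`, `Remote`, `free_local`, and `free_remote` are hypothetical. The owning thread keeps a plain `Vec` as its local free list, while other threads push onto an atomic linked list that the owner takes in one swap when its local list runs dry.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Sentinel marking the end of the remote free list (hypothetical name).
const NIL: usize = usize::MAX;

/// State any thread may touch: the "global" free list for one shard.
pub struct Remote {
    head: AtomicUsize,
    next: Vec<AtomicUsize>,
}

impl Remote {
    /// Called by a thread that does NOT own the shard to return `slot`.
    pub fn free_remote(&self, slot: usize) {
        let mut head = self.head.load(Ordering::Relaxed);
        loop {
            // Link the slot in front of the current head, then try to publish it.
            self.next[slot].store(head, Ordering::Relaxed);
            match self.head.compare_exchange_weak(
                head,
                slot,
                Ordering::Release,
                Ordering::Relaxed,
            ) {
                Ok(_) => return,
                Err(actual) => head = actual,
            }
        }
    }
}

/// State only the owning thread touches, so it needs no synchronization.
pub struct Shard {
    local_free: Vec<usize>,
    remote: Arc<Remote>,
}

impl Shard {
    pub fn new(capacity: usize) -> Self {
        Shard {
            // Reversed so that slot 0 is handed out first.
            local_free: (0..capacity).rev().collect(),
            remote: Arc::new(Remote {
                head: AtomicUsize::new(NIL),
                next: (0..capacity).map(|_| AtomicUsize::new(NIL)).collect(),
            }),
        }
    }

    /// Handle that other threads use to return slots.
    pub fn remote(&self) -> Arc<Remote> {
        Arc::clone(&self.remote)
    }

    /// Take a free slot: local list first, then the whole remote list at once.
    pub fn alloc(&mut self) -> Option<usize> {
        if self.local_free.is_empty() {
            let mut cur = self.remote.head.swap(NIL, Ordering::Acquire);
            while cur != NIL {
                self.local_free.push(cur);
                cur = self.remote.next[cur].load(Ordering::Relaxed);
            }
        }
        self.local_free.pop()
    }

    /// Return a slot on the owning thread; no atomics involved.
    pub fn free_local(&mut self, slot: usize) {
        self.local_free.push(slot);
    }
}
```

Because the owner always removes the entire remote list with a single `swap`, no element is ever popped individually from the shared list, which is what keeps the remote path simple in this sketch.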

The sharded slab was initially implemented in a separate crate (soon to
be released) and is vendored in-tree to decrease `tokio-net`'s dependencies.
Some code from the original implementation was removed or simplified,
since it is only necessary to support `tokio-net`'s use case, rather
than to provide a fully generic implementation.

[mimalloc]: https://www.microsoft.com/en-us/research/uploads/prod/2019/06/mimalloc-tr-v1.pdf

## Performance

These graphs were produced by out-of-tree `criterion` benchmarks of the
sharded slab implementation.


The first shows the results of a benchmark where an increasing number of
items are inserted into and then removed from a slab concurrently by five
threads. It compares the performance of the sharded slab implementation
with a `RwLock<slab::Slab>`:

<img width="1124" alt="Screen Shot 2019-10-01 at 5 09 49 PM" src="https://user-images.githubusercontent.com/2796466/66078398-cd6c9f80-e516-11e9-9923-0ed6292e8498.png">

The second graph shows the results of a benchmark where an increasing
number of items are inserted and then removed by a _single_ thread. It
compares the performance of the sharded slab implementation with an
`RwLock<slab::Slab>` and a `mut slab::Slab`.

<img width="925" alt="Screen Shot 2019-10-01 at 5 13 45 PM" src="https://user-images.githubusercontent.com/2796466/66078469-f0974f00-e516-11e9-95b5-f65f0aa7e494.png">

Note that while the `mut slab::Slab` (i.e. no read-write lock) is
(unsurprisingly) faster than the sharded slab in the single-threaded
benchmark, the sharded slab outperforms the un-contended
`RwLock<slab::Slab>`. This case, where the lock is uncontended and only
accessed from a single thread, represents the best case for the current
use of `slab` in `tokio-net`, since the lock cannot be conditionally
removed in the single-threaded case.

These benchmarks demonstrate that, while the sharded approach introduces
a small constant-factor overhead, it offers significantly better
performance across concurrent accesses.

## Notes

This branch removes the following dependencies from `tokio-net`:
- `parking_lot`
- `num_cpus`
- `crossbeam_util`
- `slab`

This branch adds the following dev-dependencies:
- `proptest`
- `loom`

Note that these dev dependencies were used to implement tests for the
sharded-slab crate out-of-tree, and were necessary in order to vendor
the existing tests. Alternatively, since the implementation is tested
externally, we _could_ remove these tests in order to avoid picking up
dev-dependencies. However, this means that we should try to ensure that
`tokio-net`'s vendored implementation doesn't diverge significantly from
upstream's, since it would be missing a majority of its tests.

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2019-10-28 11:30:45 -07:00
Carl Lerche
987ba7373c
io: move into tokio crate (#1691)
A step towards collapsing Tokio sub crates into a single `tokio`
crate (#1318).

The `io` implementation is now provided by the main `tokio` crate.
Functionality can be opted out of by using the various net related
feature flags.
2019-10-26 08:02:49 -07:00
Carl Lerche
227533d456
net: move into tokio crate (#1683)
A step towards collapsing Tokio sub crates into a single `tokio`
crate (#1318).

The `net` implementation is now provided by the main `tokio` crate.
Functionality can be opted out of by using the various net related
feature flags.
2019-10-25 12:50:15 -07:00
Carl Lerche
cfc15617a5
codec: move into tokio-util (#1675)
Related to #1318, Tokio APIs that are "less stable" are moved into a new
`tokio-util` crate. This crate will mirror `tokio` and provide
additional APIs that may require a greater rate of breaking changes.

As examples require `tokio-util`, they are moved into a separate
crate (`examples`). This has the added advantage of being able to avoid
example only dependencies in the `tokio` crate.
2019-10-22 10:13:49 -07:00
Carl Lerche
b8cee1a60a
timer: move tokio-timer into tokio crate (#1674)
A step towards collapsing Tokio sub crates into a single `tokio`
crate (#1318).

The `timer` implementation is now provided by the main `tokio` crate.
The `timer` functionality may still be excluded from the build by
skipping the `timer` feature flag.
2019-10-21 16:45:13 -07:00
Carl Lerche
978013a215
fs: move into tokio (#1672)
A step towards collapsing Tokio sub crates into a single `tokio`
crate (#1318).

The `fs` implementation is now provided by the main `tokio` crate. The
`fs` functionality may still be excluded from the build by skipping the
`fs` feature flag.
2019-10-21 15:49:00 -07:00
Carl Lerche
ed5a94eb2d
executor: rewrite the work-stealing thread pool (#1657)
This patch is a ground up rewrite of the existing work-stealing thread
pool. The goal is to reduce overhead while simplifying code when
possible.

At a high level, the following architectural changes were made:

- The local run queues were switched to bounded circular buffer queues.
- Reduce cross-thread synchronization.
- Refactor task constructs to use a single allocation and always include
  a join handle (#887).
- Simplify logic around putting workers to sleep and waking them up.

**Local run queues**

Move away from crossbeam's implementation of the Chase-Lev deque. This
implementation included unnecessary overhead as it supported
capabilities that are not needed for the work-stealing thread pool.
Instead, a fixed-size circular buffer is used for the local queue. When
the local queue is full, half of the tasks contained in it are moved to
the global run queue.
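
The overflow behavior can be sketched as below. This is a single-threaded model under stated assumptions, not the pool's actual queue: the real local queue must also support concurrent stealing, the global queue here is a plain `VecDeque` passed by reference, and moving the *oldest* half on overflow is a choice made for this illustration. `LocalQueue` is a hypothetical name.

```rust
use std::collections::VecDeque;

/// Fixed-capacity ring buffer standing in for a worker's local run queue.
pub struct LocalQueue<T> {
    slots: Vec<Option<T>>,
    head: usize,
    len: usize,
}

impl<T> LocalQueue<T> {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity >= 2, "capacity must allow moving half the tasks");
        LocalQueue {
            slots: (0..capacity).map(|_| None).collect(),
            head: 0,
            len: 0,
        }
    }

    /// Push a task; if the buffer is full, first move half of the queued
    /// tasks (the oldest ones, in this sketch) to the global queue.
    pub fn push(&mut self, task: T, global: &mut VecDeque<T>) {
        if self.len == self.slots.len() {
            for _ in 0..self.slots.len() / 2 {
                if let Some(t) = self.pop() {
                    global.push_back(t);
                }
            }
        }
        let tail = (self.head + self.len) % self.slots.len();
        self.slots[tail] = Some(task);
        self.len += 1;
    }

    /// Pop the oldest task from the local queue.
    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        let task = self.slots[self.head].take();
        self.head = (self.head + 1) % self.slots.len();
        self.len -= 1;
        task
    }

    pub fn len(&self) -> usize {
        self.len
    }
}
```

With a capacity of 4, pushing a fifth task moves two tasks to the global queue and leaves three in the local buffer.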

**Reduce cross-thread synchronization**

This is done via many small improvements. Primarily, an upper bound is
placed on the number of concurrent stealers. Limiting the number of
stealers results in lower contention. Secondly, the rate at which
workers are notified and woken up is throttled. This also reduces
contention by preventing many threads from racing to steal work.
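
One way to cap the number of concurrent stealers is an atomic counter that a worker must increment before it starts searching for work. The sketch below assumes this approach; `StealLimiter` and `StealPermit` are hypothetical names, and the patch itself may track stealers differently.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Caps how many workers may be stealing at the same time.
pub struct StealLimiter {
    active: AtomicUsize,
    max: usize,
}

/// Held while a worker is stealing; releases its slot when dropped.
pub struct StealPermit<'a> {
    limiter: &'a StealLimiter,
}

impl StealLimiter {
    pub fn new(max: usize) -> Self {
        StealLimiter {
            active: AtomicUsize::new(0),
            max,
        }
    }

    /// Returns a permit if fewer than `max` workers are currently stealing.
    pub fn try_acquire(&self) -> Option<StealPermit<'_>> {
        let mut cur = self.active.load(Ordering::Relaxed);
        loop {
            if cur >= self.max {
                return None; // limit reached: do not join the race
            }
            match self.active.compare_exchange_weak(
                cur,
                cur + 1,
                Ordering::AcqRel,
                Ordering::Relaxed,
            ) {
                Ok(_) => return Some(StealPermit { limiter: self }),
                Err(actual) => cur = actual,
            }
        }
    }

    /// Number of workers currently holding a permit.
    pub fn active(&self) -> usize {
        self.active.load(Ordering::Acquire)
    }
}

impl Drop for StealPermit<'_> {
    fn drop(&mut self) {
        self.limiter.active.fetch_sub(1, Ordering::AcqRel);
    }
}
```

A worker that fails to get a permit can simply go back to sleep instead of contending on other workers' queues.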

**Refactor task structure**

Now that Tokio is able to target a Rust version that supports
`std::alloc` as well as `std::task`, the pool is able to optimize how
the task structure is laid out. Now, a single allocation per task is
required and a join handle is always provided enabling the spawner to
retrieve the result of the task (#887).

**Simplifying logic**

When possible, complexity is reduced in the implementation. This is done
by using locks and other simpler constructs in cold paths. The set of
sleeping workers is now represented as a `Mutex<VecDeque<usize>>`.
Instead of optimizing access to this structure, we reduce the amount the
pool must access this structure.
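
As a rough illustration of the `Mutex<VecDeque<usize>>` representation, the sketch below stores worker indices in a locked queue. The wrapper type `Sleepers`, its method names, and the FIFO wake order are assumptions made for this example only.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Set of sleeping workers, identified by index.
pub struct Sleepers {
    sleeping: Mutex<VecDeque<usize>>,
}

impl Sleepers {
    pub fn new() -> Self {
        Sleepers {
            sleeping: Mutex::new(VecDeque::new()),
        }
    }

    /// Record that `worker` is about to go to sleep.
    pub fn park(&self, worker: usize) {
        self.sleeping.lock().unwrap().push_back(worker);
    }

    /// Pick one sleeping worker to wake, if any (oldest sleeper first here).
    pub fn unpark_one(&self) -> Option<usize> {
        self.sleeping.lock().unwrap().pop_front()
    }
}
```

The lock is acceptable because this is a cold path: workers only reach it when they run out of work or when new work needs a sleeper woken.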

Secondly, we have (temporarily) removed `threadpool::blocking`. This
capability will come back later, but the original implementation was way
more complicated than necessary.

**Results**

The thread pool benchmarks have improved significantly:

Old thread pool:

```
test chained_spawn ... bench:   2,019,796 ns/iter (+/- 302,168)
test ping_pong     ... bench:   1,279,948 ns/iter (+/- 154,365)
test spawn_many    ... bench:  10,283,608 ns/iter (+/- 1,284,275)
test yield_many    ... bench:  21,450,748 ns/iter (+/- 1,201,337)
```

New thread pool:

```
test chained_spawn ... bench:     147,943 ns/iter (+/- 6,673)
test ping_pong     ... bench:     537,744 ns/iter (+/- 20,928)
test spawn_many    ... bench:   7,454,898 ns/iter (+/- 283,449)
test yield_many    ... bench:  16,771,113 ns/iter (+/- 733,424)
```
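
To put the two result sets side by side, the ratios can be computed directly from the figures above. The helper below is only a convenience sketch (`BENCHES`, `speedup`, and `report` are hypothetical names); by this arithmetic the improvements come to roughly 13.7x for `chained_spawn`, 2.4x for `ping_pong`, 1.4x for `spawn_many`, and 1.3x for `yield_many`.

```rust
/// (name, old ns/iter, new ns/iter), copied from the benchmark output above.
pub const BENCHES: [(&str, u64, u64); 4] = [
    ("chained_spawn", 2_019_796, 147_943),
    ("ping_pong", 1_279_948, 537_744),
    ("spawn_many", 10_283_608, 7_454_898),
    ("yield_many", 21_450_748, 16_771_113),
];

/// How many times faster the new measurement is than the old one.
pub fn speedup(old_ns: u64, new_ns: u64) -> f64 {
    old_ns as f64 / new_ns as f64
}

/// One formatted line per benchmark, e.g. "<name>: <ratio>x".
pub fn report() -> Vec<String> {
    BENCHES
        .iter()
        .map(|&(name, old, new)| format!("{}: {:.2}x", name, speedup(old, new)))
        .collect()
}
```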

Real-world benchmarks improve significantly as well. This is testing the hyper hello
world server using: `wrk -t1 -c50 -d10`:

Old scheduler:

```
Running 10s test @ http://127.0.0.1:3000
  1 threads and 50 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   371.53us   99.05us   1.97ms   60.53%
    Req/Sec   114.61k     8.45k  133.85k    67.00%
  1139307 requests in 10.00s, 95.61MB read
Requests/sec: 113923.19
Transfer/sec:      9.56MB
```

New scheduler:

```
Running 10s test @ http://127.0.0.1:3000
  1 threads and 50 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   275.05us   69.81us   1.09ms   73.57%
    Req/Sec   153.17k    10.68k  171.51k    71.00%
  1522671 requests in 10.00s, 127.79MB read
Requests/sec: 152258.70
Transfer/sec:     12.78MB
```
2019-10-19 11:09:40 -07:00
Ivan Petkov
741bef8fe1
tokio: move signal and process reexports to crate root (#1643) 2019-10-11 11:00:39 -07:00
Taiki Endo
55caddb9ce chore: do not trigger CI on std-future branch (#1635) 2019-10-07 09:17:27 -07:00
Jon Gjengset
7c341f45e0 chore: move CI to beta (#1615) 2019-09-27 09:51:45 -07:00
Taiki Endo
3a55aba251
macros: add build tests for #[tokio::main] and #[tokio::test] (#1591) 2019-09-23 04:09:30 +09:00
Carl Lerche
815173f8e5
chore: rm tokio-buf (#1574)
The crate has not been updated and it does not seem like it is a good
path forward.
2019-09-19 12:11:21 -07:00
Taiki Endo
efb27731ad timer: use our own AtomicU64 on targets with target_has_atomic less than 64 (#1538) 2019-09-13 10:18:32 -07:00
Taiki Endo
a791f4a758 chore: bump to newer nightly (#1485) 2019-08-20 20:07:16 -07:00
Ivan Petkov
357df38861
process: move into the tokio-net crate (#1475) 2019-08-19 19:42:54 -07:00
Ivan Petkov
68d5fcb8d1 docs: fix all rustdoc warnings (#1474) 2019-08-18 14:38:54 -07:00
Carl Lerche
c187cd75b6 signal: move into tokio-net (#1463) 2019-08-17 13:43:55 -07:00
Carl Lerche
a83f5e4ba6
uds: move into tokio-net (#1462) 2019-08-16 14:42:05 -07:00
Carl Lerche
ba1829fd26
chore: rename ui-tests -> build-tests (#1460) 2019-08-16 09:26:56 -07:00
Carl Lerche
ce7e60e396
udp: move tokio-udp into tokio-net (#1459) 2019-08-16 07:26:10 -07:00
Carl Lerche
4788d3a9e3
tcp: move tokio-tcp into tokio-net (#1456) 2019-08-15 20:37:25 -07:00
Carl Lerche
3b27dc31d2
threadpool: move threadpool into tokio-executor (#1452)
The threadpool is behind a feature flag.

Refs: #1264
2019-08-15 13:09:02 -07:00
Carl Lerche
8538c25170
reactor: rename tokio-reactor -> tokio-net (#1450)
* reactor: rename tokio-reactor -> tokio-net

This is in preparation for #1264
2019-08-15 11:04:58 -07:00
Carl Lerche
9de7083be8
executor: move current-thread into crate (#1447)
The `CurrentThread` executor is exposed using a feature flag.

Refs: #1264
2019-08-15 09:52:25 -07:00
Taiki Endo
d9f9c5658f
chore: bump to newer nightly (#1426) 2019-08-11 02:01:20 +09:00
Taiki Endo
73102760cf
chore: change default lint level to warning and deny warnings in CI (#1416) 2019-08-10 00:07:57 +09:00
Carl Lerche
962521f449
chore: enable full CI run (#1399)
* update all tests
* fix doc examples
* misc API tweaks
2019-08-07 20:02:13 -07:00
Carl Lerche
2c01b3e0e0
io: remove util from default features (#1379)
Sub-crates should require opting into features.
2019-08-02 12:59:24 -07:00
andy finch
fbf90e6356 Update process to use std::future (#1343) 2019-07-29 18:36:11 -07:00
Taiki Endo
fe021e6c00
ci: enable clippy lints (#1335) 2019-07-26 03:47:14 +09:00
Taiki Endo
e88d10a3cb
chore: bump to newer nightly (#1338) 2019-07-22 06:04:02 +09:00
Jon Gjengset
e6cf976662 tokio: include async-traits feature (#1314)
The `tokio` facade crate will depend on the `async-traits` feature flag in
sub crates.
2019-07-15 14:02:14 -07:00
andy finch
795e02f4c6 fs: update to use std::future (#1269) 2019-07-11 09:05:49 -07:00
Carl Lerche
7ac8bfc821
chore: bump to newer nightly (#1284) 2019-07-10 14:36:36 -07:00
Yin Guanhao
80915906d8 uds: update to std-future (#1227) 2019-07-08 14:58:40 -07:00