This commit fixes several RustDoc links which were either broken
(without references) or had unresolvable references.
It looks like nightly RustDoc has recently gotten much smarter about
finding links that were unresolvable. These had always been broken, but
they used to silently 404. Now, the nightly build of RustDoc will emit a
warning, which we deny, causing the build to fail. This should fix CI
(as well as actually fixing the wrong links).
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
### Changed
- **filter**: `EnvFilter` directive selection now behaves correctly
(i.e. like `env_logger`) (#583)
### Fixed
- **filter**: Fixed `EnvFilter` incorrectly allowing less-specific
filter directives to enable events that are disabled by more-specific
filters (#583)
- **filter**: Multiple significant `EnvFilter` performance
improvements, especially when filtering events generated by `log`
records (#578, #583)
- **filter**: Replaced `BTreeMap` with `Vec` in `DirectiveSet`,
improving iteration performance significantly with typical numbers of
filter directives (#580)
A big thank-you to @samschlegel for lots of help with `EnvFilter`
performance tuning in this release!
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
## Motivation
The benchmarks added in #581 indicate that `tracing-subscriber`'s
`EnvFilter` has significant overhead when filtering events generated
from `log` records that is not present when filtering native `tracing`
events.
Additionally, the behavior of `EnvFilter` differs from the `env_logger`
crate it emulates: when a less specific filter is more verbose than a
more specific filter, the more verbose filter is used. For example:
```
RUST_LOG=debug,tokio_postgres=info,hyper::server::response=info
```
* With `env_logger` - treat all targets as DEBUG *except* for
`tokio_postgres` and `hyper::server::response`
* With `EnvFilter::from_default_env()` - treat all targets as DEBUG.
This then overrides the explicit setting of INFO for the other two,
so they are DEBUG as well.
(see #512)
## Solution
This branch makes some performance optimizations to filtering, to
improve performance with `log` events significantly. In particular, it
makes the following changes:
* Code was previously added to record the maximum level enabled by
the static and dynamic directive sets. This would allow a fast path
for skipping events that no directive will ever enable. However, this
currently is never actually checked, so we always have to check an
event against every directive. I've fixed that.
* If there are no dynamic directives that would require looking at the
dynamic span state (a TLS value), we now avoid looking at TLS.
* Finally, I've fixed#512, by changing `enabled` to select the _most_
specific static filter, rather than by checking _every_ filter until
it finds something that enables the span. In addition to making the
behaviour correct, this also improves performance: it means we only
iterate over the statics until we find something that cares about an
event, and then return. In the previous implementation (of incorrect
behavior), we would break the iteration if the event was enabled, but
never break iteration if it was disabled. Now, we can return early.
This should improve performance significantly with a large number of
filters.
## Benchmark Results
Performance in the `filter_log` benchmark is significantly improved in
most cases, while performance in the `filter` benchmark (for filtering
`tracing` events) is about the same as master, but significantly better
than `filter_log` across the board. This is expected, since the
`tracing` events in `filter` benchmark can (in most cases) participate
in the callsite cache, while the `log` events cannot. Therefore, the
`filter` benchmarks with static filters are measuring the overhead of a
single filter hit + multiple callsite cache loads, while the
`filter_log` benchmarks are always measuring an actual filter hit.
<details>
<summary>Benchmark results (vs master):</summary>
```console
eliza@ares:~/tracing$ cargo bench -p tracing-subscriber --bench filter --bench filter_log
Compiling tracing-subscriber v0.2.0 (/home/eliza/tracing/tracing-subscriber)
Finished bench [optimized] target(s) in 9.86s
Running target/release/deps/filter-b804f035b8022c0d
static/baseline_single_threaded
time: [109.11 ns 109.21 ns 109.32 ns]
change: [-6.3289% -4.9555% -3.6799%] (p = 0.00 < 0.05)
Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
2 (2.00%) high mild
9 (9.00%) high severe
static/single_threaded time: [62.942 ns 63.023 ns 63.150 ns]
change: [-0.1370% +0.0191% +0.2011%] (p = 0.83 > 0.05)
No change in performance detected.
Found 15 outliers among 100 measurements (15.00%)
1 (1.00%) low mild
6 (6.00%) high mild
8 (8.00%) high severe
static/enabled_one time: [28.745 ns 28.764 ns 28.789 ns]
change: [-0.2144% -0.0862% +0.0283%] (p = 0.18 > 0.05)
No change in performance detected.
Found 13 outliers among 100 measurements (13.00%)
1 (1.00%) low mild
4 (4.00%) high mild
8 (8.00%) high severe
static/enabled_many time: [28.765 ns 28.827 ns 28.912 ns]
change: [-0.0864% +0.2092% +0.5574%] (p = 0.25 > 0.05)
No change in performance detected.
Found 16 outliers among 100 measurements (16.00%)
5 (5.00%) high mild
11 (11.00%) high severe
static/disabled_level_one
time: [4.1104 ns 4.1134 ns 4.1170 ns]
change: [+0.0402% +0.1317% +0.2300%] (p = 0.01 < 0.05)
Change within noise threshold.
Found 13 outliers among 100 measurements (13.00%)
5 (5.00%) high mild
8 (8.00%) high severe
static/disabled_level_many
time: [3.6496 ns 3.6530 ns 3.6573 ns]
change: [-0.2017% -0.0164% +0.1271%] (p = 0.86 > 0.05)
No change in performance detected.
Found 15 outliers among 100 measurements (15.00%)
1 (1.00%) low mild
1 (1.00%) high mild
13 (13.00%) high severe
static/disabled_one time: [3.6496 ns 3.6517 ns 3.6543 ns]
change: [-0.1254% +0.1413% +0.5468%] (p = 0.52 > 0.05)
No change in performance detected.
Found 14 outliers among 100 measurements (14.00%)
1 (1.00%) low mild
1 (1.00%) high mild
12 (12.00%) high severe
static/disabled_many time: [3.6501 ns 3.6518 ns 3.6538 ns]
change: [-0.0351% +0.0331% +0.1054%] (p = 0.35 > 0.05)
No change in performance detected.
Found 15 outliers among 100 measurements (15.00%)
2 (2.00%) low mild
2 (2.00%) high mild
11 (11.00%) high severe
static/baseline_multithreaded
time: [7.6316 us 7.8421 us 8.0892 us]
change: [-1.1385% +2.9646% +7.3036%] (p = 0.18 > 0.05)
No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
5 (5.00%) high mild
2 (2.00%) high severe
static/multithreaded time: [7.5579 us 7.7286 us 7.9126 us]
change: [+0.2480% +3.6935% +7.1400%] (p = 0.04 < 0.05)
Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
dynamic/baseline_single_threaded
time: [245.73 ns 246.12 ns 246.67 ns]
change: [-2.6694% -2.3670% -2.0569%] (p = 0.00 < 0.05)
Performance has improved.
Found 20 outliers among 100 measurements (20.00%)
7 (7.00%) low mild
2 (2.00%) high mild
11 (11.00%) high severe
dynamic/single_threaded time: [1.1532 us 1.1550 us 1.1568 us]
change: [+0.2317% +0.4117% +0.5707%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
dynamic/baseline_multithreaded
time: [7.6121 us 7.7931 us 7.9882 us]
change: [-4.3847% +0.2323% +4.8431%] (p = 0.92 > 0.05)
No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) low mild
1 (1.00%) high mild
1 (1.00%) high severe
dynamic/multithreaded time: [8.0848 us 8.2926 us 8.5161 us]
change: [-2.0820% +1.4262% +4.9321%] (p = 0.42 > 0.05)
No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
1 (1.00%) low mild
4 (4.00%) high mild
mixed/disabled time: [67.544 ns 67.623 ns 67.716 ns]
change: [+4.0220% +4.3608% +4.6472%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
1 (1.00%) low mild
1 (1.00%) high mild
10 (10.00%) high severe
mixed/disabled_by_level time: [26.704 ns 26.743 ns 26.785 ns]
change: [-51.833% -51.737% -51.632%] (p = 0.00 < 0.05)
Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
1 (1.00%) low mild
2 (2.00%) high mild
1 (1.00%) high severe
Running target/release/deps/filter_log-b8cedd42cccfa727
log/static/baseline_single_threaded
time: [459.87 ns 460.05 ns 460.28 ns]
change: [-4.4236% -4.3735% -4.3170%] (p = 0.00 < 0.05)
Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
1 (1.00%) low severe
4 (4.00%) low mild
1 (1.00%) high mild
7 (7.00%) high severe
log/static/single_threaded
time: [469.18 ns 470.12 ns 471.08 ns]
change: [-12.406% -12.037% -11.759%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
log/static/enabled_one time: [154.79 ns 155.08 ns 155.54 ns]
change: [-10.717% -10.585% -10.436%] (p = 0.00 < 0.05)
Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) low mild
4 (4.00%) high mild
6 (6.00%) high severe
log/static/enabled_many time: [229.55 ns 229.69 ns 229.86 ns]
change: [-11.433% -11.076% -10.828%] (p = 0.00 < 0.05)
Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
log/static/disabled_level_one
time: [57.489 ns 57.518 ns 57.557 ns]
change: [-20.363% -20.247% -20.155%] (p = 0.00 < 0.05)
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) high mild
5 (5.00%) high severe
log/static/disabled_level_many
time: [57.486 ns 57.515 ns 57.554 ns]
change: [-48.696% -48.652% -48.610%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) high mild
5 (5.00%) high severe
log/static/disabled_one time: [68.872 ns 68.902 ns 68.941 ns]
change: [-4.0195% -3.8865% -3.7487%] (p = 0.00 < 0.05)
Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
2 (2.00%) low mild
8 (8.00%) high severe
log/static/disabled_many
time: [103.10 ns 103.38 ns 103.72 ns]
change: [-5.8728% -5.6128% -5.3436%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
3 (3.00%) high mild
3 (3.00%) high severe
log/static/baseline_multithreaded
time: [7.5054 us 7.7234 us 7.9713 us]
change: [-4.8650% -1.5881% +2.0839%] (p = 0.37 > 0.05)
No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
log/static/multithreaded
time: [7.6285 us 7.7574 us 7.8967 us]
change: [-4.5998% -2.0115% +0.5929%] (p = 0.14 > 0.05)
No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
log/dynamic/baseline_single_threaded
time: [603.16 ns 603.55 ns 603.96 ns]
change: [-6.9548% -6.7858% -6.6651%] (p = 0.00 < 0.05)
Performance has improved.
Found 20 outliers among 100 measurements (20.00%)
3 (3.00%) high mild
17 (17.00%) high severe
log/dynamic/single_threaded
time: [1.4624 us 1.4645 us 1.4669 us]
change: [+4.2866% +4.4193% +4.5763%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
1 (1.00%) high mild
3 (3.00%) high severe
log/dynamic/baseline_multithreaded
time: [7.6292 us 7.8527 us 8.1103 us]
change: [-6.4268% -2.6258% +0.8495%] (p = 0.17 > 0.05)
No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
1 (1.00%) low mild
8 (8.00%) high mild
log/dynamic/multithreaded
time: [8.0253 us 8.2673 us 8.5647 us]
change: [-3.4243% +0.5838% +5.0749%] (p = 0.79 > 0.05)
No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
2 (2.00%) low mild
1 (1.00%) high mild
3 (3.00%) high severe
log/mixed/disabled time: [94.775 ns 94.825 ns 94.884 ns]
change: [-0.0623% +0.0665% +0.1997%] (p = 0.33 > 0.05)
No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
2 (2.00%) low mild
1 (1.00%) high mild
8 (8.00%) high severe
log/mixed/disabled_by_level
time: [59.763 ns 59.809 ns 59.873 ns]
change: [-26.225% -26.054% -25.852%] (p = 0.00 < 0.05)
Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
1 (1.00%) low mild
2 (2.00%) high mild
9 (9.00%) high severe
```
</details>
<details>
<summary>Benchmark environment:</summary>
```console
eliza@ares:~/tracing$ uname -a
Linux ares 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u2 (2019-11-11) x86_64 GNU/Linux
eliza@ares:~/tracing$ rustc --version
rustc 1.41.0 (5e1a79984 2020-01-27)
eliza@ares:~/tracing$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 40
On-line CPU(s) list: 0-39
Thread(s) per core: 2
Core(s) per socket: 10
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
Stepping: 1
CPU MHz: 1879.077
CPU max MHz: 2200.0000
CPU min MHz: 1200.0000
BogoMIPS: 4390.06
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 25600K
NUMA node0 CPU(s): 0-9,20-29
NUMA node1 CPU(s): 10-19,30-39
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb invpcid_single kaiser tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdseed adx smap intel_pt xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm arat pln pts
eliza@ares:~/tracing$
```
</details>
Fixes#512
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Quick glance at `perf` shows a lot of time being spent in
`btree_set::Iter::next()`. Since we only ever iterate over it after it's
built, we don't really need to pay this cost, so this switches to just
building a Vec.
Co-authored-by: Eliza Weisman <eliza@buoyant.io>
## Motivation
Currently, there are a few cases that aren't adequately covered by the
`tracing-subscriber` filtering benchmarks. In particular, we don't have
benchmarks which involve filters with multiple directives, missing a
large amount of potential overhead from iterating over the directive
set, and we don't have benchmarks for events generated by the `log`
integration. Since `log` events can't participate in the callsite
caching mechanism, their performance picture is radically different.
## Solution
This branch adds new benchmarks for filters with multiple static
directives, filters with multiple dynamic directives, and filters with
mixed static and dynamic directives. Additionally, I've added a new set
of benchmarks which are identical to the current ones, except that
`log`'s macros are used to emit events via `tracing-log`, rather than
using `tracing`'s macros directly.
## Current Results
<details>
<summary>Benchmark environment</summary>
```console
eliza@ares:~$ uname -a
Linux ares 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u2 (2019-11-11) x86_64 GNU/Linux
eliza@ares:~$ rustc --version
rustc 1.41.0 (5e1a79984 2020-01-27)
eliza@ares:~$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 40
On-line CPU(s) list: 0-39
Thread(s) per core: 2
Core(s) per socket: 10
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
Stepping: 1
CPU MHz: 2200.000
CPU max MHz: 2200.0000
CPU min MHz: 1200.0000
BogoMIPS: 4390.06
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 25600K
NUMA node0 CPU(s): 0-9,20-29
NUMA node1 CPU(s): 10-19,30-39
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb invpcid_single kaiser tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdseed adx smap intel_pt xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm arat pln pts
```
</details>
<details>
<summary>Benchmark results </summary>
```
Finished bench [optimized] target(s) in 50.95s
Running target/release/deps/filter-b804f035b8022c0d
static/baseline_single_threaded
time: [109.09 ns 109.37 ns 109.89 ns]
Found 14 outliers among 100 measurements (14.00%)
4 (4.00%) high mild
10 (10.00%) high severe
static/single_threaded time: [63.893 ns 63.951 ns 64.022 ns]
Found 14 outliers among 100 measurements (14.00%)
1 (1.00%) low mild
2 (2.00%) high mild
11 (11.00%) high severe
static/enabled_one time: [28.293 ns 28.321 ns 28.355 ns]
Found 9 outliers among 100 measurements (9.00%)
3 (3.00%) high mild
6 (6.00%) high severe
static/enabled_many time: [28.278 ns 28.292 ns 28.311 ns]
Found 15 outliers among 100 measurements (15.00%)
2 (2.00%) low mild
5 (5.00%) high mild
8 (8.00%) high severe
static/disabled_level_one
time: [3.6514 ns 3.6545 ns 3.6580 ns]
Found 15 outliers among 100 measurements (15.00%)
5 (5.00%) high mild
10 (10.00%) high severe
static/disabled_level_many
time: [4.1064 ns 4.1118 ns 4.1189 ns]
Found 13 outliers among 100 measurements (13.00%)
2 (2.00%) low mild
1 (1.00%) high mild
10 (10.00%) high severe
static/disabled_one time: [4.1063 ns 4.1307 ns 4.1673 ns]
Found 12 outliers among 100 measurements (12.00%)
1 (1.00%) low mild
11 (11.00%) high severe
static/disabled_many time: [4.1091 ns 4.1136 ns 4.1190 ns]
Found 13 outliers among 100 measurements (13.00%)
3 (3.00%) high mild
10 (10.00%) high severe
static/baseline_multithreaded
time: [7.2856 us 7.4135 us 7.5423 us]
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
static/multithreaded time: [7.5089 us 7.6710 us 7.8466 us]
Found 4 outliers among 100 measurements (4.00%)
1 (1.00%) low mild
2 (2.00%) high mild
1 (1.00%) high severe
dynamic/baseline_single_threaded
time: [247.58 ns 247.78 ns 247.98 ns]
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
dynamic/single_threaded time: [1.2122 us 1.2131 us 1.2141 us]
Found 7 outliers among 100 measurements (7.00%)
3 (3.00%) high mild
4 (4.00%) high severe
dynamic/baseline_multithreaded
time: [7.3576 us 7.5858 us 7.8602 us]
Found 5 outliers among 100 measurements (5.00%)
1 (1.00%) low mild
2 (2.00%) high mild
2 (2.00%) high severe
dynamic/multithreaded time: [8.2427 us 8.4257 us 8.6156 us]
Found 5 outliers among 100 measurements (5.00%)
2 (2.00%) high mild
3 (3.00%) high severe
mixed/disabled time: [80.464 ns 80.534 ns 80.611 ns]
Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) low mild
3 (3.00%) high mild
2 (2.00%) high severe
mixed/disabled_by_level time: [68.077 ns 68.113 ns 68.155 ns]
Found 5 outliers among 100 measurements (5.00%)
2 (2.00%) low severe
2 (2.00%) high mild
1 (1.00%) high severe
Running target/release/deps/filter_log-b8cedd42cccfa727
log/static/baseline_single_threaded
time: [477.71 ns 478.03 ns 478.40 ns]
Found 9 outliers among 100 measurements (9.00%)
7 (7.00%) high mild
2 (2.00%) high severe
log/static/single_threaded
time: [553.40 ns 554.14 ns 555.17 ns]
Found 6 outliers among 100 measurements (6.00%)
5 (5.00%) high mild
1 (1.00%) high severe
log/static/enabled_one time: [172.29 ns 172.47 ns 172.68 ns]
Found 21 outliers among 100 measurements (21.00%)
14 (14.00%) high mild
7 (7.00%) high severe
log/static/enabled_many time: [172.49 ns 172.61 ns 172.75 ns]
Found 10 outliers among 100 measurements (10.00%)
4 (4.00%) high mild
6 (6.00%) high severe
log/static/disabled_level_one
time: [78.422 ns 78.529 ns 78.651 ns]
Found 17 outliers among 100 measurements (17.00%)
2 (2.00%) high mild
15 (15.00%) high severe
log/static/disabled_level_many
time: [128.72 ns 128.83 ns 128.98 ns]
Found 8 outliers among 100 measurements (8.00%)
2 (2.00%) low mild
6 (6.00%) high severe
log/static/disabled_one time: [77.385 ns 77.579 ns 77.848 ns]
Found 10 outliers among 100 measurements (10.00%)
1 (1.00%) low mild
1 (1.00%) high mild
8 (8.00%) high severe
log/static/disabled_many
time: [123.67 ns 123.84 ns 124.03 ns]
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
log/static/baseline_multithreaded
time: [7.4644 us 7.6093 us 7.7570 us]
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
log/static/multithreaded
time: [7.5466 us 7.7412 us 7.9768 us]
Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) low mild
2 (2.00%) high mild
log/dynamic/baseline_single_threaded
time: [615.96 ns 616.53 ns 617.29 ns]
Found 8 outliers among 100 measurements (8.00%)
3 (3.00%) high mild
5 (5.00%) high severe
log/dynamic/single_threaded
time: [1.4436 us 1.4457 us 1.4483 us]
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
log/dynamic/baseline_multithreaded
time: [7.3015 us 7.5285 us 7.8139 us]
Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) low severe
3 (3.00%) high mild
2 (2.00%) high severe
log/dynamic/multithreaded
time: [7.9638 us 8.2182 us 8.5184 us]
Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) low mild
2 (2.00%) high mild
3 (3.00%) high severe
log/mixed/disabled time: [104.93 ns 105.01 ns 105.12 ns]
Found 10 outliers among 100 measurements (10.00%)
3 (3.00%) low mild
7 (7.00%) high severe
log/mixed/disabled_by_level
time: [91.754 ns 92.322 ns 93.531 ns]
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) low mild
2 (2.00%) high severe
```
</details>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
## Motivation
Currently, `tracing-subscriber`'s `EnvFilter` type uses a user-space
thread local for storing any entered spans which are relevant to
filters. This was intended to support allowing multiple `EnvFilter`s to
coexist in the same subscriber, which was never implemented. It turns
out this has a noticeable performance impact versus a proper
thread-local.
## Solution
I replaced the user-space thread local type with a proper thread-local.
## Benchmark Results
(vs master, note that there was some noise in the benchmark environment,
as evidenced by the "regression" in the baseline tests)
```
Running target/release/deps/filter-56ce39a1aef6b2a0
static/baseline_single_threaded
time: [98.622 ns 101.54 ns 104.78 ns]
change: [+2.5820% +6.1667% +9.7976%] (p = 0.00 < 0.05)
Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
6 (6.00%) high mild
1 (1.00%) high severe
static/single_threaded time: [50.699 ns 51.776 ns 52.896 ns]
change: [-2.8933% -1.4141% +0.1262%] (p = 0.07 > 0.05)
No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
5 (5.00%) high mild
2 (2.00%) high severe
static/baseline_multithreaded
time: [9.5372 us 9.7739 us 10.025 us]
change: [-22.208% -16.587% -10.758%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
3 (3.00%) high mild
3 (3.00%) high severe
static/multithreaded time: [9.3605 us 9.6779 us 10.031 us]
change: [-31.493% -26.773% -21.761%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
dynamic/baseline_single_threaded
time: [165.50 ns 166.57 ns 167.68 ns]
change: [-10.102% -8.6136% -7.1789%] (p = 0.00 < 0.05)
Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild
dynamic/single_threaded time: [539.01 ns 542.44 ns 546.11 ns]
change: [-41.078% -40.104% -39.195%] (p = 0.00 < 0.05)
Performance has improved.
dynamic/baseline_multithreaded
time: [9.3314 us 9.7205 us 10.222 us]
change: [-45.458% -41.873% -37.948%] (p = 0.00 < 0.05)
Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
4 (4.00%) high mild
5 (5.00%) high severe
dynamic/multithreaded time: [10.095 us 10.295 us 10.520 us]
change: [-66.323% -63.666% -61.130%] (p = 0.00 < 0.05)
Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) low mild
7 (7.00%) high mild
3 (3.00%) high severe
```
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
## Motivation
Intra-workspace dependencies can go out of sync if only referencing
crates.io, where changing one crate locally and testing the whole
library will result in errors stemming from two different versions of
the same trait etc. This is hard to keep track of and makes development
more difficult than it needs to be.
## Solution
Afaik it's common practice to specify both a version and a path as per
[the cargo docs][1]. This is what tokio does with its subcrates too.
[1]: https://github.com/rust-lang/cargo/blob/master/src/doc/src/reference/specifying-dependencies.md#multiple-locations.
# 0.2.0 (February 4, 2020)
### Breaking Changes
- **fmt**: Renamed `Context` to `FmtContext` (#420, #425)
- **fmt**: Renamed `Builder` to `SubscriberBuilder` (#420)
- **filter**: Removed `Filter`. Use `EnvFilter` instead (#434)
### Added
- **registry**: `Registry`, a `Subscriber` implementation that `Layer`s
can use as a high-performance, in-memory span store. (#420, #425,
#432, #433, #435)
- **registry**: Added `LookupSpan` trait, implemented by `Subscriber`s
to expose stored span data to `Layer`s (#420)
- **fmt**: Added `fmt::Layer`, to allow composing log formatting with
other `Layer`s
- **fmt**: Added support for JSON field and event formatting (#377,
#415)
- **filter**: Documentation for filtering directives (#554)
### Changed
- **fmt**: Renamed `Context` to `FmtContext` (#420, #425) (BREAKING)
- **fmt**: Renamed `Builder` to `SubscriberBuilder` (#420) (BREAKING)
- **fmt**: Reimplemented `fmt::Subscriber` in terms of the `Registry`
and `Layer`s (#420)
### Removed
- **filter**: Removed `Filter`. Use `EnvFilter` instead (#434)
(BREAKING)
### Fixed
- **fmt**: Fixed memory leaks in the slab used to store per-span data
(3c35048)
- **fmt**: `fmt::SubscriberBuilder::init` not setting up `log`
compatibility (#489)
- **fmt**: Spans closed by a child span closing not also closing
_their_ parents (#514)
- **Layer**: Fixed `Layered` subscribers failing to downcast to their
own type (#549)
- **Layer**: Fixed `Layer::downcast_ref` returning invalid references
(#454)
Fixes#515Fixes#458
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Depends on #564
This commit adds a new section to the `Layer` docs on how `Layer`s and
`Subscriber`s are composed. Additionally, it updates the docs to prefer
the use of `SubscriberExt::with` over `Layer::with_subscriber`.
Additionally, I've fixed how `SubscriberExt::with` is implemented, so
that `Layer`s may use `with_subscriber` as a callback for composition.
Closes #452Closes#505 (IMO)
## Motivation
The `LookupMetadata` trait was essentially a prototype for `LookupSpan`,
and was intended to allow subscribers to implement more granular sets of
"registry" behavior. However, in practice, nothing ended up using it and
it was generally eclipsed by `LookupSpan`. The model of allowing
`Layer` implementations to opt in to more granular capabilities doesn't
make a whole lot of sense when there are only two such traits available
and one is a more powerful superset of the other. Before we release
`tracing-subscriber` 0.2.0, we should try to remove any APIs that aren't
necessary, since removing them later is a breaking change.
## Solution
Therefore, this commit removes `LookupMetadata`, and rewrites the
`Context::metadata` method to use `LookupSpan` to look up the metadata,
instead.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
## Motivation
Currently, both the `Registry` type _and_ the `LookupSpan` and
`SpanData` traits require the "registry" feature flag. This means that
if a user wishes to implement their own custom registry type that
implements `LookupSpan`, they must enable the "registry" feature flag,
which also enables the `Registry` type and its dependencies.
## Solution
This commit changes the feature flag so that only the `Registry` and
`Data` types are conditional on the "registry" feature. The `LookupSpan`
and `SpanData` traits are now always enabled, as they are not much code
& don't require any additional internal dependencies (although if anyone
thinks it's important for them to have their own separate feature flag,
I am willing to change that).
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
This commit replaces uses of `Formatter::pad` with
`Formatter::write_char` wherever a single character is written to a
formatter. This prevents us from inadvertantly padding these characters
incorrectly, and may avoid some overhead.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Previously, when `tracing-subscriber`'s `fmt` module printed span
contexts, curly braces were only added around a span's fields. When we
replaced the previous implementation with the new `fmt::Layer`, this
behavior was lost, and empty curly braces were added to spans without
fields.
This commit fixes this by putting back the `if` conditional that used to
guard the printing of span fields.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Currently, the `tracing-subscriber` crate's `fmt` module has two
separate `fmt::Display` impls for its' `FmtCtx` and `FullCtx` types,
depending on whether the feature flag that enables ANSI colors is set.
Since a lot of the `fmt::Display` implementations doesn't have anything
to do with ANSI colors, this means that we have a lot of repeated code,
and in order to keep the formatting consistent, we have to keep these
implementations in sync manually. There are currently some
inconsistencies in formatting with ANSI colors on and off, implying that
we have failed to do that.
This commit simplifies the situation significantly by consolidating the
`fmt::Display` impls into one implementation, and only feature flagging
the code that actually needs to be different. We do this by introducing
a `Style` type which exposes no-op versions of methods with the same
names as the `ansi-term` crate's `Style` when ANSI formatting is
disabled. Now, the conditional logic is hidden in the function that
returns a `Style`, and the rest of the `fmt::Display` implementations
can be agnostic of ANSI colors. In release mode, `rustc` should be able
to optimize out the no-op methods and empty struct entirely.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Currently, when ANSI color formatting is enabled, `tracing-subscriber`'s
`fmt` module left-pads the INFO and WARN levels so that log lines are
aligned. However, when ANSI colors are off, these levels are not padded.
I've fixed this by replacing the string literals in the ANSI and
not-ANSI implementations with constants, to ensure the same string is
used regardless.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
### Added:
- **env_filter**: Documentation for filtering directives (#554)
- **registry**, **env_filter**: Updated `smallvec` dependency to 0.1
(#543)
### Fixed:
- **registry**: Fixed a memory leak in the slab used to store per-span
data (3c35048)
- **Layer**: Fixed `Layered` subscribers failing to downcast to their
own type (#549)
- **fmt**: Fixed a panic when multiple layers insert `FormattedFields`
extensions from the same formatter type (1c3bb70)
- **fmt**: Fixed `fmt::Layer::on_record` inserting a new
`FormattedFields` when formatted fields for a span already exist
(1c3bb70)
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Version 0.0.7 of `sharded-slab` contains a bug where, when the `remove`
method is called with the index of a slot that is not being accessed
concurrently, the slot is emptied but **not** placed on the free list.
This issue meant that, under `tracing-subscriber`'s usage pattern, where
slab entries are almost always uncontended when reused, allocated slab
pages are almost never reused, resulting in unbounded slab growth over
time (i.e. a memory leak).
This commit updates `tracing-subscriber`'s `sharded-slab` dependency to
v0.0.8, which includes commit hawkw/sharded-slab@dfdd7ae. That commit
fixes this bug.
I've empirically verified that, after running `linkerd2-proxy` under
load with a global `trace` filter that enables a *lot* of spans, heap
usage remains stable, and the characteristic stair-step heap growth
pattern of doubling slab allocations doesn't occur. This indicates that
freed slots are actually being reused, and (once fully warmed up), the
slab will only grow when =the number of active spans in the system
increases.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
This PR is extracted from #368 and introduces additional documentation for
the filter directives in tracing-subscriber.
Signed-off-by: David Barsky <me@davidbarsky.com>
Co-authored-by: Lucio Franco <luciofranco14@gmail.com>
Co-authored-by: Eliza Weisman <eliza@buoyant.io>
This fixes two bugs. One where the formatted fields were incorrectly
parameterized over Self instead of the formatter that generated the
formatted fields. The second bug is to check if another layer has
already inserted formatted fields and to avoid inserting if so, which
would cause a panic.
Currently, a `Layered` struct's implementations of `Subscriber` and
`Layer` will successfully downcast to either the the `Layer` type or the
inner type, but *not* to `Layered<Layer, Inner>`. This means that when a
`Layer` tries to downcast the wrapped subscriber to a known type, and it
is a `Layered` (so, any time more than one layer wraps a subscriber and
a layer other than the first one tries to downcast to the inner type it
wraps), the downcast will fail incorrectly.
This commit fixes the issue by checking if the downcast target `TypeId`
equals the `Layered` type's `TypeId`, _before_ trying to downcast to the
layer itself or to the inner type.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
# Motivation
The smallvec 0.6 dependency is causing some duplicate dependencies in
quinn. smallvec 1.0 has been released in November, it would be nice not
to have to build it twice.
## Solution
Upgrade to the 1.0 version.
These were added while debugging issue #511. They were never intended to
be committed, but apparently I missed a few while removing them, and
they slipped into PR #514 by mistake.
Once this merges, I'll publish `tracing-subscriber` 0.2.0-alpha.4. We may
want to consider yanking alpha.3.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
* subscriber: prepare to release 0.2.0-alpha.3
### Added
- **fmt**: Public `FormattedFields::new` constructor (#478)
- **fmt**: Added examples to `fmt::Layer` documentation (#510)
- Documentation now shows what feature flags are required by each API item (#525)
### Fixed
- **fmt**: Missing space between timestamp and level (#480)
- **fmt**: Incorrect formatting with `with_target(false)` (#481)
- **fmt**: `fmt::SubscriberBuilder::init` not setting up `log` compatibility
(#489)
- **registry**: Spans exited out of order not being closed properly on exit
(#509)
- **registry**: Memory leak when spans are closed by a child span closing (#514)
- **registry**: Spans closed by a child span closing not also closing _their_
parents (#514)
- Compilation errors with `no-default-features` (#499, #500)
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
* Get the last bits of tracing-core
* remove unnecessary doc_cfgs
* add doc_cfg for std feature in tracing-futures
* Finish enabling doc_cfg for tracing-futures
* fix tracing-futures doc_cf and start subscriber
* finish doc_cfg for tracing-subscriber
Authored-by: Jane Lusby <jlusby42@gmail.com>
This branch updates the `tracing-subscriber` tests for
span closure and removal from the registry. The test layer
used to make assertions about close orders is made more
flexible, so we can write more tests in the future, and the
tests are now a bit simpler.
This branch fixes a couple of issues which might result in child spans
not correctly triggering their parents to close:
* subscriber: fix spans closed by their children not being removed
The issue here is that the `CloseGuard` removes the span from the
registry (dropping the `DataInner`) *before* it decrements CLOSE_COUNT.
Dropping the `DataInner` will decrement the parent span's ref count,
potentially allowing it to close. When the new close for the parent span
starts, CLOSE_COUNT will be 1, not 0, and it will not be decremented
until after the try_close call completes. This means that although the
close is processed and all the layers see `on_close` calls for that
span, the span's data is never removed from the registry, effectively
leaking it.
This commit fixes the problem by decrementing CLOSE_COUNT *before* we
trigger the parent to close.
Fixes#511
* subscriber: fix child spans not closing grandparents
Currently, in situations where multiple parent spans are kept open by a
child, only the first parent will be successfully closed. This is
because calling `dispatcher::get_default` unsets the default dispatcher
temporarily, so that nested `get_default` calls will receive a no-op
subscriber. This was intended to prevent infinite dispatch loops when a
subscriber calls into code that emits its own traces. Unfortunately,
this means that if dropping the parent span would then trigger an
additional span to drop, the dispatcher is now unset.
This commit fixes this by cloning the dispatcher out of the
`get_default` closure _before_ trying to close the parent span. This
means that the `get_default` calls are no longer nested. This does have
the disadvantage of introducing an additional `Arc` increment/decrement
to the span-closing path. The impact of that shouldn't be too big, but
if necessary we can try to optimize this in the future.
I've also added tests for these issues.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
* Add examples for layers
* Reorganize and shrink examples
* Apply suggestions from code review
Co-Authored-By: Eliza Weisman <eliza@buoyant.io>
* Clean up imports
* remove module specific filtering
* add an explanation
Co-authored-by: Eliza Weisman <eliza@buoyant.io>
Closes#430
This is a follow up PR for #477 that runs clippy on CI.
* Switch from hecrj/setup-rust-action to actions-rs/toolchain
* Fix remaining clippy lints
* Switch to minimal rustup profile
* Replace warning step with clippy
* Remove #![feature(async_await)]
* subscriber: traverse SpanStack on pop
In cases where spans are not popped in FILO (first-in-last-out) order,
`pop` would return `None`. This would cause issues in `Registry` where
spans that are exited out of order would not be closed properly.
Fixes#495
* PR comments
* Import super deps explicitly
* Use ClosingLayer in test
* s/pub(crate) mod tests/mod tests
* subscriber: compilation errors w/ nodefault + ansi
when compiling with -no-default-features --features "ansi", the
following compilation error would come up:
```
error[E0599]: no method named `with_env_filter` found for type `fmt::SubscriberBuilder` in the current scope
--> tracing-subscriber/src/fmt/mod.rs:663:10
|
146 | / pub struct SubscriberBuilder<
147 | | N = format::DefaultFields,
148 | | E = format::Format<format::Full>,
149 | | F = LevelFilter,
... |
153 | | inner: LayerBuilder<Registry, N, E, W>,
154 | | }
| |_- method `with_env_filter` not found for this
...
663 | .with_env_filter(crate::EnvFilter::from_default_env())
| ^^^^^^^^^^^^^^^ method not found in `fmt::SubscriberBuilder`
```
* subscriber: compilation errors w/ nodefault + fmt
when compiling with -no-default-features --features "fmt", the following
compilation error would come up:
```
error[E0107]: wrong number of type arguments: expected 2, found 1
--> tracing-subscriber/src/fmt/format/mod.rs:603:30
|
603 | impl<'a, N> fmt::Display for FullCtx<'a, N>
| ^^^^^^^^^^^^^^ expected 2 type arguments
```
* subscriber: compilation errors w/ nodefault + env-filter + fmt
when compiling with -no-default-features --features "env-filter fmt", the
following compilation error would come up:
```
error[E0277]: the trait bound `for<'lookup> <S as registry::LookupSpan<'_>>::Data: registry::LookupSpan<'lookup>` is not satisfied
--> tracing-subscriber/src/fmt/format/mod.rs:304:44
|
279 | ) -> fmt::Result {
| - help: consider further restricting the associated type: `where for<'lookup> <S as registry::LookupSpan<'_>>::Data: registry::LookupSpan<'lookup>`
...
304 | write!(writer, "{} {}", fmt_level, fmt_ctx)?;
| ^^^^^^^ the trait `for<'lookup> registry::LookupSpan<'lookup>` is not implemented for `<S as registry::LookupSpan<'_>>::Data`
|
= note: required because of the requirements on the impl of `std::fmt::Display` for `fmt::format::FmtCtx<'_, S, N>`
= note: required by `std::fmt::Display::fmt`
error[E0593]: closure is expected to take 1 argument, but it takes 2 arguments
--> tracing-subscriber/src/fmt/format/mod.rs:530:18
|
530 | self.ctx.visit_spans(|_, span| {
| ^^^^^^^^^^^ --------- takes 2 arguments
| |
| expected closure that takes 1 argument
```
Closes#494
This fixes the following compilation error:
```
error[E0433]: failed to resolve: use of undeclared type or module `registry`
--> tracing-subscriber/src/layer.rs:844:25
|
844 | S: for<'lookup> registry::LookupSpan<'lookup>,
| ^^^^^^^^ use of undeclared type or module `registry`
error[E0412]: cannot find type `Scope` in this scope
--> tracing-subscriber/src/layer.rs:842:28
|
842 | pub fn scope(&self) -> Scope<'_, S>
| ^^^^^ not found in this scope
error[E0425]: cannot find function, tuple struct or tuple variant `Scope` in this scope
--> tracing-subscriber/src/layer.rs:850:9
|
850 | Scope(scope)
| ^^^^^ help: a local variable with a similar name exists (notice the capitalization): `scope`
```
This PR fixes all reported clippy lints. In most cases I have made the
suggested changes. In a few cases (e.g., `blacklisted_name` and
`cognitive_complexity`) I've just silenced the warning.
I can make stylistic changes or discard some of the lint fixes if
preferred.
This branch updates all the README links and badges. In particular, it:
* Changes build status badges to GitHub Actions, since we've turned off
the Azure Pipelines CI build,
* Removes Gitter links, since nobody is on Gitter these days and we
probably don't want to point new users to an empty chat room,
* Make Discord links "actually work"
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Added:
- `LookupSpans` implementation for `Layered` (#448)
- `SpanRef::from_root` to iterate over a span's parents from the root
(#460)
- `Context::scope`, to iterate over the current context from the root
(#460)
- `Context::lookup_current`, which returns a `SpanRef` to the current
span's data (#460)
Changed:
- Lifetimes on some new `Context` methods to be less restrictive (#460)
Fixed:
- `Layer::downcast_ref` returning invalid references (#454)
- Compilation failure on 32-bit platforms (#462)
- Compilation failure with ANSI formatters (#438)
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
## Motivation
PR #443 added a `Context::scope` function, as described in issue #427.
In the issue, the desired change was described as follows:
> - The method `FmtContext::visit_spans` should be moved to
> `layer::Context`.
> - The new `layer::Context::visit_spans` should be renamed to
> something along the lines of `parents()`, **and be made to return
> a newtype implementing `std::iter::Iterator`.**
(emphasis mine)
Unfortunately, #443 only implemented the first bullet point — the new
`layer::Context::scope` method was just a copy of the old
`visit_spans`. The branch was merged before a thorough review could
determine that the change made there was incomplete.
## Solution
This branch changes the `Context::scope` function to return an
iterator, as it was supposed to, and rewrites `visit_spans` to consume
that iterator.
Additionally, I've added some new toys:
- `SpanRef::from_root`, which returns an iterator over a span's
parents, starting from the root (rather than from the direct parent
like `SpanRef::parents`)
- `Context::lookup_current`, which returns a `SpanRef` to the current
span's data
- Easier to use lifetimes on a couple of the new `Context` methods
enabled by the "registry" feature (not a breaking change, as these
aren't yet released)
Co-authored-by: David Barsky <me@davidbarsky.com>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Currently tracing-subscriber doesn't compile on 32-bit platforms as
discussed here: https://github.com/hawkw/sharded-slab/issues/9
The underlying issue has been fixed by @hawkw. This pull simply updates
the dependency inside this project. Thanks for your work on this!