This additional benchmark exercises a common request/reply pattern: requests are sent over an MPSC channel, and each request carries a oneshot sender as its reply mechanism. In a current-thread scenario, the bench is 17 times faster on my machine than on the multi-threaded runtime with one worker thread. Not only that, but increasing the number of worker threads to 6 degrades performance further.
Does this suggest a scheduling problem with the multi-threaded runtime?
Either way, I hope the benchmarks are a useful addition.
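
For readers unfamiliar with the pattern, here is a rough sketch of its shape. It is not the benchmark code: to keep it dependency-free it swaps in `std` threads and `std::sync::mpsc`, with a capacity-1 `sync_channel` standing in for the oneshot, whereas the bench uses async channels on the runtime. The names `Request`, `reply_to`, and `run_request_reply` are made up for illustration.

```rust
use std::sync::mpsc;
use std::thread;

/// A request carries its payload plus a dedicated channel for the single reply.
/// (Hypothetical type, for illustration only.)
struct Request {
    value: u64,
    reply_to: mpsc::SyncSender<u64>,
}

/// Sends `n` requests to a server thread and returns the sum of the replies.
fn run_request_reply(n: u64) -> u64 {
    // Shared request channel: many producers, one consumer.
    let (req_tx, req_rx) = mpsc::channel::<Request>();

    // Server: answers each request with `value * 2` until all senders are dropped.
    let server = thread::spawn(move || {
        for req in req_rx {
            // Ignore the error if the requester has already gone away.
            let _ = req.reply_to.send(req.value * 2);
        }
    });

    let mut sum = 0;
    for value in 0..n {
        // Capacity-1 channel used exactly once, standing in for a oneshot.
        let (reply_tx, reply_rx) = mpsc::sync_channel(1);
        req_tx
            .send(Request { value, reply_to: reply_tx })
            .expect("server is running");
        // Wait for the reply before issuing the next request.
        sum += reply_rx.recv().expect("server replied");
    }

    drop(req_tx); // closes the request channel so the server loop ends
    server.join().expect("server thread panicked");
    sum
}

fn main() {
    println!("sum of replies: {}", run_request_reply(4));
}
```

Each round trip here involves a send, a wake-up of the server, a reply, and a wake-up of the requester, so the pattern is dominated by hand-off latency rather than throughput. That may be relevant to the slowdown described above, though this sketch does not reproduce it.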