Document memory orderings of `thread::{park, unpark}`
Document `thread::park/unpark` as having acquire/release synchronization. Without that guarantee, even the example in the documentation can deadlock:
```rust
let flag = Arc::new(AtomicBool::new(false));
let t2 = thread::spawn(move || {
while !flag.load(Ordering::Acquire) {
thread::park();
}
});
flag.store(true, Ordering::Release);
t2.thread().unpark();
// t1: flag.store(true)
// t1: thread.unpark()
// t2: flag.load() == false
// t2 now parks, is immediately unblocked but never
// acquires the flag, and thus spins forever
```
Multiple calls to `unpark` should also maintain a release sequence to make sure operations released by previous `unpark`s are not lost:
```rust
let a = Arc::new(AtomicBool::new(false));
let b = Arc::new(AtomicBool::new(false));
let t2 = thread::spawn(move || {
while !a.load(Ordering::Acquire) || !b.load(Ordering::Acquire) {
thread::park();
}
});
thread::spawn(move || {
a.store(true, Ordering::Release);
t2.thread().unpark();
});
b.store(true, Ordering::Release);
t2.thread().unpark();
// t1: a.store(true)
// t1: t2.unpark()
// t3: b.store(true)
// t3: t2.unpark()
// t2 now parks, is immediately unblocked but never
// acquires the store of `a`, only the store of `b` which
// was released by the most recent unpark, and thus spins forever
```
This is of course a contrived example, but is reasonable to rely upon in real code.
Note that all implementations of park/unpark already comply with the rules, it's just undocumented.
Don't claim `LocalKey::with` prevents a reference to be sent across threads
The documentation for `LocalKey` claims that `with` yields a reference that cannot be sent across threads, but this is false since you can easily do that with scoped threads. What it actually prevents is the reference from outliving the current thread.
Restructure and rename std thread_local internals to make it less of a maze
Every time I try to work on std's thread local internals, it feels like I'm trying to navigate a confusing maze made of macros, deeply nested modules, and types with multiple names/aliases. Time to clean it up a bit.
This PR:
- Exports `Key` with its own name (`Key`), instead of `__LocalKeyInner`
- Uses `pub macro` to put `__thread_local_inner` into a (unstable, hidden) module, removing `#[macro_export]`, removing it from the crate root.
- Removes the `__` from `__thread_local_inner`.
- Removes a few unnecessary `allow_internal_unstable` features from the macros
- Removes the `libstd_thread_internals` feature. (Merged with `thread_local_internals`.)
- And removes it from the unstable book
- Gets rid of the deeply nested modules for the `Key` definitions (`mod fast` / `mod os` / `mod statik`).
- Turns a `#[cfg]` mess into a single `cfg_if`, now that there's no `#[macro_export]` anymore that breaks with `cfg_if`.
- Simplifies the `cfg_if` conditions to not repeat the conditions.
- Removes useless `normalize-stderr-test`, which were left over from when the `Key` types had different names on different platforms.
- Removes a seemingly unnecessary `realstd` re-export on `cfg(test)`.
This PR changes nothing about the thread local implementation. That's for a later PR. (Which should hopefully be easier once all this stuff is a bit cleaned up.)
For larger applications it's important that users set `RUST_MIN_STACK`
at the start of their program because `min_stack` caches the value.
Not doing so can lead to their `env::set_var` call surprisingly not having any effect.
This allows removing all the platform-dependent code from `library/std/src/thread/local.rs` and `library/std/src/thread/mod.rs`
Signed-off-by: Ayush Singh <ayushsingh1325@gmail.com>
Move __thread_local_inner macro in crate:🧵:local to crate::sys.
Currently, the tidy check does not fail for `library/std/src/thread/local.rs` even though it contains platform specific code. This is beacause target_family did not exist at the time the tidy checks were written [1].
[1]: https://github.com/rust-lang/rust/pull/105861#discussion_r1125841678
Signed-off-by: Ayush Singh <ayushsingh1325@gmail.com>
std tests: use __OsLocalKeyInner from realstd
This is basically the same as https://github.com/rust-lang/rust/pull/100201, but for __OsLocalKeyInner:
Some std tests are failing in Miri on Windows because [this static](a377893da2/library/std/src/sys/windows/thread_local_key.rs (L234-L239)) is getting duplicated, and Miri does not handle that properly -- Miri does not support this magic `.CRT$XLB` linker section, but instead just looks up this particular hard-coded static in the standard library. This PR lets the test suite use the std static instead of having its own copy.
Fixes https://github.com/rust-lang/miri/issues/2754
r? `@thomcc`
Unify id-based thread parking implementations
Multiple platforms currently use thread-id-based parking implementations (NetBSD and SGX[^1]). Even though the strategy does not differ, these are duplicated for each platform, as the id is encoded into an atomic thread variable in different ways for each platform.
Since `park` is only called by one thread, it is possible to move the thread id into a separate field. By ensuring that the field is only written to once, before any other threads access it, these accesses can be unsynchronized, removing any restrictions on the size and niches of the thread id.
This PR also renames the internal `thread_parker` modules to `thread_parking`, as that name now better reflects their contents. I hope this does not add too much reviewing noise.
r? `@m-ou-se`
`@rustbot` label +T-libs
[^1]: SOLID supports this as well, I will switch it over in a follow-up PR.
Forbid inlining `thread_local!`'s `__getit` function on Windows
Sadly, this will make things slower to avoid UB in an edge case, but it seems hard to avoid... and really whenever I look at this code I can't help but think we're asking for trouble.
It's pretty dodgy for us to leave this as a normal function rather than `#[inline(never)]`, given that if it *does* get inlined into a dynamically linked component, it's extremely unsafe (you get some other thread local, or if you're lucky, crash). Given that it's pretty rare for people to use dylibs on Windows, the fact that we haven't gotten bug reports about it isn't really that convincing. Ideally we'd come up with some kind of compiler solution (that avoids paying for this cost when static linking, or *at least* for use within the same crate...), but it's not clear what that looks like.
Oh, and because all this is only needed when we're implementing `thread_local!` with `#[thread_local]`, this patch adjusts the `cfg_attr` to be `all(windows, target_thread_local)` as well.
r? ``@ChrisDenton``
See also #84933, which is about improving the situation.