this reduces the chances of hitting the 32-bit address space issue on x86_64
instead of (always) using a static ANCHOR variable located in .bss, we lazily initialize the ANCHOR
variable using the value passed to the first `Ptr::new` invocation. In practice, this means the very
first `Pool::grow` call (on x86_64) is guaranteed to work (use the given memory). Follow-up `grow`
invocations are *more likely* to work (but not guaranteed) *if* all given memory comes from the
heap.
We still need an ANCHOR in .bss as a fallback because it's possible to allocate a ZST on a pool
without calling `Pool::grow` (= the lazily initialized ANCHOR is never set *but* it can still be read)
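A minimal sketch of the lazy initialization described above, not the code in this PR. `LazyAnchor`, its fields, and `offset_of` are hypothetical names, and the fallback address stands in for the static ANCHOR kept in .bss. The first address seen becomes the anchor, later addresses are encoded as signed 32-bit offsets from it, and reads before any initialization return the fallback.

```rust
use core::convert::TryFrom;
use core::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical lazily initialized anchor; a stored 0 means "not initialized yet".
pub struct LazyAnchor {
    addr: AtomicUsize,
    /// Stand-in for the address of the static ANCHOR in .bss (assumption).
    fallback: usize,
}

impl LazyAnchor {
    pub const fn new(fallback: usize) -> Self {
        Self { addr: AtomicUsize::new(0), fallback }
    }

    /// Stores `first` as the anchor if none is set yet; returns the anchor in use.
    /// Callers are assumed to pass non-null addresses.
    fn init_or_get(&self, first: usize) -> usize {
        match self.addr.compare_exchange(0, first, Ordering::AcqRel, Ordering::Acquire) {
            Ok(_) => first,
            Err(existing) => existing,
        }
    }

    /// Reads the anchor; falls back to the static one if it was never initialized.
    pub fn get(&self) -> usize {
        match self.addr.load(Ordering::Acquire) {
            0 => self.fallback,
            addr => addr,
        }
    }

    /// Encodes `addr` as a 32-bit offset from the anchor, or `None` if out of range.
    pub fn offset_of(&self, addr: usize) -> Option<i32> {
        let anchor = self.init_or_get(addr);
        // i128 arithmetic keeps the subtraction exact for any pair of addresses.
        i32::try_from(addr as i128 - anchor as i128).ok()
    }
}
```

With this shape the first address always encodes as offset 0, which is why the first call cannot fail; later addresses only succeed when they lie within the 32-bit signed range of the anchor.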
the old logic didn't consider the pointee's alignment when creating a dangling pointer
dangling pointers are used in pools of ZSTs (Zero Sized Types). the old logic resulted in these Boxed
ZSTs not being well aligned. the test added in this PR was failing with the old logic
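For illustration, one way to build a dangling pointer that respects the pointee's alignment is to use the alignment itself as the address. This is a sketch, not necessarily what the PR does; `dangling_aligned` and `Zst8` are hypothetical names. The standard library's `NonNull::dangling()` documents the same "dangling, but well-aligned" guarantee.

```rust
use core::mem::align_of;
use core::ptr::NonNull;

/// Returns a non-null, well-aligned pointer for `T`.
/// It must never be dereferenced unless `T` is zero sized.
pub fn dangling_aligned<T>() -> NonNull<T> {
    // An alignment is always a non-zero power of two, so using it as the
    // address yields a pointer that is both non-null and aligned for `T`.
    let addr = align_of::<T>();
    NonNull::new(addr as *mut T).expect("alignment is never zero")
}

/// Hypothetical zero-sized type with an alignment larger than one byte.
#[repr(align(8))]
pub struct Zst8;
```

A fixed address such as `1` would be non-null but misaligned for `Zst8`, which is the kind of bug described above.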
in the CAS version of the Treiber stack, the 64-bit "pointers" (Ptr) are NonZero values because a
Node is an Option<Ptr> and that needs to be a 64-bit value.
after adding x86 support it became (theoretically) possible to break the non-zero invariant: Ptr is
a 32-bit tag plus a 32-bit pointer (or offset on x86_64). The tag can be zero. The 32-bit offset on
x86_64 can _not_ be zero (+) because the ANCHOR is one byte in memory and memory cannot overlap.
However, on x86 the pointer can(?) be zero (I mean on Cortex-M, address zero is a valid memory
location) so that's a problem.
This fixes the issue by turning the tag into a NonZeroU32 value. This way, even if the offset/pointer
is 0, the NonZero invariant of Ptr is maintained. Some care is needed when incrementing the tag so
that it does not turn into a zero value on wraparound
(+) spoilers: I have a follow-up PR where the 32-bit offset on x86_64 can become zero, so this PR is
a prerequisite for that follow-up PR
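A rough sketch of the tag handling described above. `next_tag` and `pack` are hypothetical helpers, and placing the tag in the upper 32 bits is an assumption about the layout, not taken from the PR.

```rust
use core::num::{NonZeroU32, NonZeroU64};

/// Increments the tag, skipping zero when the counter wraps around.
pub fn next_tag(tag: NonZeroU32) -> NonZeroU32 {
    // `wrapping_add(1)` produces 0 only when the tag is `u32::MAX`;
    // in that case restart at 1 so the tag stays non-zero.
    match NonZeroU32::new(tag.get().wrapping_add(1)) {
        Some(next) => next,
        None => NonZeroU32::new(1).unwrap(),
    }
}

/// Packs a tag and a 32-bit offset/pointer into one 64-bit word.
/// Assumed layout: tag in the high half, offset in the low half.
pub fn pack(tag: NonZeroU32, offset: u32) -> NonZeroU64 {
    let raw = (u64::from(tag.get()) << 32) | u64::from(offset);
    // The non-zero tag guarantees `raw != 0` even when `offset == 0`.
    NonZeroU64::new(raw).expect("a non-zero tag makes the whole word non-zero")
}
```

Because the packed word is never zero, an `Option` around it still fits in 64 bits, which is what the CAS-based stack relies on.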
the implementation uses a 64-bit atomic on `x86` to avoid the `ANCHOR` variable and the address
space limitation seen with the x86_64 compilation target
this PR also adds the i686-linux-musl target to the test matrix to exercise the new implementation
closes #231
the logic was increasing the "capacity" counter even in the case the given memory address was out of
range (ANCHOR +- 2GB) -- `None` branch in code -- resulting in a wrong / misleading value being
reported
`capacity` is defined as `N - 1`. But best performance (cheap modulo) is achieved if `N` is a power of two, not `capacity`.
This clarifies the docstring.
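To illustrate why `N` (and not `capacity`) should be the power of two: when `N` is a power of two, `index % N` reduces to a single bitwise AND. `wrap_pow2` is a hypothetical helper written for this note, not part of the crate.

```rust
/// Wraps `index` into `0..n`, assuming `n` is a power of two.
pub fn wrap_pow2(index: usize, n: usize) -> usize {
    debug_assert!(n.is_power_of_two());
    // For a power-of-two `n`, `n - 1` is a mask of the low bits,
    // so the AND gives the same result as `index % n`.
    index & (n - 1)
}
```

For example, with `N = 8` the usable `capacity` is 7, and indices wrap with a mask of `0b111`.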
... otherwise any crate using heapless will automatically add all the
funny ARMv6-M only dependencies regardless.
Signed-off-by: Daniel Egger <daniel@eggers-club.de>