Add function core::iter::zip
This makes it a little easier to `zip` iterators:
```rust
for (x, y) in zip(xs, ys) {}
// vs.
for (x, y) in xs.into_iter().zip(ys) {}
```
You can `zip(&mut xs, &ys)` for the conventional `iter_mut()` and
`iter()`, respectively. This can also support arbitrary nesting, where
it's easier to see the item layout than with arbitrary `zip` chains:
```rust
for ((x, y), z) in zip(zip(xs, ys), zs) {}
for (x, (y, z)) in zip(xs, zip(ys, zs)) {}
// vs.
for ((x, y), z) in xs.into_iter().zip(ys).zip(xz) {}
for (x, (y, z)) in xs.into_iter().zip((ys.into_iter().zip(xz)) {}
```
It may also format more nicely, especially when the first iterator is a
longer chain of methods -- for example:
```rust
iter::zip(
trait_ref.substs.types().skip(1),
impl_trait_ref.substs.types().skip(1),
)
// vs.
trait_ref
.substs
.types()
.skip(1)
.zip(impl_trait_ref.substs.types().skip(1))
```
This replaces the tuple-pair `IntoIterator` in #78204.
There is prior art for the utility of this in [`itertools::zip`].
[`itertools::zip`]: https://docs.rs/itertools/0.10.0/itertools/fn.zip.html
Add IEEE 754 compliant fmt/parse of -0, infinity, NaN
This pull request improves the Rust float formatting/parsing libraries to comply with IEEE 754's formatting expectations around certain special values, namely signed zero, the infinities, and NaN. It also adds IEEE 754 compliance tests that, while less stringent in certain places than many of the existing flt2dec/dec2flt capability tests, are intended to serve as the beginning of a roadmap to future compliance with the standard. Some relevant documentation is also adjusted with clarifying remarks.
This PR follows from discussion in https://github.com/rust-lang/rfcs/issues/1074, and closes#24623.
The most controversial change here is likely to be that -0 is now printed as -0. Allow me to explain: While there appears to be community support for an opt-in toggle of printing floats as if they exist in the naively expected domain of numbers, i.e. not the extended reals (where floats live), IEEE 754-2019 is clear that a float converted to a string should be capable of being transformed into the original floating point bit-pattern when it satisfies certain conditions (namely, when it is an actual numeric value i.e. not a NaN and the original and destination float width are the same). -0 is given special attention here as a value that should have its sign preserved. In addition, the vast majority of other programming languages not only output `-0` but output `-0.0` here.
While IEEE 754 offers a broad leeway in how to handle producing what it calls a "decimal character sequence", it is clear that the operations a language provides should be capable of round tripping, and it is confusing to advertise the f32 and f64 types as binary32 and binary64 yet have the most basic way of producing a string and then reading it back into a floating point number be non-conformant with the standard. Further, existing documentation suggested that e.g. -0 would be printed with -0 regardless of the presence of the `+` fmt character, but it prints "+0" instead if given such (which was what led to the opening of #24623).
There are other parsing and formatting issues for floating point numbers which prevent Rust from complying with the standard, as well as other well-documented challenges on the arithmetic level, but I hope that this can be the beginning of motion towards solving those challenges.
Make # pretty print format easier to discover
# Rationale:
I use (cargo cult?) three formats in rust: `{}`, debug `{:?}`, and pretty-print debug `{:#?}`. I discovered `{:#?}` in some blog post or guide when I started working in Rust. While `#` is documented I think it is hard to discover. So taking the good advice of ```@carols10cents``` I am trying to improve the docs with a PR
As a reminder "pretty print" means that where `{:?}` will print something like
```
foo: { b1: 1, b2: 2}
```
`{:#?}` will prints something like
```
foo {
b1: 1
b2: 3
}
```
# Changes
Add an example to `fmt` to try and make it easier to discover `#`
Fixes#83046
The program
fn main() {
println!("{:?}", '"');
println!("{:?}", "'");
}
would previously print
'\"'
"\'"
With this patch it now prints:
'"'
"'"
This seems to have been omitted from the beginning when this feature
was first introduced in 86bf96291d82.
Most users won't need to name this type which is probably why this
wasn't noticed in the meantime.
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
"semantic equivalence" is too strong a phrasing here, which is why
actually explaining what kind of circumstances might produce a -0
was chosen instead.
This commit removes the previous mechanism of differentiating
between "Debug" and "Display" formattings for the sign of -0 so as
to comply with the IEEE 754 standard's requirements on external
character sequences preserving various attributes of a floating
point representation.
In addition, numerous tests are fixed.
Add license metadata for std dependencies
These five crates are in the dependency tree of `std` but lack license metadata:
- `alloc`
- `core`
- `panic_abort`
- `panic_unwind`
- `unwind`
Querying the dependency tree of `std` is a useful thing to be able to do, since these crates will typically be linked into Rust binaries. Tools show the license fields missing, as seen in https://github.com/rust-lang/rust/issues/67014#issuecomment-782704534. This PR adds the license field for the five crates, based on the license of the `std` package and this repo as a whole. I also added the `repository` and `descriptions` fields, since those seem useful. For `description`, I copied text from top-level comments for the respective modules - except for `unwind` which has none.
I also note that https://github.com/rust-lang/rust/pull/73530 attempted to add license metadata for all crates in this repo, but was rejected because there was question about some of them. I hope that this smaller change, focusing only on the runtime dependencies, will be easier to review.
cc `@Mark-Simulacrum` `@Lokathor`
Fix invalid slice access in String::retain
As noted in #78499, the previous fix was technically still unsound because it accessed elements of a slice outside its bounds (even though they were still inside the same allocation). This PR addresses that concern by switching to a dropguard approach.
This allows the optimizer to turn certain iterator pipelines such as
```rust
let vec = vec![0usize; 100];
vec.into_iter().map(|e| e as isize).collect::<Vec<_>>()
```
into a noop.
The optimization only applies when iterator sources are `T: Copy`
since `impl TrustedRandomAccess for IntoIter<T>`.
No such requirement applies to the output type (`Iterator::Item`).
Fix overflowing length in Vec<ZST> to VecDeque
`Vec` can hold up to `usize::MAX` ZST items, but `VecDeque` has a lower
limit to keep its raw capacity as a power of two, so we should check
that in `From<Vec<T>> for VecDeque<T>`. We can also simplify the
capacity check for the remaining non-ZST case.
Before this fix, the new test would fail on the length:
```
thread 'collections::vec_deque::tests::test_from_vec_zst_overflow' panicked at 'assertion failed: `(left == right)`
left: `0`,
right: `9223372036854775808`', library/alloc/src/collections/vec_deque/tests.rs:474:5
note: panic did not contain expected string
panic message: `"assertion failed: `(left == right)`\n left: `0`,\n right: `9223372036854775808`"`,
expected substring: `"capacity overflow"`
```
That was a result of `len()` using a mask `& (size - 1)` with the
improper length. Now we do get a "capacity overflow" panic as soon as
that `VecDeque::from(vec)` is attempted.
Fixes#80167.
Implement String::remove_matches
Closes#50206.
I lifted the function help from `@frewsxcv's` original PR (#50015), hope they don't mind.
I'm also wondering whether it would be useful for `remove_matches` to collect up the removed substrings into a `Vec` and return them, right now they're just overwritten by the copy and lost.
Add more links between hash and btree collections
- Link from `core::hash` to `HashMap` and `HashSet`
- Link from HashMap and HashSet to the module-level documentation on
when to use the collection
- Link from several collections to Wikipedia articles on the general
concept
See also https://github.com/rust-lang/rust/pull/81989#issuecomment-783920840.
Vec::dedup_by optimization
Now `Vec::dedup_by` drops items in-place as it goes through them.
From my benchmarks, it is around 10% faster when T is small, with no major regression when otherwise.
I used `ptr::copy` instead of conditional `ptr::copy_nonoverlapping`, because the latter had some weird performance issues on my ryzen laptop (it was 50% slower on it than on intel/sandybridge laptop)
It would be good if someone was able to reproduce these results.
`Vec` can hold up to `usize::MAX` ZST items, but `VecDeque` has a lower
limit to keep its raw capacity as a power of two, so we should check
that in `From<Vec<T>> for VecDeque<T>`. We can also simplify the
capacity check for the remaining non-ZST case.
Before this fix, the new test would fail on the length:
```
thread 'collections::vec_deque::tests::test_from_vec_zst_overflow' panicked at 'assertion failed: `(left == right)`
left: `0`,
right: `9223372036854775808`', library/alloc/src/collections/vec_deque/tests.rs:474:5
note: panic did not contain expected string
panic message: `"assertion failed: `(left == right)`\n left: `0`,\n right: `9223372036854775808`"`,
expected substring: `"capacity overflow"`
```
That was a result of `len()` using a mask `& (size - 1)` with the
improper length. Now we do get a "capacity overflow" panic as soon as
that `VecDeque::from(vec)` is attempted.
convert slice doc link to intra-doc links
Continuing where #80189 stopped, with `core::slice`.
I had an issue with two dead links in my doc when implementing `Deref<Target = [T]>` for one of my type. This means that [`binary_search_by_key`](https://doc.rust-lang.org/nightly/std/primitive.slice.html#method.binary_search_by_key) was available, but not [`sort_by_key`](https://doc.rust-lang.org/nightly/std/primitive.slice.html#method.sort_by_key) even though it was linked in it's doc (same issue with [`as_ptr`](https://doc.rust-lang.org/nightly/std/primitive.slice.html#method.as_ptr) and [`as_mut_pbr`](https://doc.rust-lang.org/nightly/std/primitive.slice.html#method.as_mut_ptr)). It becomes available if I implement `DerefMut`, as it needs an `&mut self`.
<details>
<summary>Code that will have dead links in its doc</summary>
```rust
pub struct A;
pub struct B;
impl std::ops::Deref for B{
type Target = [A];
fn deref(&self) -> &Self::Target {
&A
}
}
```
</details>
I removed the link to `sort_by_key` from `binary_search_by_key` doc as I didn't find a nice way to have a live link:
- `binary_search_by_key` is in `core`
- `sort_by_key` is in `alloc`
- intra-doc link `slice::sort_by_key` doesn't work, as `alloc` is not available when `core` is being build (the warning can't be ignored: ```error[E0710]: an unknown tool name found in scoped lint: `rustdoc::broken_intra_doc_links` ```)
- keeping the link as an anchor `#method.sort_by_key` meant a dead link
- an absolute link would work but doesn't feel right...
Stabilize `unsafe_op_in_unsafe_fn` lint
This makes it possible to override the level of the `unsafe_op_in_unsafe_fn`, as proposed in https://github.com/rust-lang/rust/issues/71668#issuecomment-729770896.
Tracking issue: #71668
r? ```@nikomatsakis``` cc ```@SimonSapin``` ```@RalfJung```
# Stabilization report
This is a stabilization report for `#![feature(unsafe_block_in_unsafe_fn)]`.
## Summary
Currently, the body of unsafe functions is an unsafe block, i.e. you can perform unsafe operations inside.
The `unsafe_op_in_unsafe_fn` lint, stabilized here, can be used to change this behavior, so performing unsafe operations in unsafe functions requires an unsafe block.
For now, the lint is allow-by-default, which means that this PR does not change anything without overriding the lint level.
For more information, see [RFC 2585](https://github.com/rust-lang/rfcs/blob/master/text/2585-unsafe-block-in-unsafe-fn.md)
### Example
```rust
// An `unsafe fn` for demonstration purposes.
// Calling this is an unsafe operation.
unsafe fn unsf() {}
// #[allow(unsafe_op_in_unsafe_fn)] by default,
// the behavior of `unsafe fn` is unchanged
unsafe fn allowed() {
// Here, no `unsafe` block is needed to
// perform unsafe operations...
unsf();
// ...and any `unsafe` block is considered
// unused and is warned on by the compiler.
unsafe {
unsf();
}
}
#[warn(unsafe_op_in_unsafe_fn)]
unsafe fn warned() {
// Removing this `unsafe` block will
// cause the compiler to emit a warning.
// (Also, no "unused unsafe" warning will be emitted here.)
unsafe {
unsf();
}
}
#[deny(unsafe_op_in_unsafe_fn)]
unsafe fn denied() {
// Removing this `unsafe` block will
// cause a compilation error.
// (Also, no "unused unsafe" warning will be emitted here.)
unsafe {
unsf();
}
}
```
Improve sift_down performance in BinaryHeap
Replacing `child < end - 1` with `child <= end.saturating_sub(2)` in `BinaryHeap::sift_down_range` (surprisingly) results in a significant speedup of `BinaryHeap::into_sorted_vec`. The same substitution can be done for `BinaryHeap::sift_down_to_bottom`, which causes a slight but probably statistically insignificant speedup for `BinaryHeap::pop`. It's interesting that benchmarks aside from `bench_into_sorted_vec` are barely affected, even those that do use `sift_down_*` methods internally.
| Benchmark | Before (ns/iter) | After (ns/iter) | Speedup |
|--------------------------|------------------|-----------------|---------|
| bench_find_smallest_1000<sup>1</sup> | 392,617 | 385,200 | 1.02 |
| bench_from_vec<sup>1</sup> | 506,016 | 504,444 | 1.00 |
| bench_into_sorted_vec<sup>1</sup> | 476,869 | 384,458 | 1.24 |
| bench_peek_mut_deref_mut<sup>3</sup> | 518,753 | 519,792 | 1.00 |
| bench_pop<sup>2</sup> | 446,718 | 444,409 | 1.01 |
| bench_push<sup>3</sup> | 772,481 | 770,208 | 1.00 |
<sup>1</sup>: internally calls `sift_down_range`
<sup>2</sup>: internally calls `sift_down_to_bottom`
<sup>3</sup>: should not be affected
Add {BTreeMap,HashMap}::try_insert
`{BTreeMap,HashMap}::insert(key, new_val)` returns `Some(old_val)` if the key was already in the map. It's often useful to assert no duplicate values are inserted.
We experimented with `map.insert(key, val).unwrap_none()` (https://github.com/rust-lang/rust/issues/62633), but decided that that's not the kind of method we'd like to have on `Option`s.
`insert` always succeeds because it replaces the old value if it exists. One could argue that `insert()` is never the right method for panicking on duplicates, since already handles that case by replacing the value, only allowing you to panic after that already happened.
This PR adds a `try_insert` method that instead returns a `Result::Err` when the key already exists. This error contains both the `OccupiedEntry` and the value that was supposed to be inserted. This means that unwrapping that result gives more context:
```rust
map.insert(10, "world").unwrap_none();
// thread 'main' panicked at 'called `Option::unwrap_none()` on a `Some` value: "hello"', src/main.rs:8:29
```
```rust
map.try_insert(10, "world").unwrap();
// thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value:
// OccupiedError { key: 10, old_value: "hello", new_value: "world" }', src/main.rs:6:33
```
It also allows handling the failure in any other way, as you have full access to the `OccupiedEntry` and the value.
`try_insert` returns a reference to the value in case of success, making it an alternative to `.entry(key).or_insert(value)`.
r? ```@Amanieu```
Fixes https://github.com/rust-lang/rfcs/issues/3092