1737 Commits

Author SHA1 Message Date
David Tolnay
4b1048d0ec
Merge pull request #1183 from serde-rs/arithmetic
Unify chunk size choice between float and string parsing
2024-08-23 12:41:46 -07:00
David Tolnay
f268173a9f
Unify chunk size choice between float and string parsing 2024-08-23 12:36:49 -07:00
David Tolnay
fec0376974
Merge pull request #1182 from CryZe/chunk-64bit
Ensure the SWAR chunks are 64-bit in more cases
2024-08-23 09:24:56 -07:00
Christopher Serr
3d837e1cc4 Ensure the SWAR chunks are 64-bit in more cases
Various architectures have support for 64-bit integers, but there are
Rust targets for those architectures where the pointer size is
intentionally just 32-bit. For SWAR this smaller pointer size would
negatively affect those targets, so this PR ensures the chunk size stays
64-bit on those targets.
2024-08-23 13:29:34 +02:00
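To illustrate the point made in 3d837e1cc4, here is a minimal SWAR sketch in which the chunk type is chosen from the architecture rather than from the pointer width alone. The `Chunk` alias, the `cfg` list, and the `find_escape` and `is_escape` names are assumptions made for this example and are not taken from the commit itself. The chunk is loaded with `from_le_bytes`, so the lowest flagged byte is the first match regardless of host endianness.

```rust
// Hypothetical alias: use 64-bit chunks when pointers are 64-bit or when the
// architecture is known to have 64-bit registers; otherwise fall back to usize.
#[cfg(any(
    target_pointer_width = "64",
    target_arch = "x86_64",
    target_arch = "aarch64",
    target_arch = "wasm32"
))]
type Chunk = u64;
#[cfg(not(any(
    target_pointer_width = "64",
    target_arch = "x86_64",
    target_arch = "aarch64",
    target_arch = "wasm32"
)))]
type Chunk = usize;

const STEP: usize = std::mem::size_of::<Chunk>();
const ONE_BYTES: Chunk = Chunk::MAX / 255; // 0x0101...01

/// Scalar check used for the tail that does not fill a whole chunk.
fn is_escape(byte: u8) -> bool {
    byte == b'"' || byte == b'\\' || byte < 0x20
}

/// Returns the index of the first byte that is `"`, `\`, or a control
/// character, or `bytes.len()` if there is none.
fn find_escape(bytes: &[u8]) -> usize {
    let mut index = 0;
    for chunk in bytes.chunks_exact(STEP) {
        let mut buf = [0u8; STEP];
        buf.copy_from_slice(chunk);
        let chars = Chunk::from_le_bytes(buf);

        // High bit of a byte is set if that byte is < 0x20.
        let ctrl = chars.wrapping_sub(ONE_BYTES * 0x20) & !chars;
        // High bit of a byte is set if that byte equals the quote.
        let q = chars ^ (ONE_BYTES * (b'"' as Chunk));
        let quote = q.wrapping_sub(ONE_BYTES) & !q;
        // High bit of a byte is set if that byte equals the backslash.
        let s = chars ^ (ONE_BYTES * (b'\\' as Chunk));
        let backslash = s.wrapping_sub(ONE_BYTES) & !s;

        // Borrows can cause false positives only above a true match,
        // so the lowest set bit always marks the first real match.
        let hits = (ctrl | quote | backslash) & (ONE_BYTES << 7);
        if hits != 0 {
            return index + hits.trailing_zeros() as usize / 8;
        }
        index += STEP;
    }
    match bytes[index..].iter().position(|&byte| is_escape(byte)) {
        Some(offset) => index + offset,
        None => bytes.len(),
    }
}
```

With a 32-bit `Chunk` the same loop needs twice as many iterations per input, which is the slowdown the commit message is concerned with.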
Graham Esau
11fc61c7af
Add OccupiedEntry::shift_remove() and swap_remove() 2024-08-19 21:43:12 +01:00
Graham Esau
0ceb9d84fa
Add OccupiedEntry::remove_entry() (and shift/swap versions) 2024-08-19 21:39:10 +01:00
David Tolnay
50c4328e21
Merge pull request #1178 from iex-rs/tiny-bit-faster-hex
Optimize Unicode decoding by 1% 🤡
2024-08-19 10:28:01 -07:00
Alisa Sireneva
9ffb43a1e4 Optimize Unicode decoding by 1% 2024-08-19 11:40:08 +03:00
David Tolnay
6130f9b358
Release 1.0.125 v1.0.125 2024-08-14 22:30:33 -07:00
David Tolnay
cc7a1608c9
Touch up PR 1175 2024-08-14 22:29:30 -07:00
David Tolnay
0f942e5b52
Merge pull request 1175 from iex-rs/faster-backslash-u 2024-08-14 22:23:30 -07:00
David Tolnay
d8921cd29b
Merge pull request #1172 from iex-rs/faster-hex
Parse \uXXXX escapes faster
2024-08-12 13:26:33 -07:00
David Tolnay
b4bc6436ac
Merge pull request #1176 from dtolnay/miriname
Improve job names for miri jobs
2024-08-12 12:54:25 -07:00
David Tolnay
94a2aad7b7
Improve job names for miri jobs 2024-08-12 12:45:43 -07:00
David Tolnay
8073fc16b8
Merge pull request #1174 from iex-rs/miri-on-ci
Test on BE and 32-bit platforms on CI via Miri
2024-08-12 12:43:58 -07:00
Alisa Sireneva
96ae60445d Correct WTF-8 parsing
Closes #877.

This is a good time to make ByteBuf parsing more consistent as I'm
rewriting it anyway. This commit integrates the changes from #877 and
also handles a leading surrogate followed by a surrogate pair correctly.

This does not affect performance significantly.

Co-authored-by: Luca Casonato <hello@lcas.dev>
2024-08-12 21:12:03 +03:00
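The surrogate case mentioned in 96ae60445d (a leading surrogate followed by a surrogate pair) can be sketched as follows. This is an illustration only: it works on already-decoded 16-bit code units rather than on JSON text, and `push_code_point` and `units_to_wtf8` are hypothetical names, not functions from the crate. A lone surrogate is written in the three-byte generalized UTF-8 form used by WTF-8, and a lead that is not followed by a trail is emitted alone so that the next unit is re-examined as the possible start of a pair.

```rust
/// Appends a code point to `out`. Callers pass either a valid scalar value
/// or a lone surrogate (0xD800..=0xDFFF), which gets the 3-byte WTF-8 form.
fn push_code_point(n: u32, out: &mut Vec<u8>) {
    match std::char::from_u32(n) {
        Some(c) => {
            let mut buf = [0u8; 4];
            out.extend_from_slice(c.encode_utf8(&mut buf).as_bytes());
        }
        None => {
            out.push(0xE0 | (n >> 12) as u8);
            out.push(0x80 | ((n >> 6) & 0x3F) as u8);
            out.push(0x80 | (n & 0x3F) as u8);
        }
    }
}

/// Converts 16-bit code units to WTF-8, combining valid surrogate pairs.
fn units_to_wtf8(units: &[u16]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < units.len() {
        let n1 = units[i] as u32;
        if n1 >= 0xD800 && n1 <= 0xDBFF {
            if let Some(&next) = units.get(i + 1) {
                let n2 = next as u32;
                if n2 >= 0xDC00 && n2 <= 0xDFFF {
                    // Lead + trail: combine into one supplementary code point.
                    let n = 0x10000 + ((n1 - 0xD800) << 10) + (n2 - 0xDC00);
                    push_code_point(n, &mut out);
                    i += 2;
                    continue;
                }
            }
        }
        // Anything else, including a lead not followed by a trail, is
        // emitted on its own; the following unit is then looked at afresh.
        push_code_point(n1, &mut out);
        i += 1;
    }
    out
}
```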
Alisa Sireneva
236cc8247d Simplify unicode escape handling
This does not affect performance.
2024-08-12 21:12:03 +03:00
Alisa Sireneva
2f28d106e6 Use the same UTF-8/WTF-8 impl for surrogates
This does not affect performance.
2024-08-12 21:12:03 +03:00
Alisa Sireneva
0e90b61b8c Format UTF-8 strings manually
This speeds up War and Peace 290 MB/s -> 330 MB/s (+15%).
2024-08-12 21:12:03 +03:00
Alisa Sireneva
a38dbf3708 Mark \u parsing as cold
This counterintuitively speeds up War and Peace 275 -> 290 MB/s (+5%) by
enabling inlining of encode_utf8 and extend_from_slice.
2024-08-12 21:10:32 +03:00
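The idea in a38dbf3708 can be sketched roughly like this: the rarely taken `\u` branch lives in a function marked `#[cold]`, which hints to the compiler that the surrounding loop should be optimized for the common case. The function names and the simplified escape handling (only `\uXXXX`, no surrogate pairs) are assumptions for illustration, and whether the compiler actually inlines more in the hot path depends on the toolchain.

```rust
/// Cold path: decode four hex digits and append the scalar value as UTF-8.
/// Returns None on a non-hex digit or a surrogate code point.
#[cold]
#[inline(never)]
fn push_unicode_escape(hex: &[u8], out: &mut Vec<u8>) -> Option<()> {
    let mut code = 0u32;
    for &digit in hex {
        code = code * 16 + (digit as char).to_digit(16)?;
    }
    let c = std::char::from_u32(code)?;
    let mut buf = [0u8; 4];
    out.extend_from_slice(c.encode_utf8(&mut buf).as_bytes());
    Some(())
}

/// Copies `input` to a new buffer, replacing each `\uXXXX` escape.
/// Any other backslash sequence is rejected in this simplified sketch.
fn unescape_unicode(input: &[u8]) -> Option<Vec<u8>> {
    let mut out = Vec::with_capacity(input.len());
    let mut rest = input;
    while let Some(pos) = rest.iter().position(|&byte| byte == b'\\') {
        // Hot path: bulk-copy everything before the backslash.
        out.extend_from_slice(&rest[..pos]);
        // `esc` is the 'u' plus four hex digits, or None if truncated.
        let esc = rest.get(pos + 1..pos + 6)?;
        if esc[0] != b'u' {
            return None;
        }
        push_unicode_escape(&esc[1..], &mut out)?;
        rest = &rest[pos + 6..];
    }
    out.extend_from_slice(rest);
    Some(out)
}
```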
Alisa Sireneva
86d0e114e1 Parse \uXXXX escapes faster
When ignoring *War and Peace* (in Russian), this increases performance
from 640 MB/s to 1080 MB/s (+70%).

When parsing into String, the savings are moderate but still
significant: 275 MB/s to 320 MB/s (+15%).
2024-08-12 12:00:41 +03:00
Alisa Sireneva
81b1b61886 Test on BE and 32-bit platforms on CI via Miri 2024-08-12 12:00:27 +03:00
David Tolnay
cf771a0471
Release 1.0.124 v1.0.124 2024-08-11 14:05:55 -07:00
David Tolnay
8b314a77bf
Merge pull request #1173 from iex-rs/fix-big-endian
Oops, fix skip_to_escape on BE architectures
2024-08-11 14:04:54 -07:00
Alisa Sireneva
8eba7863b1 Fix skip_to_escape on BE architectures 2024-08-11 23:45:41 +03:00
David Tolnay
2cab07e686
Release 1.0.123 v1.0.123 2024-08-11 11:00:48 -07:00
David Tolnay
346189a524
Fix needless_borrow clippy lint in new control character test
warning: this expression creates a reference which is immediately dereferenced by the compiler
        --> tests/test.rs:2515:9
         |
    2515 |         &"\"\t\n\r\"",
         |         ^^^^^^^^^^^^^ help: change this to: `"\"\t\n\r\""`
         |
         = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow
         = note: `-W clippy::needless-borrow` implied by `-W clippy::all`
         = help: to override `-W clippy::all` add `#[allow(clippy::needless_borrow)]`
2024-08-11 10:45:43 -07:00
David Tolnay
859ead8e6d
Merge pull request #1161 from iex-rs/vectorized-string-parsing
Vectorize string parsing
2024-08-11 10:44:24 -07:00
Alisa Sireneva
e43da5ee0e Immediately bail-out on empty strings 2024-08-11 19:35:46 +03:00
Alisa Sireneva
8389d8a112 Don't run the slow algorithm from the beginning 2024-08-11 16:10:38 +03:00
Alisa Sireneva
1f0dcf791a Allow clippy::items_after_statements 2024-08-11 15:38:14 +03:00
Alisa Sireneva
a95d6df9d0 Big endian support 2024-08-11 15:30:26 +03:00
Alisa Sireneva
5496579070 Inline memchr2 logic into Mycroft's algorithm 2024-08-11 15:23:04 +03:00
David Tolnay
54381d6fee
Release 1.0.122 v1.0.122 2024-08-01 14:29:36 -07:00
David Tolnay
16fb6e0b85
Work around buggy rust-analyzer behavior
As far as I can tell there is still no way to block it from
autoimporting the private macros here.
2024-08-01 14:28:58 -07:00
David Tolnay
49d7d6626f
Merge pull request #1166 from dtolnay/allocvec
Fix `json!` invocations when std prelude is not in scope
2024-08-01 14:26:50 -07:00
David Tolnay
6827c7b3c5
Fix json! invocations when std prelude is not in scope 2024-08-01 14:22:46 -07:00
David Tolnay
611b2a4fb6
Merge pull request #1165 from serde-rs/jsonmac
Eliminate local_inner_macros in favor of non-ident macro paths
2024-08-01 14:22:40 -07:00
David Tolnay
7633cb7f05
Eliminate local_inner_macros in favor of non-ident macro paths 2024-08-01 14:18:05 -07:00
Alisa Sireneva
3063d69fd5 Add better tests 2024-07-29 13:23:01 +03:00
Alisa Sireneva
63cb04d74b Bring MSRV down 2024-07-29 13:05:39 +03:00
Alisa Sireneva
03ceee9eb1 Replace ESCAPE array with is_escape fn
This is not backed by benchmarks, but it seems reasonable that we'd be
more starved for cache than CPU in IO-bound tasks. It also simplifies
code a bit and frees up some memory, which is probably a good thing.
2024-07-29 12:23:38 +03:00
Alisa Sireneva
3faae037e9 Vectorize string parsing 2024-07-29 11:54:22 +03:00
David Tolnay
eca2658a22
Release 1.0.121 v1.0.121 2024-07-28 14:03:27 -07:00
David Tolnay
b0d678cfb4
Merge pull request #1160 from iex-rs/efficient-position
Optimize position search in error path
2024-07-28 14:02:33 -07:00
Alisa Sireneva
b1edc7d13f Optimize position search in error path
Translating index into a line/column pair takes considerable time.
Notably, the JSON benchmark modified to run on malformed data spends
around 50% of the CPU time generating the error object.

While it is generally assumed that the cold path is quite slow, such a
drastic pessimization may be unexpected, especially when a faster
implementation exists.

Using vectorized routines provided by the memchr crate increases
performance of the failure path by 2x on average.

Old implementation:
                            DOM         STRUCT
    data/canada.json        122 MB/s    168 MB/s
    data/citm_catalog.json  135 MB/s    195 MB/s
    data/twitter.json       142 MB/s    226 MB/s

New implementation:
                            DOM         STRUCT
    data/canada.json        216 MB/s    376 MB/s
    data/citm_catalog.json  238 MB/s    736 MB/s
    data/twitter.json       210 MB/s    492 MB/s

In comparison, the performance of the happy path is:

                            DOM         STRUCT
    data/canada.json        283 MB/s    416 MB/s
    data/citm_catalog.json  429 MB/s    864 MB/s
    data/twitter.json       275 MB/s    541 MB/s

While this introduces a new dependency, memchr is much faster to compile
than serde, so compile time does not increase significantly.
Additionally, memchr provides a more efficient SWAR-based implementation
of both the memchr and count routines even without std, providing
benefits for embedded uses as well.
2024-07-27 01:31:00 +03:00
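The index-to-position translation described in b1edc7d13f can be sketched as two searches: one backwards for the start of the current line, one forwards to count the newlines before it. To stay self-contained the sketch below uses only standard-library iterators; a vectorized version could substitute `memchr::memrchr` and `memchr::memchr_iter` from the memchr crate for the two searches. The function name `position_of` and the 1-based line, 0-based column convention are assumptions for this example.

```rust
/// Translates a byte index into a (line, column) pair.
/// Lines are 1-based; the column is the byte offset from the line start.
fn position_of(input: &[u8], index: usize) -> (usize, usize) {
    // Clamp so that an index past the end refers to the end of input.
    let index = index.min(input.len());

    // Search backwards for the newline that ends the previous line.
    let start_of_line = match input[..index].iter().rposition(|&byte| byte == b'\n') {
        Some(newline) => newline + 1,
        None => 0,
    };

    // Every newline before the current line adds one to the line number.
    let line = 1 + input[..start_of_line]
        .iter()
        .filter(|&&byte| byte == b'\n')
        .count();

    (line, index - start_of_line)
}
```

Both searches are plain byte scans over a slice, which is the kind of work that SWAR or SIMD routines handle well.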
David Tolnay
40dd7f5e86
Merge pull request #1159 from iex-rs/fix-recursion
Move call to tri! out of check_recursion!
2024-07-26 12:24:57 -07:00
Alisa Sireneva
6a306e6ee9 Move call to tri! out of check_recursion! 2024-07-26 20:00:43 +03:00
David Tolnay
3f1c6de4af
Ignore byte_char_slices clippy lint in test
warning: can be more succinctly written as a byte str
        --> tests/test.rs:1108:13
         |
    1108 |             &[b'"', b'\n', b'"'],
         |             ^^^^^^^^^^^^^^^^^^^^ help: try: `b"\"\n\""`
         |
         = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#byte_char_slices
         = note: `-W clippy::byte-char-slices` implied by `-W clippy::all`
         = help: to override `-W clippy::all` add `#[allow(clippy::byte_char_slices)]`

    warning: can be more succinctly written as a byte str
        --> tests/test.rs:1112:13
         |
    1112 |             &[b'"', b'\x1F', b'"'],
         |             ^^^^^^^^^^^^^^^^^^^^^^ help: try: `b"\"\x1F\""`
         |
         = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#byte_char_slices
2024-07-11 20:09:16 -07:00
David Tolnay
3fd6f5f49d
Merge pull request #1153 from dpathakj/master
Correct documentation URL for Value's Index impl.
2024-07-02 17:24:09 -07:00