bench: Improve the spectralnorm shootout benchmark#17989
Merged
Conversation
Contributor
Contributor
There was a problem hiding this comment.
I don't think that this function is safe in general, because f could mutate something it captured. However, it should be possible to encode the correct bound with unboxed closures.
Contributor
There was a problem hiding this comment.
FWIW, I experimented with a similar rewrite that used unboxed closures, and they worked well enough, i.e. no ICEs etc.
Member
Author
There was a problem hiding this comment.
Oh dear, you're right! I thought the Sync found was enough, but I think this needs to take an instance of Fn to ensure there's no &mut captures. I've updated with the unboxed closure equivalent of Fn, thanks!
This improves the spectralnorm shootout benchmark through a few vectors after looking at the leading C implementation: * The simd-based f64x2 is now used to parallelize a few computations * RWLock usage has been removed. A custom `parallel` function was added as a form of stack-based fork-join parallelism. I found that the contention on the locks was high as well as hindering other optimizations. This does, however, introduce one `unsafe` block into the benchmarks, which previously had none. In terms of timings, the before and after numbers are: ``` $ time ./shootout-spectralnorm-before ./shootout-spectralnorm-before 2.07s user 0.71s system 324% cpu 0.857 total $ time ./shootout-spectralnorm-before 5500 ./shootout-spectralnorm-before 5500 11.88s user 1.13s system 459% cpu 2.830 total $ time ./shootout-spectralnorm-after ./shootout-spectralnorm-after 0.58s user 0.01s system 280% cpu 0.210 tota $ time ./shootout-spectralnorm-after 5500 ./shootout-spectralnorm-after 5500 3.55s user 0.01s system 455% cpu 0.783 total ```
783f76d to
f7b5470
Compare
15 tasks
Contributor
|
Great! I'd r+ if I can ;) |
|
See also #18085 |
bors
added a commit
that referenced
this pull request
Oct 16, 2014
This improves the spectralnorm shootout benchmark through a few vectors after looking at the leading C implementation: * The simd-based f64x2 is now used to parallelize a few computations * RWLock usage has been removed. A custom `parallel` function was added as a form of stack-based fork-join parallelism. I found that the contention on the locks was high as well as hindering other optimizations. This does, however, introduce one `unsafe` block into the benchmarks, which previously had none. In terms of timings, the before and after numbers are: ``` $ time ./shootout-spectralnorm-before ./shootout-spectralnorm-before 2.07s user 0.71s system 324% cpu 0.857 total $ time ./shootout-spectralnorm-before 5500 ./shootout-spectralnorm-before 5500 11.88s user 1.13s system 459% cpu 2.830 total $ time ./shootout-spectralnorm-after ./shootout-spectralnorm-after 0.58s user 0.01s system 280% cpu 0.210 tota $ time ./shootout-spectralnorm-after 5500 ./shootout-spectralnorm-after 5500 3.55s user 0.01s system 455% cpu 0.783 total ```
lnicola
pushed a commit
to lnicola/rust
that referenced
this pull request
Sep 25, 2024
…ykril Provide an option to hide deprecated items from completion Fixes rust-lang#17989. I wonder if this should be instead done in the editor, that will do it in a language-agnostic way. Can't hurt to do it in rust-analyzer, I guess.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This improves the spectralnorm shootout benchmark through a few vectors after
looking at the leading C implementation:
parallelfunction was added as aform of stack-based fork-join parallelism. I found that the contention on the
locks was high as well as hindering other optimizations.
This does, however, introduce one
unsafeblock into the benchmarks, whichpreviously had none.
In terms of timings, the before and after numbers are: