Improve & compute statistical significance

I'm getting to the point where changes in e.g. `abstract-level` result in benchmark differences that are too small to draw meaningful conclusions from. That's a good sign, but it means the benchmarks must become more precise to remain useful. Rough plan:

- [ ] Measure timer resolution, aka average smallest measurable time
- [ ] Determine minimum duration of benchmark in milliseconds by `Math.max(resolutionInSeconds() / 2 / 0.01, 0.05) * 1e3`, which keeps the timer's uncertainty at or below 1% with a floor of 50 ms
- [ ] Let `iterations` be 1
- [ ] Optionally warm up the benchmark: `fn(); fn(); eval('%OptimizeFunctionOnNextCall(fn)'); fn()` (requires running Node.js with `--allow-natives-syntax`)
- [ ] Run benchmark, which should call a function `iterations` times
- [ ] Optionally subtract time spent on GC
- [ ] If needed, increase `iterations` to satisfy minimum duration, and repeat
- [ ] If minimum duration is met, record the duration in a histogram
- [ ] Optionally also factor in [`histogram.minimumSize()`](https://github.com/vweevers/student-histogram)
- [ ] Then summarize as in https://github.com/Level/bench/commit/1ebed316a227d72b4e45acf33bbed3a00a8890cf
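
To make the plan more concrete, here is a rough sketch of the resolution, minimum duration, and calibration steps. It is not the Level/bench implementation: the names `minimumDuration`, `measure`, `calibrate`, and `sample` are hypothetical, a plain array stands in for the histogram, and the warmup and GC steps are left out. It assumes a global `performance` object (Node.js 16+ or a browser).

```javascript
// Sketch only: all function names here are hypothetical helpers.
// Assumes the global `performance` (Node.js 16+ and browsers).

// Average smallest measurable time, in seconds
function resolutionInSeconds (samples = 30) {
  let total = 0
  for (let i = 0; i < samples; i++) {
    const start = performance.now()
    let end = start
    // Spin until the clock visibly advances
    while (end === start) end = performance.now()
    total += end - start
  }
  return total / samples / 1e3
}

// Minimum benchmark duration in milliseconds (formula from the plan)
function minimumDuration (resolution = resolutionInSeconds()) {
  return Math.max(resolution / 2 / 0.01, 0.05) * 1e3
}

// Time `iterations` calls of fn, in milliseconds
function measure (fn, iterations) {
  const start = performance.now()
  for (let i = 0; i < iterations; i++) fn()
  return performance.now() - start
}

// Start at 1 iteration and grow until a run meets the minimum duration
function calibrate (fn, minDuration = minimumDuration()) {
  let iterations = 1
  while (true) {
    const duration = measure(fn, iterations)
    if (duration >= minDuration) return { iterations, duration }
    // Estimate the needed count from the observed rate, at least doubling
    const estimate = duration > 0
      ? Math.ceil(iterations * minDuration / duration)
      : 0
    iterations = Math.max(iterations * 2, estimate)
  }
}

// Record per-iteration durations (ms) of runs that met the minimum.
// A plain array stands in for the histogram here.
function sample (fn, runs = 10) {
  const minDuration = minimumDuration()
  let { iterations } = calibrate(fn, minDuration)
  const durations = []
  while (durations.length < runs) {
    const duration = measure(fn, iterations)
    if (duration >= minDuration) durations.push(duration / iterations)
    else iterations *= 2 // Got faster (e.g. after JIT): grow and retry
  }
  return durations
}
```

Something like `sample(() => db.get('key'), 30)` (with a synchronous stand-in for the operation under test) would then yield 30 per-iteration timings to feed into the summary step. Asynchronous operations would need an `await`-based variant of `measure`.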

Improve & compute statistical significance #40

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

Improve & compute statistical significance #40

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions