Improve & compute statistical significance #40

@vweevers

I'm getting to the point where changes in e.g. abstract-level produce benchmark differences that are too small to draw meaningful conclusions from. That's a good sign, but it means the benchmarks must become more precise to remain useful. Rough plan:

  • Measure timer resolution, aka average smallest measurable time
  • Determine minimum duration of benchmark by Math.max(resolutionInSeconds() / 2 / 0.01, 0.05) * 1e3
  • Let iterations be 1
  • Optionally run benchmark for warmup: fn(); fn(); eval('%OptimizeFunctionOnNextCall(fn)'); fn()
  • Run the benchmark, which should call the function as many times as the current value of iterations
  • Optionally subtract time spent on GC
  • If needed, increase iterations to satisfy minimum duration, and repeat
  • If minimum duration is met, record the duration in a histogram
  • Optionally also factor in histogram.minimumSize()
  • Then summarize a la 1ebed31
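
The core of the plan above (timer resolution, minimum duration, growing iterations) could be sketched roughly as follows. This is only an illustration under assumptions: resolutionInSeconds() is named in the plan but not defined there, so the version here is a guess at what it might do, and minimumDuration, run and sample are hypothetical helper names. A plain array stands in for the histogram, doubling is just one possible way to increase iterations, and the warmup and GC steps are left out. In the formula, resolution / 2 is the measurement uncertainty of a single timing, so dividing by 0.01 gives the duration at which that uncertainty drops to 1%, with a floor of 0.05 seconds.

```javascript
// Hypothetical helper: estimate timer resolution in seconds by averaging
// the smallest non-zero difference between consecutive clock reads.
function resolutionInSeconds (samples = 30) {
  let total = 0
  for (let i = 0; i < samples; i++) {
    const start = process.hrtime.bigint()
    let end = process.hrtime.bigint()
    // Spin until the clock visibly advances
    while (end === start) end = process.hrtime.bigint()
    total += Number(end - start)
  }
  return total / samples / 1e9 // nanoseconds to seconds
}

// Minimum benchmark duration in milliseconds, per the formula in the plan.
function minimumDuration (resolution) {
  return Math.max(resolution / 2 / 0.01, 0.05) * 1e3
}

// Hypothetical helper: time a single run that calls fn `iterations` times.
// Returns the elapsed time in milliseconds.
function run (fn, iterations) {
  const start = process.hrtime.bigint()
  for (let i = 0; i < iterations; i++) fn()
  return Number(process.hrtime.bigint() - start) / 1e6
}

// Collect `count` durations that each satisfy the minimum duration.
// Assumption: iterations is doubled whenever a run is too short, and
// runs that are too short are discarded rather than recorded.
function sample (fn, minDuration, count) {
  let iterations = 1
  const durations = [] // stand-in for the histogram

  while (durations.length < count) {
    const duration = run(fn, iterations)

    if (duration < minDuration) {
      iterations *= 2
      continue
    }

    durations.push(duration)
  }

  return { iterations, durations }
}
```

Usage would look something like sample(fn, minimumDuration(resolutionInSeconds()), 100), after which each recorded duration can be divided by the returned iterations to get a per-call time.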
