Add sort integration benchmark #13306

Merged
alamb merged 3 commits into apache:main from 2010YOUY01:sort-bench
Nov 15, 2024

@2010YOUY01

Which issue does this PR close?

Closes #.

Rationale for this change

I noticed there is no benchmark that sorts a whole relational table: the existing sort benchmark covers only a single SortExec, so it cannot show how a large end-to-end sort query scales across multiple CPU cores. An integration benchmark makes it possible to measure the combined performance of the local sorts on small batches and the final multi-way sort-preserving merge.
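To make the two phases concrete, here is a minimal sketch in plain Rust using only the standard library. It is not DataFusion's implementation: `sort_batches` and `merge_sorted_runs` are hypothetical names, a batch is assumed to be a simple `Vec<i64>` rather than an Arrow record batch, and the merge is a textbook heap-based k-way merge.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Phase 1 (illustrative): sort each small batch on its own.
/// Batches are independent, so this phase can be spread across cores.
fn sort_batches(mut batches: Vec<Vec<i64>>) -> Vec<Vec<i64>> {
    for batch in batches.iter_mut() {
        batch.sort_unstable();
    }
    batches
}

/// Phase 2 (illustrative): k-way merge of already-sorted runs.
/// The min-heap holds at most one candidate per run: (value, run index, position).
fn merge_sorted_runs(runs: &[Vec<i64>]) -> Vec<i64> {
    let mut heap = BinaryHeap::new();
    for (run_idx, run) in runs.iter().enumerate() {
        if let Some(&first) = run.first() {
            heap.push(Reverse((first, run_idx, 0usize)));
        }
    }

    let total = runs.iter().map(Vec::len).sum::<usize>();
    let mut out = Vec::with_capacity(total);
    while let Some(Reverse((value, run_idx, pos))) = heap.pop() {
        out.push(value);
        // Refill the heap with the next element from the run we just consumed.
        let next = pos + 1;
        if let Some(&v) = runs[run_idx].get(next) {
            heap.push(Reverse((v, run_idx, next)));
        }
    }
    out
}
```

Calling `merge_sorted_runs(&sort_batches(batches))` yields one globally sorted vector. The first phase is independent per batch, while the merge is a single ordered pass over all runs, which is why an end-to-end benchmark can behave differently from one that times a single sort operator.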

The benchmark includes 10 queries that sort the entire lineitem table of the TPC-H dataset, each with different characteristics: for example, different numbers of sort key and payload columns, and different sort key types and cardinalities. It is also easy to add more benchmark queries.
For more details, see sort_integration.rs.

What changes are included in this PR?

Added a single benchmark binary for the sort integration benchmark. It can be executed with:

# Under benchmarks/
./bench.sh run sort_integration
Q1 iteration 0 took 211.0 ms and returned 6001215 rows
Q1 iteration 1 took 186.7 ms and returned 6001215 rows
Q1 iteration 2 took 184.2 ms and returned 6001215 rows
Q1 iteration 3 took 185.4 ms and returned 6001215 rows
Q1 iteration 4 took 189.4 ms and returned 6001215 rows
Q1 avg time: 191.36 ms
Q2 iteration 0 took 156.9 ms and returned 6001215 rows
Q2 iteration 1 took 163.4 ms and returned 6001215 rows
Q2 iteration 2 took 166.2 ms and returned 6001215 rows
Q2 iteration 3 took 162.5 ms and returned 6001215 rows
Q2 iteration 4 took 169.5 ms and returned 6001215 rows
Q2 avg time: 163.70 ms
Q3 iteration 0 took 806.1 ms and returned 6001215 rows
Q3 iteration 1 took 812.8 ms and returned 6001215 rows
......

Query run results comparing sorts of the lineitem table at scale factors 1 and 5:

┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃ sort-lineitem-sf1 ┃ sort-lineitem-sf5 ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ Q1           │          186.49ms │         1244.18ms │  6.67x slower │
│ Q2           │          156.91ms │         1000.38ms │  6.38x slower │
│ Q3           │          804.57ms │         4682.39ms │  5.82x slower │
│ Q4           │          241.32ms │         1464.02ms │  6.07x slower │
│ Q5           │          407.56ms │         2037.05ms │  5.00x slower │
│ Q6           │          441.52ms │         2193.89ms │  4.97x slower │
│ Q7           │          786.11ms │         7000.62ms │  8.91x slower │
│ Q8           │          535.87ms │         2835.62ms │  5.29x slower │
│ Q9           │          532.31ms │         2957.57ms │  5.56x slower │
│ Q10          │          841.96ms │         9289.74ms │ 11.03x slower │
└──────────────┴───────────────────┴───────────────────┴───────────────┘
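As a reading aid for the Change column, the sketch below recomputes the slowdown as the ratio of the two timings and compares it with the growth in input size. The function names are hypothetical, and the 5x factor is an assumption that lineitem at scale factor 5 holds roughly five times as many rows as at scale factor 1.

```rust
/// Slowdown factor as reported in the Change column: SF5 time divided by SF1 time.
fn slowdown(sf1_ms: f64, sf5_ms: f64) -> f64 {
    sf5_ms / sf1_ms
}

/// Slowdown relative to the growth in input size (assumed ~5x here).
/// A value above 1.0 means the query scaled worse than linearly in the data size.
fn relative_to_linear(sf1_ms: f64, sf5_ms: f64, data_growth: f64) -> f64 {
    slowdown(sf1_ms, sf5_ms) / data_growth
}
```

Under that assumption, `relative_to_linear(441.52, 2193.89, 5.0)` for Q6 comes out just below 1.0, while the same call with the Q10 timings (841.96 and 9289.74) comes out above 2.0.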

Are these changes tested?

Are there any user-facing changes?
