Stop guessing why your Power BI report takes 45 seconds to load. Start understanding the engine.
This repository is a hands-on, scenario-driven guide to writing performant DAX and building efficient Power BI models for retail and franchise operations.
It includes:
- A Python data generator that produces a realistic, 1M+ row star-schema dataset
- Three deep-dive scenarios comparing slow vs. optimized DAX patterns with engine-level explanations
- Data modeling golden rules specific to VertiPaq's columnar architecture
Every example uses the same generated dataset, so you can follow along in Power BI Desktop and validate with DAX Studio.
Most DAX tutorials teach you what functions do. Very few explain why one approach is 10x faster than another.
This repo bridges that gap. If you've ever experienced:
- Reports that crawl when you add a second visual to the page
CALCULATE+FILTER(ALL(...))patterns you copied from Stack Overflow but don't fully understand- CPU pinned at 100% during a simple YTD calculation
- A flat, 50-column table that "works" but takes 2 GB of RAM
...then this is for you.
| Tool | Version | Purpose |
|---|---|---|
| Python | 3.9+ | Data generation |
| pandas | 1.5+ | DataFrame operations |
| numpy | 1.23+ | Random data generation |
| Power BI Desktop | Latest | Report building & testing |
| DAX Studio | 3.x (optional) | Query profiling & VertiPaq analysis |
# 1. Clone the repo
git clone https://github.com/themsoft/powerbi-dax-optimization.git
cd powerbi-dax-optimization
# 2. Install dependencies
pip install pandas numpy
# 3. Generate the dataset (default: 1M rows)
python scripts/data_generator.py
# 4. (Optional) Generate a larger dataset for stress testing
python scripts/data_generator.py --rows 5000000Output files land in the data/ directory:
| File | Description | Rows |
|---|---|---|
dim_stores.csv |
Franchise branches with region, type, and size | ~200 |
dim_products.csv |
FMCG products with category hierarchy and pricing | ~500 |
dim_calendar.csv |
Complete date dimension (2022–2024) with fiscal year support | 1 096 |
fact_sales.csv |
Transactional sales with quantity, revenue, cost, and discounts | 1 000 000+ |
Then import all four CSV files into Power BI Desktop, set up relationships (StoreID, ProductID, Date), and mark dim_calendar as the date table.
| # | Document | Focus Area |
|---|---|---|
| 1 | Dynamic ABC Analysis | Product classification by revenue contribution — calculated column anti-pattern vs. measure-based approach |
| 2 | Franchise YTD Performance | Time intelligence, branch benchmarking, ALLEXCEPT vs. FILTER(ALL(...)) |
| 3 | Data Modeling Golden Rules | Star schema design, VertiPaq compression, why flat tables kill performance |
If you're new to DAX optimization, start with Scenario 3 (Data Modeling) — no amount of DAX tuning fixes a bad model. Then move to Scenarios 1 and 2 for measure-level optimizations.
powerbi-dax-optimization/
├── README.md
├── scripts/
│ └── data_generator.py # Python script — generates all CSV files
├── data/ # Generated CSV files (not tracked in git)
│ ├── dim_stores.csv
│ ├── dim_products.csv
│ ├── dim_calendar.csv
│ └── fact_sales.csv
└── docs/
├── 01_Scenario_ABC_Analysis.md
├── 02_Scenario_Franchise_YTD_Performance.md
└── 03_Data_Modeling_Golden_Rules.md
MIT