Skip to content

themsoft/powerbi-dax-optimization

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Advanced DAX & Power BI Optimization for Retail/Franchise Operations

Stop guessing why your Power BI report takes 45 seconds to load. Start understanding the engine.

Overview

This repository is a hands-on, scenario-driven guide to writing performant DAX and building efficient Power BI models for retail and franchise operations.

It includes:

  • A Python data generator that produces a realistic, 1M+ row star-schema dataset
  • Three deep-dive scenarios comparing slow vs. optimized DAX patterns with engine-level explanations
  • Data modeling golden rules specific to VertiPaq's columnar architecture

Every example uses the same generated dataset, so you can follow along in Power BI Desktop and validate with DAX Studio.

Why This Repo Exists

Most DAX tutorials teach you what functions do. Very few explain why one approach is 10x faster than another.

This repo bridges that gap. If you've ever experienced:

  • Reports that crawl when you add a second visual to the page
  • CALCULATE + FILTER(ALL(...)) patterns you copied from Stack Overflow but don't fully understand
  • CPU pinned at 100% during a simple YTD calculation
  • A flat, 50-column table that "works" but takes 2 GB of RAM

...then this is for you.

Prerequisites

Tool Version Purpose
Python 3.9+ Data generation
pandas 1.5+ DataFrame operations
numpy 1.23+ Random data generation
Power BI Desktop Latest Report building & testing
DAX Studio 3.x (optional) Query profiling & VertiPaq analysis

How to Generate the Data

# 1. Clone the repo
git clone https://github.com/themsoft/powerbi-dax-optimization.git
cd powerbi-dax-optimization

# 2. Install dependencies
pip install pandas numpy

# 3. Generate the dataset (default: 1M rows)
python scripts/data_generator.py

# 4. (Optional) Generate a larger dataset for stress testing
python scripts/data_generator.py --rows 5000000

Output files land in the data/ directory:

File Description Rows
dim_stores.csv Franchise branches with region, type, and size ~200
dim_products.csv FMCG products with category hierarchy and pricing ~500
dim_calendar.csv Complete date dimension (2022–2024) with fiscal year support 1 096
fact_sales.csv Transactional sales with quantity, revenue, cost, and discounts 1 000 000+

Then import all four CSV files into Power BI Desktop, set up relationships (StoreID, ProductID, Date), and mark dim_calendar as the date table.

Table of Contents

Scenarios

# Document Focus Area
1 Dynamic ABC Analysis Product classification by revenue contribution — calculated column anti-pattern vs. measure-based approach
2 Franchise YTD Performance Time intelligence, branch benchmarking, ALLEXCEPT vs. FILTER(ALL(...))
3 Data Modeling Golden Rules Star schema design, VertiPaq compression, why flat tables kill performance

Reading Order

If you're new to DAX optimization, start with Scenario 3 (Data Modeling) — no amount of DAX tuning fixes a bad model. Then move to Scenarios 1 and 2 for measure-level optimizations.

Project Structure

powerbi-dax-optimization/
├── README.md
├── scripts/
│   └── data_generator.py      # Python script — generates all CSV files
├── data/                       # Generated CSV files (not tracked in git)
│   ├── dim_stores.csv
│   ├── dim_products.csv
│   ├── dim_calendar.csv
│   └── fact_sales.csv
└── docs/
    ├── 01_Scenario_ABC_Analysis.md
    ├── 02_Scenario_Franchise_YTD_Performance.md
    └── 03_Data_Modeling_Golden_Rules.md

License

MIT

About

Advanced DAX & Power BI Optimization for Retail/Franchise Operations

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages