Summary
Add decomposition support for statistical aggregation functions, enabling pre-aggregation of variance, standard deviation, covariance, and correlation metrics in materialized cubes.
Implementation
Variance & Standard Deviation (Single-column)
Mathematical basis: VAR_POP = E[X²] - E[X]²
| Function |
Components |
Combine Formula |
VAR_POP(x) |
sum, sum_sq, count |
(sum_sq/n) - (sum/n)² |
VAR_SAMP(x) |
sum, sum_sq, count |
(n*sum_sq - sum²) / (n*(n-1)) |
STDDEV_POP(x) |
sum, sum_sq, count |
SQRT(VAR_POP) |
STDDEV_SAMP(x) |
sum, sum_sq, count |
SQRT(VAR_SAMP) |
Component definitions:
ComponentDef("_sum", "SUM", "SUM") # sum of x
ComponentDef("_sum_sq", "SUM(POWER({}, 2))", "SUM") # sum of x²
ComponentDef("_count", "COUNT", "SUM") # count
Covariance (Two-column)
Mathematical basis: COVAR_POP = E[XY] - E[X]*E[Y]
| Function |
Components |
Combine Formula |
COVAR_POP(x, y) |
sum_x, sum_y, sum_xy, count |
(sum_xy/n) - (sum_x/n)*(sum_y/n) |
COVAR_SAMP(x, y) |
sum_x, sum_y, sum_xy, count |
(n*sum_xy - sum_x*sum_y) / (n*(n-1)) |
Component definitions:
ComponentDef("_sum_x", "SUM({0})", "SUM", arg_index=0)
ComponentDef("_sum_y", "SUM({1})", "SUM", arg_index=1)
ComponentDef("_sum_xy", "SUM({0} * {1})", "SUM", arg_index=None) # product of args
ComponentDef("_count", "COUNT({0})", "SUM", arg_index=0)
Correlation (Two-column)
Mathematical basis: CORR = COVAR(X,Y) / (STDDEV(X) * STDDEV(Y))
| Function |
Components |
Notes |
CORR(x, y) |
sum_x, sum_y, sum_x_sq, sum_y_sq, sum_xy, count |
6 components |
Component definitions:
ComponentDef("_sum_x", "SUM({0})", "SUM", arg_index=0)
ComponentDef("_sum_y", "SUM({1})", "SUM", arg_index=1)
ComponentDef("_sum_x_sq", "SUM(POWER({0}, 2))", "SUM", arg_index=0)
ComponentDef("_sum_y_sq", "SUM(POWER({1}, 2))", "SUM", arg_index=1)
ComponentDef("_sum_xy", "SUM({0} * {1})", "SUM", arg_index=None)
ComponentDef("_count", "COUNT({0})", "SUM", arg_index=0)
Combine formula:
numerator = n * sum_xy - sum_x * sum_y
var_x = n * sum_x_sq - sum_x²
var_y = n * sum_y_sq - sum_y²
CORR = numerator / SQRT(var_x * var_y)
Summary
Add decomposition support for statistical aggregation functions, enabling pre-aggregation of variance, standard deviation, covariance, and correlation metrics in materialized cubes.
Implementation
Variance & Standard Deviation (Single-column)
Mathematical basis:
VAR_POP = E[X²] - E[X]²VAR_POP(x)(sum_sq/n) - (sum/n)²VAR_SAMP(x)(n*sum_sq - sum²) / (n*(n-1))STDDEV_POP(x)SQRT(VAR_POP)STDDEV_SAMP(x)SQRT(VAR_SAMP)Component definitions:
Covariance (Two-column)
Mathematical basis:
COVAR_POP = E[XY] - E[X]*E[Y]COVAR_POP(x, y)(sum_xy/n) - (sum_x/n)*(sum_y/n)COVAR_SAMP(x, y)(n*sum_xy - sum_x*sum_y) / (n*(n-1))Component definitions:
Correlation (Two-column)
Mathematical basis:
CORR = COVAR(X,Y) / (STDDEV(X) * STDDEV(Y))CORR(x, y)Component definitions:
Combine formula: