Duration g-computation: predict joint_lm per-dyad on synthetic population in build_netstats #73

@smjenness

Context

Identified during PR #71 review. duration.method = "joint_lm" fits a covariate-adjusted log-linear regression of partnership duration on ego, partner, and matching terms, which corrects the duration estimate for confounding by other attributes. It is the direct analog of the marginal-vs-joint fix that method = "joint" applies to formation stats in #61 / #62.

Currently, however, the aggregation step (stratum-level medians fed into the geometric transformation for mean.dur.adj) predicts at the ARTnet observations rather than at the synthetic target population:

# In R/NetParams.R::fit_joint_lm_dur():
fitted_dur <- exp(predict(fit, newdata = ongoing))   # ongoing = ARTnet observations

# In R/NetParams.R::compute_alt_durs() for joint_lm:
pred_sub <- exp(predict(res$model, newdata = sub))   # sub = ARTnet observations in stratum k

This gives a "confounding-corrected within ARTnet" estimate — strictly better than empirical, but not fully g-computed against whatever target population the user builds in build_netstats.
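
A minimal sketch of the gap, using simulated data rather than ARTnet: the data frames, covariate names, effect sizes, and the Black/Hispanic/White ordering of the race levels are all assumptions made for illustration. The same fitted model yields a different median duration when predictions are taken over a target population with a shifted race mix.

```r
set.seed(1)

# Toy stand-in for ongoing ARTnet partnerships (all names and values hypothetical)
n <- 500
race_lv <- c("B", "H", "W")
obs <- data.frame(
  race    = factor(sample(race_lv, n, replace = TRUE, prob = c(0.10, 0.15, 0.75)),
                   levels = race_lv),
  age.grp = sample(1:5, n, replace = TRUE)
)
# Simulated log-duration with made-up race and age effects
obs$log_dur <- 4 + 0.4 * (obs$race == "H") + 0.8 * (obs$race == "W") +
  0.05 * obs$age.grp + rnorm(n, sd = 0.5)

fit <- lm(log_dur ~ race + factor(age.grp), data = obs)

# Current behaviour: predict at the observed rows
med_obs <- median(exp(predict(fit, newdata = obs)))

# g-computation: predict over a target population with a different race mix
target <- data.frame(
  race    = factor(sample(race_lv, 5000, replace = TRUE, prob = c(0.35, 0.25, 0.40)),
                   levels = race_lv),
  age.grp = sample(1:5, 5000, replace = TRUE)
)
med_target <- median(exp(predict(fit, newdata = target)))

c(observed = med_obs, target = med_target)
```

In this toy setup the target median is lower because the target population has fewer members of the group simulated with the longest durations. The size and direction of the shift on real data depend on the fitted coefficients.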

Proposed approach

Move the stratum-level aggregation from build_netparams into build_netstats under method = "joint". Specifically:

  1. In build_netparams(..., duration.method = "joint_lm", method = "joint"), store the fitted lm at netparams$<layer>$joint_duration_model (already done) but do not compute stratum-level medians at this stage. Emit only the model.
  2. In build_netstats(..., method = "joint"), after the synthetic population is constructed, pull netparams$<layer>$joint_duration_model, predict the expected log-duration for each synthetic dyad, and exponentiate it back to the duration scale. Aggregate per stratum (non-matched, matched × index.age.grp) by taking medians of the per-dyad predicted durations. Feed these through the existing geometric transformation to produce mean.dur.adj for dissolution_coefs().
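
Step 2 could be sketched roughly as follows. The helper name aggregate_dyad_durations(), the toy model, and the same.age.grp indicator are hypothetical. The transformation is assumed to take the form rate = 1 - 2^(-1/median) with mean.dur.adj = 1/rate, which is what follows from treating the stratum median as the median of a geometric distribution; the existing code should be checked before relying on this.

```r
set.seed(2)

# Toy stand-in for the stored joint_duration_model (covariates are hypothetical)
obs <- data.frame(
  index.age.grp = sample(1:5, 400, replace = TRUE),
  same.age.grp  = rbinom(400, 1, 0.4)
)
obs$log_dur <- 3.5 + 0.15 * obs$index.age.grp + 0.4 * obs$same.age.grp +
  rnorm(400, sd = 0.6)
toy_model <- lm(log_dur ~ index.age.grp + same.age.grp, data = obs)

# Toy synthetic dyads, as they might look after build_netstats builds the population
syn_dyads <- data.frame(
  index.age.grp = sample(1:5, 2000, replace = TRUE),
  same.age.grp  = rbinom(2000, 1, 0.5)
)

aggregate_dyad_durations <- function(model, dyads) {
  # Per-dyad prediction, back-transformed from the log scale
  pred_dur <- exp(predict(model, newdata = dyads))
  # Stratum 0 = non-matched; strata 1..K = matched within index.age.grp
  stratum <- ifelse(dyads$same.age.grp == 1, dyads$index.age.grp, 0)
  med <- tapply(pred_dur, stratum, median)
  out <- data.frame(
    index.age.grp = as.integer(names(med)),
    median.dur    = as.numeric(med)
  )
  # Assumed geometric transformation: median -> per-step rate -> mean duration
  out$rates.adj    <- 1 - 2^(-1 / out$median.dur)
  out$mean.dur.adj <- 1 / out$rates.adj
  out
}

durs <- aggregate_dyad_durations(toy_model, syn_dyads)
durs
```

The resulting mean.dur.adj column is what would be passed on to dissolution_coefs(). Keeping the aggregation in one small function makes it straightforward to call once per layer.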

The per-dyad prediction requires synthetic partnerships, not just synthetic nodes. For the non-matched stratum median, pair synthetic nodes across age groups; for the matched strata, pair them within an age group. This is the same (ego, alter) pair construction problem that #63 raised for nodematch/absdiff, so the dyad-level infrastructure from there is reusable.
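
One way the pair construction might look, as a sketch only: make_synthetic_dyads() and all column names are hypothetical, and alters are drawn uniformly at random within the allowed age groups. The infrastructure from #63 may pair nodes differently (for example, weighting by the fitted mixing model), in which case that logic should replace the uniform draw here.

```r
set.seed(3)

# Toy synthetic node set (attribute names and proportions are illustrative)
nodes <- data.frame(
  age.grp = sample(1:5, 1000, replace = TRUE),
  race    = sample(1:3, 1000, replace = TRUE, prob = c(0.35, 0.25, 0.40))
)

make_synthetic_dyads <- function(nodes, n_dyads, matched) {
  # Sample egos with replacement from the synthetic population
  ego <- nodes[sample.int(nrow(nodes), n_dyads, replace = TRUE), ]
  # For each ego, draw one alter from the same age group (matched)
  # or from any other age group (non-matched)
  alter_idx <- vapply(ego$age.grp, function(g) {
    pool <- if (matched) which(nodes$age.grp == g) else which(nodes$age.grp != g)
    pool[sample.int(length(pool), 1)]
  }, integer(1))
  alter <- nodes[alter_idx, ]
  data.frame(
    index.age.grp = ego$age.grp,
    part.age.grp  = alter$age.grp,
    index.race    = ego$race,
    part.race     = alter$race,
    same.age.grp  = as.integer(ego$age.grp == alter$age.grp)
  )
}

matched_dyads    <- make_synthetic_dyads(nodes, 200, matched = TRUE)
nonmatched_dyads <- make_synthetic_dyads(nodes, 200, matched = FALSE)
```

The two data frames can be row-bound and passed as newdata to predict() on the stored duration model, provided their columns match the model's covariates.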

Interaction with sex.cess.mod

The current sex.cess.mod code path sets dissolution rates to 1 for the post-cessation age group. That post-processing should still run after the new synthetic-population aggregation. No semantic change needed; just make sure the override logic fires in the right order.
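
The ordering can be illustrated with a small sketch. finalize_durs(), the cess.age.grp argument, and the numbers in the table are hypothetical. A dissolution rate of 1 is assumed to correspond to mean.dur.adj = 1 under the geometric model, and the post-cessation group is assumed to be the highest index.age.grp.

```r
# Stratum-level output as it might look after the synthetic-population aggregation
# (values are made up for illustration)
durs <- data.frame(
  index.age.grp = 0:5,
  mean.dur.adj  = c(300, 120, 180, 240, 260, 280)
)

finalize_durs <- function(durs, sex.cess.mod,
                          cess.age.grp = max(durs$index.age.grp)) {
  # Apply the override last, so it is not overwritten by the new aggregation
  if (isTRUE(sex.cess.mod)) {
    durs$mean.dur.adj[durs$index.age.grp == cess.age.grp] <- 1
  }
  durs
}

out <- finalize_durs(durs, sex.cess.mod = TRUE)
out
```

With sex.cess.mod = FALSE the table passes through unchanged, so the same call can sit at the end of both code paths.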

Acceptance criteria

  • duration.method = "joint_lm" with method = "joint" in build_netstats produces stratum-level mean.dur.adj from per-synthetic-dyad predictions, not from ARTnet observations.
  • duration.method = "joint_lm" with method = "existing" continues to produce the current within-ARTnet estimates (since there's no synthetic population to predict against at that stage).
  • Dissolution coefs under a shifted synthetic population (e.g., race.prop = c(0.35, 0.25, 0.40)) visibly diverge from the current ARTnet-conditional estimates, paralleling the nodefactor/nodematch divergence that PR #68 (Use joint GLM g-computation for build_netstats target stats, #62) and PR #69 (Joint dyad-level modeling: nodematch + absdiff, #63 phases 1 & 2) showed for formation stats.
  • Backward-compat snapshot harness still matches 3/3 under method = "existing".

Where this lives

This is a direct continuation of the ARTnet joint g-comp refactor (#61-#63, #69, #71). Doesn't require any upstream tergm changes — the output still feeds the existing geometric-dissolution offset the same way.

Alternative location: the ARTnetPredict project, as part of the Phase 2+ forward-projection work. The logic there is: if we're going to post-stratify ARTnet to NHBS MSM demographics or to AMIS 2022-24 projections, the synthetic target population is more directly specified, and the per-dyad duration prediction matters more. In that context, this gap should be closed before the forward-projection is considered complete.
