--- title: "Longitudinal MAIHDA: intersectional inequalities over time" author: "Hamid Bulut" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Longitudinal MAIHDA: intersectional inequalities over time} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 4.5 ) library(MAIHDA) ``` ## From a snapshot to a trajectory Standard (cross-sectional) MAIHDA answers "do intersectional strata differ?" at a single point in time, summarised by the between-stratum VPC. With repeated measurements we can ask a richer question: *"do strata differ in how they **change** over time?"* -- and decompose those trajectory differences into an additive part (the dimensions' main effects and their interactions with time) and a multiplicative part (an intersectional trajectory beyond additive). This is the longitudinal MAIHDA of Bell, Evans, Holman & Leckie (2024). It is a 3-level growth-curve model with measurement occasions (level 1) within individuals (level 2) within intersectional strata (level 3), with a random intercept and a slope on time at both the individual and stratum levels: $$ y_{tij} = \beta_0 + \beta_1 t + \underbrace{(u_{0j} + u_{1j}\,t)}_{\text{stratum}} + \underbrace{(v_{0ij} + v_{1ij}\,t)}_{\text{individual}} + e_{tij}. $$ Because the stratum random effect now has a slope, the between-stratum variance becomes a **function of time**: $$ \operatorname{Var}_S(t) = \sigma^2_{u0} + 2t\,\sigma_{u01} + t^2 \sigma^2_{u1}, \qquad \operatorname{VPC}_S(t) = \frac{\operatorname{Var}_S(t)} {\operatorname{Var}_S(t) + \operatorname{Var}_I(t) + \sigma^2_e}. $$ ## The data The bundled `maihda_long_data` is a simulated panel: 600 people, each measured over five waves, within 12 strata of gender $\times$ ethnicity $\times$ education. The trajectory differences are constructed to be mostly additive with one genuine interaction, so the decomposition below is interpretable. ```{r data} data(maihda_long_data) head(maihda_long_data) ``` It is long format, one row per person-occasion, with a person id (`id`) and a numeric time (`wave`). ## Fitting and the time-varying VPC Supply `id` and `time` to `fit_maihda()`. You only need to write the strata shorthand `(1 | var1:var2)`; the growth slopes on time are added for you. ```{r fit} m <- fit_maihda(wellbeing ~ wave + (1 | gender:ethnicity:education), data = maihda_long_data, id = "id", time = "wave") summary(m) ``` `summary()` reports the VPC at the baseline (reference time, the earliest wave) and the full trajectory of the VPC across the observed times. Plot it: ```{r vpc-traj} plot(m, type = "vpc_trajectory") # VPC(t), with the reference time marked plot(m, type = "trajectories") # predicted per-stratum mean trajectories ``` A rising VPC trajectory means the strata fan out over time (intersectional inequality grows). ## Decomposing the trajectory: additive vs. multiplicative `maihda(decomposition = "longitudinal")` (selected automatically when `id`/`time` are supplied) fits a null growth model and an adjusted growth model. The adjusted model adds the dimensions' main effects and their interactions with time (`dim:time`), so the stratum-level variance it leaves behind is the interaction beyond additive. ```{r decomp} a <- maihda(wellbeing ~ wave + (1 | gender:ethnicity:education), data = maihda_long_data, id = "id", time = "wave", decomposition = "longitudinal") a$pcv ``` The PCV is reported separately for the two pieces of the trajectory: * **`PCV_intercept`** -- the share of the *baseline* between-stratum inequality explained by the additive main effects. * **`PCV_slope`** -- the share of the *trajectory* (slope) between-stratum inequality explained additively. A high `PCV_slope` is the "trajectories are mostly additive" finding; the remainder is the multiplicative/interaction part. ```{r pcv-traj} plot(a, type = "pcv_trajectory") # the additive share over time ``` ## Scope and cautions * **Identifiability.** A stratum random slope needs enough occasions per stratum; sparse strata or few waves can give a singular lme4 fit (surfaced in the fit diagnostics) -- the brms engine handles this better. * **Not a single number.** The VPC is time-varying, so `extract_between_variance()` and `calculate_pvc()` deliberately refuse a longitudinal model; use the longitudinal decomposition above. * **Out of scope (v1).** Design-weighted, contextual (stratum $\times$ place $\times$ time), and `wemix`/`ordinal` longitudinal models are not yet supported. ## Reference Bell, A., Evans, C., Holman, D., & Leckie, G. (2024). Extending intersectional multilevel analysis of individual heterogeneity and discriminatory accuracy (MAIHDA) to study individual longitudinal trajectories, with application to mental health in the UK. *Social Science & Medicine*, 351, 116955.