Informative Censoring in Clinical Trials — Visual Cheat Sheet

Exploratory analyses · Missing data mechanisms · Imputation methods · R & SAS · Visual examples · ALT-FLOW II (NCT05686317) context


Section 1 — Missing data mechanisms

MCAR — Missing completely at random

P(R | Y_obs, Y_mis) = P(R)

Concept: Dropout has no relationship to any variable — observed or unobserved. A pure coin-flip mechanism. The missing patients look identical to completers on every dimension.

Clinical example: Patient misses year-3 visit because their flight was cancelled by a snowstorm. Their current health status is irrelevant to why they are missing.

[Figure: Patient population under MCAR (20% dropout scattered randomly). Legend: Present, Missing (random).]

Safe: Complete-case analysis is unbiased. Only loses statistical power proportional to % missing.

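# Little's MCAR test (numeric columns only)
library(naniar)
mcar_test(df[, c("bnp","pcwp","nyha","time")])
# H0 = MCAR; p > 0.05 means MCAR is not rejected (it is not proof of MCAR)
# SAS: PROC MI NIMPUTE=0; VAR ...; RUN;  /* lists missing-data patterns; does not itself run Little's test */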

MAR — Missing at random

P(R | Y_obs, Y_mis) = P(R | Y_obs)

Concept: Dropout depends on observed variables, but NOT on the unobserved value itself. Once you condition on what you can see, dropout is random. Untestable — always an assumption.

Clinical example: Older patients (age recorded) drop out more in ALT-FLOW II. But within any age group, which specific patients drop out is still random.

[Figure: Patient population under MAR (older patients drop out more). Legend: Present (younger), Present (older), Missing (older).]

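Recoverable: Multiple imputation using observed predictors of dropout produces valid estimates, provided the imputation model is correctly specified. MAR-based methods of this kind are the usual choice for the primary analysis in regulatory submissions.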

MNAR — Missing not at random

P(R | Y_obs, Y_mis) ≠ P(R | Y_obs)

Concept: Dropout depends on the unobserved value itself. Sicker patients leave because they are sick — and that deterioration is the very thing you are trying to measure.

Clinical example (ALT-FLOW II): Patients with rising BNP and worsening PCWP after the APTURE procedure stop attending visits. BNP IS the primary endpoint, so missingness is directly driven by the outcome.

[Figure: Patient population under MNAR (sickest patients missing). Legend: Present (stable), Present (at risk), Missing (deteriorating).]

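Generally biased if ignored, and the observed data alone cannot rule MNAR out. FDA reviewers generally expect MNAR sensitivity analyses for IDE trials with appreciable dropout (this sheet uses >10% as its working threshold).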
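The three mechanisms can be compared in a small simulation. The sketch below is illustrative only: the variable names (`age`, `log_bnp`), effect sizes, and dropout models are assumptions chosen for the demonstration, not ALT-FLOW II data. It reports the bias of the complete-case mean and of an age-adjusted mean under each mechanism.

```r
set.seed(1)
n <- 20000
d <- data.frame(age = rnorm(n, mean = 70, sd = 8))        # always observed
d$log_bnp <- 5 + 0.05 * (d$age - 70) + rnorm(n)           # outcome, true mean about 5

# Probability that the outcome is missing under each mechanism
p_miss <- list(
  MCAR = rep(0.2, n),                                     # pure coin flip
  MAR  = plogis(-1.5 + 0.15 * (d$age - 70)),              # depends on observed age
  MNAR = plogis(-1.5 + 1.0 * (d$log_bnp - 5))             # depends on the outcome itself
)

summarise_mechanism <- function(p) {
  observed <- runif(n) > p
  # Complete-case mean: ignores the dropouts entirely
  cc <- mean(d$log_bnp[observed])
  # Regression adjustment: fit outcome ~ age among completers,
  # then predict for everyone (valid if dropout is MAR given age)
  fit <- lm(log_bnp ~ age, data = d[observed, ])
  adj <- mean(predict(fit, newdata = d))
  c(complete_case = cc, age_adjusted = adj)
}

res   <- sapply(p_miss, summarise_mechanism)
truth <- mean(d$log_bnp)
round(res - truth, 3)   # bias relative to the full-data mean
```

Under these assumptions the complete-case mean should sit near the truth only for MCAR, the age adjustment should repair the MAR column, and neither estimate should recover the truth under MNAR.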

Detection decision flow — run analyses in this order

A1 Baseline SMD table → SMD > 0.10? → A2 KM + tick marks → Marks cluster? → A3 Reverse KM → Hazard spike? → A4 Dropout regression → BNP/PCWP significant? → MNAR likely → Run A5–A9 + Joint model + Tipping point

Section 2 — Detection analyses (A1–A4)

A1 · Baseline table — SMD by censoring status Priority: Critical

Concept: Compare baseline characteristics between completers and dropouts. Use Standardized Mean Difference (SMD) — works like an effect size across variables. SMD >0.10 = meaningful imbalance = evidence against MCAR.

ALT-FLOW II: Focus on BNP, PCWP at exercise, NYHA class, KCCQ-OSS, 6MWT — these are both eligibility criteria AND likely dropout drivers.

[Figure: SMD love plot, completers vs dropouts (simulated ALT-FLOW II). Series: Unadjusted, After MI. Vertical line marks the 0.10 threshold.]

library(tableone)
vars <- c("bnp","pcwp_ex","nyha","kccq_oss",
          "sixmwt","egfr","afib","lvef")
tab <- CreateTableOne(vars=vars,
  strata="censored", data=df, test=TRUE)
print(tab, smd=TRUE)  # Flag: SMD > 0.10

BNP (SMD=0.41) and PCWP (SMD=0.38) are themselves the outcomes being measured, so their imbalance rules out MCAR and makes outcome-driven (MNAR) dropout plausible in ALT-FLOW II.
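For a continuous variable, the SMD is usually computed as the difference in group means divided by the pooled standard deviation. A minimal base-R sketch of that definition follows; the function name `smd` and the toy BNP values are made up for illustration.

```r
# Standardized mean difference for a continuous variable:
# difference in group means divided by the pooled standard deviation
smd <- function(x, group) {
  x1 <- x[group == 1]
  x0 <- x[group == 0]
  pooled_sd <- sqrt((var(x1, na.rm = TRUE) + var(x0, na.rm = TRUE)) / 2)
  (mean(x1, na.rm = TRUE) - mean(x0, na.rm = TRUE)) / pooled_sd
}

# Toy data (made up): dropouts have higher BNP than completers
bnp      <- c(180, 220, 260, 300, 340, 250, 310, 370, 430)
censored <- c(0,   0,   0,   0,   0,   1,   1,   1,   1)
smd(bnp, censored)   # far above the 0.10 flag
```

Because the SMD is scale-free, the same 0.10 threshold can be applied to BNP, PCWP, and 6MWT even though their units differ.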

A2 · Kaplan-Meier curves + censoring tick marks Priority: Critical

Concept: Standard KM curve with vertical tick marks (|) at each censoring time. Random censoring = marks scattered evenly across the curve. Informative censoring = marks cluster at specific survival levels or time points.

ALT-FLOW II signal: Watch for a spike of tick marks at month 12 in the sham arm — this is where unblinding occurs and sham patients may seek crossover.

[Figure: KM survival curves, device vs sham arm (5-year follow-up). Tick marks cluster at month 12 in the sham arm, which suggests informative censoring.]

Risk table: Device: 50–49–47–44–40–34–20 | Sham: 50–48–43–38–26–17–7
Cum. censor Sham: 0–1–3–8–16–24–32 — jump of 5 at month 12 = burst censoring

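library(survival)
library(ggsurvfit)
# survfit2() is the ggsurvfit wrapper around survfit() intended for plotting
survfit2(Surv(time,event)~arm, data=df) |>
  ggsurvfit() +
  add_censor_mark(shape="|", size=2.5) +   # tick at each censoring time
  add_risktable(risktable_stats=c(
    "n.risk","cum.censor"))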

A3 · Reverse KM — censoring-as-event hazard Priority: High

Concept: Flip the event indicator: treat censoring as "1" (the event). The resulting curve shows WHEN the hazard of being censored peaks. A flat curve = random dropout spread evenly. A sharp step = burst censoring at a specific time = informative.

Mechanism: In progressive disease like HFpEF, dropout hazard naturally accelerates in years 2–4 when patients deteriorate. A step at month 12 specifically in the sham arm = unblinding-triggered crossover MNAR.

[Figure: Reverse KM, P(not yet censored) over time by arm. Device arm: gradual decline (MAR consistent). Sham arm: step at 12 months (MNAR signal).]

Sharp step at month 12 sham arm = 5 patients censored in 1 month. Device arm shows no corresponding step. This asymmetry is the key MNAR signal.

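library(survival)
# Flip the indicator so that censoring becomes the "event"
df$cens_event <- 1 - df$maccre_event
rev_km <- survfit(
  Surv(time, cens_event) ~ arm, data=df)
plot(rev_km, xlab="Months", ylab="P(not yet censored)", lty=1:2)
abline(v=12, lty=2, col="red")  # month-12 unblinding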

A4 · Logistic regression of dropout on predictors Priority: Critical — formal MNAR test

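Concept: Fit logistic regression with "dropped out = yes/no" as outcome. Predictors are TIME-VARYING clinical values at the last available visit. If BNP or PCWP change predicts dropout, MCAR is rejected and dropout is tied to the disease course. Because these predictors are observed, the result on its own is still compatible with MAR; it raises a strong suspicion of MNAR, which no observed-data test can prove, because the unobserved values after the last visit probably continue the same trend.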

Key rule: Use last-observed values before censoring, not baseline. Never include post-treatment colliders.

Forest plot — dropout odds ratios (ALT-FLOW II simulated)

BNP at last visit (OR 2.31) and PCWP non-response (OR 1.84) significantly predict dropout. Both are the primary outcomes, so MNAR is a serious concern and the joint model (A8) should be run.

mod <- glm(dropped_out ~
  bnp_last + pcwp_6m_change +
  nyha_last + kccq_last +
  sixmwt_last + arm + site,
  family=binomial(), data=df)
exp(cbind(OR=coef(mod), confint(mod)))

Section 3 — Mechanism analyses (A5–A7)

A5 · Longitudinal biomarker trajectory divergence — the MNAR visual signature FDA exhibit

Concept: Plot the mean BNP/PCWP trajectory over time, separately for completers and eventual dropouts. In an MNAR world (which is expected in ALT-FLOW II), the dropout group's BNP rises in the months BEFORE their censoring date, while completers' BNP stays stable or falls. This divergence happening BEFORE the dropout event is the strongest visual evidence of informative dropout.

Key insight: If trajectories were identical right up to censoring time (and only then diverged due to the event), censoring would be non-informative. It is the pre-censoring divergence that damns the data.

[Figure: BNP trajectory by dropout group (completers vs dropouts), one panel each for the device arm and the sham arm.]

Both arms show BNP rising in the dropout group starting at month 6, diverging sharply from completers by month 12. This divergence BEFORE censoring = MNAR. Report this plot as Figure 1 in the CSR missingness section.

library(ggplot2)
df_long |> ggplot(aes(x=visit_month, y=log_bnp, colour=dropout_group)) +
  geom_line(aes(group=id), alpha=0.25) +                       # individual trajectories
  stat_summary(aes(group=dropout_group), fun=mean, geom="line", linewidth=1.5) +
  stat_summary(aes(group=dropout_group, fill=dropout_group),
    fun.data=mean_se, geom="ribbon", alpha=0.15, colour=NA) +  # mean ± SE band per group
  facet_wrap(~arm) + geom_vline(xintercept=12, linetype="dashed")

A7 · 6MWT / KCCQ functional decline dropout

Concept: Does current functional capacity at visit N predict missing visit N+1? Patients with very low 6MWT (<100m) may be physically unable to travel to clinic. This creates a systematic optimistic filter — only the mobile patients remain observable.

% missing next visit by current 6MWT distance

Patients walking <100m are 13× more likely to miss the next visit than those walking >350m. MNAR through functional decline.

mod_miss <- glm(
  sixmwt_miss_next ~
  sixmwt_current +
  kccq_current + arm,
  family=binomial(), data=df)
# Negative sixmwt_current coefficient with small p: lower 6MWT predicts a missed next visit

Section 4 — MNAR adjustment analyses (A8–A9)

A8 · Joint model — MNAR-adjusted treatment estimate JMbayes2 · Regulatory gold standard

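Concept: Two sub-models are fitted simultaneously and linked through shared random effects: (1) a mixed model for each patient's BNP trajectory, and (2) a Cox-type model for the dropout hazard. The association parameter alpha (α) quantifies how strongly the underlying BNP trajectory drives dropout. If α is clearly non-zero, dropout is informative for BNP, and the treatment effect on the BNP trajectory (the visit × arm term of the mixed model) is estimated under that shared-parameter MNAR model. The arm hazard ratio in the survival sub-model describes dropout, not efficacy; an adjusted HR for a clinical event requires that event to be the survival outcome.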

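library(nlme)
library(survival)
library(JMbayes2)
# Longitudinal sub-model: log(BNP) trajectory per patient
lme_bnp <- lme(log_bnp ~ visit*arm+age, random=~visit|id, data=df_long)
# Survival sub-model: time to dropout; df holds one row per patient
cox_drop <- coxph(Surv(t_drop,dropped)~arm+age, data=df, x=TRUE)
jm_fit <- jm(cox_drop, lme_bnp, time_var="visit",
  functional_forms=~value(log_bnp))
summary(jm_fit)  # alpha: BNP-dropout association; visit:arm: treatment effect on log(BNP)
# SAS: PROC NLMIXED is possible but complex; prefer R/JMbayes2 for joint models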

Output to report: (1) alpha estimate and CI, which quantify the BNP–dropout association. (2) The treatment effect from the joint model's longitudinal sub-model, used as a sensitivity analysis when A4 points to informative dropout.
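For orientation, a common way to write a shared-parameter joint model with a current-value association (the structure implied by `functional_forms=~value(log_bnp)`) is sketched below. The symbols are assumptions introduced here for illustration: $y_i(t)$ is the observed log(BNP), $m_i(t)$ the underlying trajectory, $b_i$ the patient-level random effects, and $w_i$ the baseline covariates of the dropout model.

```latex
\begin{aligned}
y_i(t) &= m_i(t) + \varepsilon_i(t), \qquad
m_i(t) = x_i(t)^\top \beta + z_i(t)^\top b_i, \qquad
b_i \sim N(0, D),\quad \varepsilon_i(t) \sim N(0, \sigma^2) \\
h_i(t) &= h_0(t)\, \exp\{\gamma^\top w_i + \alpha\, m_i(t)\}
\end{aligned}
```

With $\alpha = 0$ the dropout hazard no longer depends on the latent trajectory and the two sub-models decouple, which is why the size of $\alpha$ is read as a measure of how informative dropout is.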

A9 · Tipping point + Reference-based imputation Mandatory FDA IDE sensitivity

Concept: Systematically assume censored patients did δ units WORSE than imputed under MAR. Run primary analysis across a range of δ values. Find the tipping point — the δ at which your conclusion reverses. Then argue clinically whether that δ is plausible given HFpEF natural history.

Tipping point — PCWP treatment difference by delta assumption

Tipping point occurs at δ=6 mmHg. A delta of 6 mmHg worsening for ALL censored patients is clinically implausible given HFpEF progression rates — supporting robustness.
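Once the δ sweep has been run, locating the tipping point is a simple lookup. The sketch below assumes the sweep results have been collected into a data frame with one row per δ; the p-values shown are made up for illustration, and `tipping_point` is a hypothetical helper, not part of any package.

```r
# Hypothetical sweep results: one row per delta (values are made up)
sweep <- data.frame(
  delta   = 0:10,
  p_value = c(0.001, 0.002, 0.004, 0.008, 0.015, 0.031,
              0.062, 0.110, 0.180, 0.270, 0.380)
)

# Smallest delta at which the result is no longer significant;
# NA if significance is never lost over the range examined
tipping_point <- function(delta, p_value, alpha = 0.05) {
  lost <- delta[p_value >= alpha]
  if (length(lost) == 0) NA_real_ else min(lost)
}

tipping_point(sweep$delta, sweep$p_value)
```

An `NA` result means the conclusion held over the whole grid, in which case the grid should be widened or the robustness stated relative to the largest δ tested.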

Jump-to-Reference (J2R) — most conservative FDA strategy

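J2R assumes that device-arm dropouts lose the device benefit and their KCCQ reverts to the sham-arm trajectory; sham-arm dropouts are imputed from their own arm. Device benefit remains significant even under J2R, which is the strongest robustness evidence.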

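# Delta tipping point
library(mice); library(purrr)
imp  <- mice(df, m=50, seed=42, printFlag=FALSE)  # impute once under MAR
miss <- is.na(df$pcwp)
run_delta <- function(d) {
  fits <- lapply(seq_len(imp$m), function(i) {
    comp <- complete(imp, i)
    comp$pcwp[miss] <- comp$pcwp[miss] + d   # higher PCWP = worse; flip sign if coded otherwise
    lm(pcwp~arm+age, data=comp) })           # refit on the shifted data
  summary(pool(as.mira(fits)))               # Rubin's rules on the shifted fits
}
map(0:10, run_delta)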

# J2R via rbmi
library(rbmi)
drw <- draws(df_long, df_ice,
  vars=set_vars(subjid="id",visit="mo",outcome="kccq",
    group="arm",covariates=c("age","bnp")),
  method=method_bayes(2000))
imp <- impute(drw, references=c("APTURE"="Sham","Sham"="Sham"))
pool(analyse(imp, fun=ancova, vars=...))
# SAS: %MI_NCOD(DSN=df_long, STRATEGY=J2R, NIMPUTE=50);

Section 5 — Missing data imputation methods — complete reference

❌ Methods to AVOID as primary analysis — all produce biased standard errors

AVOID PRIMARY Mean imputation

Replace missing with variable mean. Distorts distribution, shrinks variance, underestimates SE. Confidence intervals too narrow — p-values falsely significant.

df$x[is.na(df$x)] <-
  mean(df$x, na.rm=TRUE)
# SE too small — p-values WRONG
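The variance shrinkage can be checked directly. In this minimal sketch (simulated data, all values hypothetical), 60 of 200 values are set to missing and then filled with the observed mean; the standard deviation falls by exactly the factor sqrt((n_obs - 1)/(n - 1)), because the imputed points add no spread.

```r
set.seed(2)
x <- rnorm(200, mean = 50, sd = 10)   # hypothetical complete variable
x_mis <- x
x_mis[sample(200, 60)] <- NA          # 30% set missing at random

# Mean imputation: every missing value gets the observed mean
x_imp <- ifelse(is.na(x_mis), mean(x_mis, na.rm = TRUE), x_mis)

sd_obs <- sd(x_mis, na.rm = TRUE)     # SD among the 140 observed values
sd_imp <- sd(x_imp)                   # SD after mean imputation
c(observed = sd_obs, mean_imputed = sd_imp, ratio = sd_imp / sd_obs)
```

The analysis then treats all 200 values as real data with this artificially small spread, which is where the too-narrow confidence intervals come from.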

AVOID PRIMARY LOCF

Last observation carried forward. Clinically wrong for progressive HF, where disease worsens after dropout. Regulatory guidance on missing data (e.g. the 2010 National Research Council report commissioned by FDA) discourages it as a primary analysis. OK as one of many sensitivity checks only.

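library(zoo)
# Carry forward within each patient; assumes rows are sorted by id, then visit
df$x_locf <- ave(df$x, df$id,
  FUN=function(v) na.locf(v, na.rm=FALSE))
# Sensitivity only, not primary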

AVOID PRIMARY Single regression imputation

Predict missing from covariates once. Treats imputed value as real observed data. Completely ignores imputation uncertainty — SE too small — CIs too narrow. Always use MULTIPLE imputation.

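mod <- lm(x~age+bnp,
  data=df[!is.na(df$x),])
# Fill only the missing values; observed x is kept
df$x_imp <- ifelse(is.na(df$x),
  predict(mod, newdata=df), df$x)
# WRONG as a final analysis: use mice() instead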

Complete imputation methods reference

Method	Mechanism	Bias-free?	Concept + Example	R code	SAS code	Regulatory status
Complete-case analysis Listwise deletion	MCAR	✓ Valid	Drop any patient with missing data. Valid + unbiased under MCAR only. Loses power but not bias. Safe when Little's MCAR test p>0.05 and SMD table shows balance. Example: 500 patients, 22% missing. Analyse n=390. HR estimate unbiased but wider CI.	df_cc <- df[complete.cases(df),] cox <- coxph( Surv(t,ev)~arm+age, data=df_cc)	PROC PHREG DATA=df; WHERE NMISS(bnp,nyha)=0; MODEL t*ev(0) =arm age; RUN;	Acceptable if MCAR confirmed
Multiple imputation — MICE mice package	MAR	✓ Valid	Create m≥20 complete datasets using iterative conditional models (chained equations). Fit analysis in each separately. Pool estimates and variances using Rubin's rules. Rule: m ≥ fraction_missing × 100. For 22% missing → m ≥ 22. Use m=50 for publications. FMI >0.30 = flag. Meng 1994: Always include outcome in imputation model — omitting it biases toward null.	library(mice) imp <- mice(df, m=50, seed=42, method=c( bnp="pmm", nyha="polyreg", died="logreg"), maxit=50) fit <- with(imp, coxph(Surv(t,ev) ~arm+age)) pool(fit)	PROC MI DATA=df NIMPUTE=50; CLASS nyha died; FCS LOGISTIC(died) REGPMM(bnp=50); RUN; PROC PHREG; BY _Imputation_; RUN; PROC MIANALYZE;	FDA primary preference
FIML Full info. max likelihood	MAR	✓ Valid	Maximise the likelihood over the observed data only, integrating out missing values. No imputed datasets needed. Best for continuous outcomes and SEM / longitudinal models. Mixed models (PROC MIXED, lme) are likelihood-based and use every observed visit; their default estimator is REML, so request ML explicitly when FIML is wanted. Example: Mixed model for repeated BNP measures. Patients with partial follow-up contribute all observed visits.	library(lavaan) sem(model, data=df, missing="fiml", estimator="MLR") # For LME (ML requested; default is REML): lme(y~time*arm, random=~time\|id, method="ML")	PROC MIXED DATA=df METHOD=ML; CLASS id arm; MODEL y=time arm time*arm/solution; REPEATED / SUBJECT=id TYPE=UN; RUN; /* METHOD=ML requests full ML; the default is REML */	Good for longitudinal continuous
Inverse probability weighting (IPW) WeightIt package	MAR	✓ Valid	Model P(observed \| covariates). Upweight each observed patient by 1/P(observed) to represent those who dropped out. Use stabilised weights to prevent extreme values. Truncate at 99th percentile. Example: Older patients more likely to drop out. Surviving older patients get higher weights to represent all older patients including dropouts.	library(WeightIt) w <- weightit( observed~age+bnp+nyha, data=df, method="ps", estimand="ATE") df$wts <- w$weights coxph(Surv(t,ev)~arm, data=df, weights=wts, robust=TRUE)	PROC LOGISTIC DATA=df; MODEL obs=age bnp nyha; OUTPUT OUT=p P=ps; RUN; DATA df2; SET p; IPW=1/ps; RUN;	Good alternative to MICE
Pattern mixture + delta (tipping point) mice + custom delta loop	MNAR	⚠ Assumption-based	Stratify by dropout pattern. Add δ shift to imputed values for censored patients. Sweep δ from 0 (MAR) to implausible extremes. Find tipping point (conclusion reversal). Report clinical plausibility argument. ALT-FLOW II: tipping point at δ=6 mmHg on PCWP. Clinical experts judge δ>3 mmHg implausible, so the finding is robust up to δ=5.	imp <- mice(df, m=50, seed=42) miss <- is.na(df$pcwp) run_delta <- function(d){ fits <- lapply(seq_len(imp$m), function(i){ comp <- complete(imp,i) comp$pcwp[miss] <- comp$pcwp[miss] + d lm(pcwp~arm+age, data=comp) }) summary(pool(as.mira(fits))) } map(0:10, run_delta)	PROC MI DATA=df NIMPUTE=50 OUT=imp_data; VAR arm age pcwp; MONOTONE REG(pcwp=arm age); RUN; DATA delta_adj; SET imp_data; IF dropped=1 THEN pcwp=pcwp+δ; RUN;	FDA mandatory for IDE
Reference-based (J2R / CR / CIR) rbmi package	MNAR	⚠ Conservative by design	ICH E9 R1 framework. After dropout: J2R=mean trajectory jumps to the control arm (most conservative, FDA preferred); CR=copy reference, imputed as if the patient had been in the control arm throughout; CIR=copy increments in reference, keeps the benefit accrued before dropout and then follows the control arm's visit-to-visit changes. ALT-FLOW II J2R: after device-arm patients drop out, their KCCQ is imputed from the mean sham-arm KCCQ trajectory; sham-arm dropouts are imputed from their own arm. Device effect still significant under J2R, the strongest robustness argument.	library(rbmi) drw <- draws(df_long, df_ice, vars=set_vars( subjid="id", visit="mo", outcome="kccq", group="arm", covariates=c( "age","bnp")), method=method_bayes( 2000)) imp <- impute(drw, references=c( "APTURE"="Sham", "Sham"="Sham")) pool(analyse(imp,...))	%MI_NCOD( DSN=df_long, SUBJECT=id, RESPONSE=kccq, GROUP=arm, TIME=month, STRATEGY=J2R, NIMPUTE=50); /* Also: CR, CIR */	FDA/EMA preferred
Joint model (shared parameter) JMbayes2 package	MNAR	⚠ Model-dependent	Simultaneously models longitudinal biomarker + dropout hazard, linked through shared random effects; the association parameter α measures how strongly the biomarker trajectory drives dropout. The treatment effect on the biomarker (time × arm term) is then estimated under this MNAR model. Gold standard when rich longitudinal data available. ALT-FLOW II: α=0.82 [0.61,1.04] p<0.001, so each 1-unit rise in log(BNP) raises dropout hazard by e^0.82=2.3×. The arm HR in the survival sub-model describes dropout, not efficacy.	library(JMbayes2) lme_fit <- lme( log_bnp~time*arm+age, random=~time\|id, data=df_long) cox_fit <- coxph( Surv(t,drop)~arm+age, data=df, x=TRUE) jm_fit <- jm(cox_fit, lme_fit, time_var="time", functional_forms= ~value(log_bnp)) summary(jm_fit)	PROC NLMIXED DATA=df; /* Complex: prefer JMbayes2 via R */ PROC MCMC DATA=df; /* custom likelihood */ RUN;	Regulatory gold standard
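The IPW row of the table can be illustrated without any add-on package. The following base-R sketch uses simulated data (variable names, effect sizes, and the dropout model are all assumptions for the demonstration) in which older patients drop out more, then compares the complete-case mean with the inverse-probability-weighted mean.

```r
set.seed(3)
n   <- 20000
age <- rnorm(n, mean = 70, sd = 8)
y   <- 5 + 0.05 * (age - 70) + rnorm(n)                    # outcome, partly driven by age
observed <- rbinom(n, 1, plogis(1.5 - 0.15 * (age - 70)))  # older patients drop out more

# Step 1: model the probability of being observed from covariates
ps <- fitted(glm(observed ~ age, family = binomial()))

# Step 2: stabilised weights, truncated at the 99th percentile among completers
w   <- mean(observed) / ps
cap <- quantile(w[observed == 1], 0.99)
w   <- pmin(w, cap)

# Step 3: weighted analysis using completers only
keep     <- observed == 1
cc_mean  <- mean(y[keep])
ipw_mean <- weighted.mean(y[keep], w[keep])
round(c(truth = mean(y), complete_case = cc_mean, ipw = ipw_mean), 3)
```

Truncation trades a little bias for stability: a handful of very old completers would otherwise carry very large weights and dominate the estimate.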

Section 6 — Rubin's rules, MICE method selector, regulatory quick reference

Rubin's rules — pooling MI estimates

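Q̄ = (1/m) Σ Q̂_i   [pooled estimate; Q̂_i = estimate from imputed dataset i]
W = (1/m) Σ U_i   [within-imputation variance; U_i = squared SE from dataset i]
B = (1/(m−1)) Σ (Q̂_i − Q̄)²   [between-imputation variance]
T = W + (1 + 1/m)·B   [total variance]
FMI ≈ (1 + 1/m)·B / T   [fraction of missing information]
df ≈ (m−1)·(1 + W/((1 + 1/m)·B))²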
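The pooling formulas can be written out in a few lines of base R. This is a sketch for checking results by hand rather than a replacement for `mice::pool()`; the function name `pool_rubin` and the example estimates are made up for illustration.

```r
# Pool m point estimates (q) and their squared standard errors (u)
pool_rubin <- function(q, u) {
  m     <- length(q)
  q_bar <- mean(q)                    # pooled estimate
  W     <- mean(u)                    # within-imputation variance
  B     <- var(q)                     # between-imputation variance
  T_var <- W + (1 + 1/m) * B          # total variance
  fmi   <- (1 + 1/m) * B / T_var      # approximate fraction of missing information
  df    <- (m - 1) * (1 + W / ((1 + 1/m) * B))^2
  c(estimate = q_bar, se = sqrt(T_var), fmi = fmi, df = df)
}

# Made-up log hazard ratios and variances from m = 5 imputed datasets
q <- c(-0.42, -0.38, -0.45, -0.40, -0.35)
u <- c(0.031, 0.029, 0.033, 0.030, 0.032)
pool_rubin(q, u)
```

When the imputed datasets agree closely, B is small relative to W, so the FMI is low and the degrees of freedom are large.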

FMI interpretation — fraction of missing information

Rule of thumb: m ≥ fraction_missing × 100. For 22% missing → m ≥ 22. Use m=50 for publications.

FMI >0.30: That variable has high imputation uncertainty — flag in the CSR, consider auxiliary variables.

pooled <- pool(fit_list)
summary(pooled,
  conf.int=TRUE,
  exponentiate=TRUE)
# fraction of missing information
# per coefficient:
pooled$pooled$fmi

MICE imputation method selector by variable type

Variable type	Method	Code
Continuous (BNP, PCWP, 6MWT)	Predictive mean matching	`"pmm"`
Binary (death, AF, re-hospitalisation)	Logistic regression	`"logreg"`
Unordered categorical (site, sex)	Polytomous logistic	`"polyreg"`
Ordered (NYHA class I–IV)	Proportional odds	`"polr"`
Count / right-skewed (BNP)	PMM on log scale	`"pmm"`
Time-to-event	NEVER impute directly	`""`
Treatment arm	NEVER impute	`""`

Meng 1994 rule: Always include the outcome variable in the imputation model. Excluding it biases estimates toward the null.

Convergence check: After running mice(), always call plot(imp) to inspect trace plots. The mean and SD chains should mix well with no visible trend or separation.

# Check imputation model spec
meth <- make.method(df)
meth[c("time","arm")] <- ""
pred <- make.predictorMatrix(df)
pred[,"id"] <- 0  # never use ID
imp <- mice(df, method=meth,
  predictorMatrix=pred)

Regulatory quick reference — FDA IDE + ICH E9 R1

FDA expectations for ALT-FLOW II (G180033)

Primary analysis: MICE (MAR assumption, m≥50, all clinical predictors included)

Sensitivity 1: Tipping point (δ sweep 0–10 mmHg on PCWP, clinically calibrated)

Sensitivity 2: Jump-to-Reference via rbmi (most conservative)

Sensitivity 3: Joint model (JMbayes2) — if A4 confirms MNAR

ICH E9 R1 estimand: Pre-specify crossover handling before unblinding: "while on originally assigned treatment" OR "hypothetical no-crossover" OR "composite."

Convergence argument (strongest): If MICE + tipping point + J2R + joint model all show significant treatment effect — robust evidence for FDA. Report this explicitly in CSR Section 9.3.

Key references: Rubin 1987 · Little & Rubin 2019 · van Buuren 2018 · ICH E9 R1 (2019) · White et al. Stat Med 2011 · Carpenter & Smuk 2021 (rbmi)

Informative Censoring Cheat Sheet — Visual Edition · Clinical Biostatistics Reference · ALT-FLOW II (NCT05686317) context · R: survival, ggsurvfit, mice, JMbayes2, rbmi, naniar, WeightIt, AIPW · SAS: PROC MI, PROC PHREG, PROC MIXED, %MI_NCOD
