Comparing treatment effect in depression trials: Mixed Model for Repeated Measures vs Linear Mixed Model
Gijs Santen(1), Meindert Danhof(1), Oscar Della Pasqua(1,2)
(1) Division of Pharmacology, LACDR, Leiden University, Leiden, the Netherlands. (2) Department of Clinical Pharmacokinetics/Modeling & Simulation, GlaxoSmithKline, Greenford, UK
Objectives: Depression trials may fail in 50% of cases even when effective doses of an antidepressant drug are administered. A high placebo effect, large variability between patients and inadequate endpoints are commonly given as reasons for this high failure rate. Investigations into alternative endpoints, novel study designs and statistical methods using historical data could therefore lead to a reduction in the failure rate of clinical trials with antidepressant drugs. In previous work we explored the sensitivity of the Hamilton Depression Rating Scale (HAM-D) to treatment effect. The current work focuses on the impact of the standard statistical analysis used to evaluate effect size in clinical trials, the Mixed Model for Repeated Measures (MMRM), and compares it with a Linear Mixed Model (LMM).
Methods: Data from several double-blind, randomised, placebo-controlled trials in Major Depression were extracted from GlaxoSmithKline's clinical database. The MMRM models repeated measures within a single individual as multivariate data with an unstructured covariance matrix that is assumed to be identical across individuals. Treatment-time and baseline-time interactions are modelled as fixed effects. The LMM models HAM-D response with the same treatment-time and baseline-time interactions as fixed effects, but includes a subject-specific effect. MMRM analysis was performed in SAS using PROC MIXED, whilst the LMM was fitted both in SAS using PROC MIXED and in WinBUGS, with missing data replaced using the posterior predictive distribution for the specific individual.
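The structural difference between the two models can be illustrated with a minimal sketch. It assumes that the subject-specific effect is a single random intercept, which the abstract does not state, and the visit count and variance values are hypothetical rather than taken from the trials. Under that assumption the LMM implies a two-parameter marginal covariance with equal correlation between all visits, whereas the unstructured matrix of the MMRM estimates every variance and covariance separately.

```python
import numpy as np


def random_intercept_cov(n_visits, var_subject, var_residual):
    """Marginal covariance of one patient's visits under a random-intercept LMM.

    Every pair of visits shares the subject variance; the residual variance
    is added on the diagonal only.
    """
    shared = var_subject * np.ones((n_visits, n_visits))
    return shared + var_residual * np.eye(n_visits)


def n_unstructured_params(n_visits):
    """Number of free parameters in an unstructured covariance matrix."""
    return n_visits * (n_visits + 1) // 2


# Hypothetical values for illustration only (not estimates from the trials).
n_visits = 6
cov = random_intercept_cov(n_visits, var_subject=20.0, var_residual=16.0)

# Correlation between any two distinct visits implied by the random intercept.
within_subject_corr = cov[0, 1] / cov[0, 0]

print(cov)
print("within-subject correlation:", within_subject_corr)
print("unstructured parameters:", n_unstructured_params(n_visits))
```

With six visits the unstructured matrix has 21 free parameters shared by all patients, while the random-intercept form has two and attributes the correlation between visits to a patient-level effect.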
Results: Re-analysis of study data revealed minor differences between the estimates obtained with the two methods. However, when diagnostic plots are evaluated, it is clear that the MMRM shows a bias relative to the LMM. Model bias was especially evident across the range of responses in the observed vs predicted plots and over the time course of response of an individual patient. On a few occasions, MMRM resulted in incorrect estimates of the significance level and consequently in wrong conclusions about treatment effect.
Conclusions: The analysis of HAM-D data with an identical unstructured covariance matrix across individuals may not be appropriate. The use of the LMM with subject-specific random effects and missing data replacement based on posterior predictive distributions may offer a better alternative to current methodology for the assessment of treatment effect in depression.