An item response theory model with bounded integer subcomponents to describe the Mayo Clinic subscores in patients with ulcerative colitis
Jurgen Langenhorst(1),* Anita Moein(2),* Sami Ullah(1), Matts Kågedal(2), Mats Magnusson(1), Nastya Kassir(2)
(1)Pharmetheus AB, Uppsala, Sweden, (2)Genentech, Inc., South San Francisco, CA, USA *Contributed equally
Introduction: Clinical efficacy of drugs in development for the treatment of ulcerative colitis (UC) is measured through the Mayo Clinic Score (MCS). The MCS is the sum of the rectal bleeding score (RB), stool frequency score (SF), physician’s global assessment score (PGA), and an endoscopic score, each an integer from 0 to 3. Key clinical endpoints, such as “response” and “remission”, rely on all four subscores being available. However, endoscopies are burdensome for patients and are typically performed only at the beginning and end of each study (period), while RB, SF and PGA are available more frequently. Drug development in UC would benefit from a longitudinal model that leverages the information the three other MCS subscores carry about the clinical status of patients prior to the endoscopy. Item response theory (IRT) models[1] are attractive as they allow the shared information to be used and predictions of all subscores to be made at each observation time.
Objectives: To propose an IRT model able to predict the longitudinal subscores of MCS and the remission status at end of induction in external data.
Methods: The model was developed based on data from the phase 3 studies of etrolizumab[2]: HIBISCUS I/II, HICKORY, and LAUREL. Placebo data were used to develop the base model, and data from the active treatment arms were added to increase the number of observed remissions and thereby better evaluate the accuracy of the model. Patients who switched treatment during the study were excluded. Patients were regarded as being in remission if the MCS was less than or equal to two, no subscore was above one, and RB was zero. For each subscore, a bounded integer (BI) model[3] was used, parameterized by a BASE and a standard deviation (SD) parameter. The placebo effect was described by a monoexponential function acting on all subscores similarly. Interindividual variability (IIV) was added per parameter, provided it improved the model fit without affecting convergence. Upon addition of the active treatment data, the placebo effect was fixed and an active treatment effect of the same form was estimated on top of the placebo effect. In studies HICKORY and LAUREL, patients classified as responders at the end of induction were allowed to enter the maintenance phase. For visual predictive checks (VPCs), this was taken into account by a dropout model, in which the simulated scores at the end of induction determined whether the simulated subject was included in the maintenance phase. Next, the developed model was evaluated for predictive performance on external data by means of a VPC. The external data consisted of the placebo arms of five UC studies from various drug companies[4,5]: NCT385740, NCT408630, NCT410410, NCT787200, NCT853100.
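As an illustration of the BI structure described above, the following minimal Python sketch computes the probabilities of scores 0 to 3 for one subscore, with cut-points placed at the standard normal quantiles of k/4 as in the BI model[3]. The function names, the parameter values, and the exact form of the monoexponential effect (`base - max_effect * (1 - exp(-rate * t))`) are assumptions made for illustration only; the abstract does not report the parameterization or the estimates.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and SD 1


def bi_probabilities(latent_mean, sd, n_categories=4):
    """Probabilities of scores 0..n_categories-1 under a bounded integer model."""
    # Interior cut-points: standard normal quantiles at k/n for k = 1..n-1.
    cuts = [_STD_NORMAL.inv_cdf(k / n_categories) for k in range(1, n_categories)]
    # Cumulative probabilities at each cut-point, padded with 0 and 1.
    cum = [0.0] + [_STD_NORMAL.cdf((c - latent_mean) / sd) for c in cuts] + [1.0]
    # Probability of each score is the difference of adjacent cumulative values.
    return [cum[k + 1] - cum[k] for k in range(n_categories)]


def latent_over_time(base, max_effect, rate, t):
    """Hypothetical monoexponential decline of the latent score (assumed form)."""
    return base - max_effect * (1.0 - math.exp(-rate * t))


if __name__ == "__main__":
    # Illustrative values only; these are not estimates from the analysis.
    for week in (0, 4, 10):
        mean_t = latent_over_time(base=0.8, max_effect=0.6, rate=0.2, t=week)
        probs = bi_probabilities(mean_t, sd=1.0)
        print(week, [round(p, 3) for p in probs])
```

With a latent mean of 0 and an SD of 1, the four scores are equally likely by construction; a higher latent mean shifts probability toward higher scores, and a larger SD spreads it toward the extremes.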
Results: The analysis data set contained 238 patients (Nobs=6225) on placebo and 1152 patients (Nobs=32758) treated with etrolizumab 105 mg every 4 weeks. Based on the placebo data, we developed a BI model with IIV on the BASE parameter per subscore and IIV on SD shared across subscores. We applied the placebo and active treatment effects similarly to all subscores and included IIV on the maximum effect. As such, subscore-specific parameters relied on baseline observations, while parameters describing post-baseline trends and variability were informed by all subscores equally. The model reliably predicted the analysis data, with no trends indicating misspecification in either the goodness-of-fit plots or the VPC. Remission at the end of induction was correctly predicted 183 out of 242 times (76%) and non-remission was predicted correctly 986 out of 1071 times (92%). The external subscore data were predicted well for RB, PGA, and the endoscopic score, but SF was underpredicted. This was due to a higher mean SF baseline in the external data: 2.48 compared to 2.22 in the analysis data. Importantly, a VPC for remission indicated adequate performance for the external data regarding this key clinical measure: the 95% confidence interval of simulated remission encompassed the observed remission for each study.
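The remission rule stated in the Methods and the agreement percentages reported above can be checked with a short Python sketch. The function names are hypothetical, and treating the denominators (242 and 1071) as the number of cases in each category is an assumption about how the counts were tabulated.

```python
def in_remission(rb, sf, pga, endo):
    """Remission: MCS <= 2, no subscore above 1, and rectal bleeding of 0."""
    subscores = (rb, sf, pga, endo)
    return sum(subscores) <= 2 and max(subscores) <= 1 and rb == 0


def agreement_rate(correct, total):
    """Fraction of cases in a category that were predicted correctly."""
    return correct / total


if __name__ == "__main__":
    # Counts taken from the Results; denominators assumed to be category totals.
    remission_rate = agreement_rate(183, 242)
    non_remission_rate = agreement_rate(986, 1071)
    print(f"{remission_rate:.0%} {non_remission_rate:.0%}")  # prints "76% 92%"

    print(in_remission(rb=0, sf=1, pga=1, endo=0))  # True: MCS 2, all subscores <= 1
    print(in_remission(rb=1, sf=0, pga=0, endo=0))  # False: RB must be 0
```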
Conclusions: The IRT model adequately described the subscores of MCS and reliably predicted the remission status for both the analysis data and the external data. These results suggest that the proposed model structure is a valid option for longitudinally describing the clinical disease status of patients with UC. Other drug effects can be inserted as deemed appropriate, allowing the model to support model-informed drug development across a range of UC programs.
References:
[1] Ueckert S. Modeling Composite Assessment Data Using Item Response Theory. CPT Pharmacometrics Syst Pharmacol. 2018;7(4):205-218.
[2] Sandborn WJ, Vermeire S, Tyrrell H, Hassanali A, Lacey S, Tole S, Tatro AR; Etrolizumab Global Steering Committee. Etrolizumab for the Treatment of Ulcerative Colitis and Crohn's Disease: An Overview of the Phase 3 Clinical Program. Adv Ther. 2020 Jul;37(7):3417-3431.
[3] Wellhagen GJ, Kjellsson MC, Karlsson MO. A Bounded Integer Model for Rating and Composite Scale Data. AAPS J. 2019 Jun 6;21(4):74.
[4] Yin PT, Desmond J, Day J. Sharing historical trial data to accelerate clinical development. Clin Pharmacol Ther. 2019;106:1177-1178.
[5] Kawakatsu S, Zhu R, Zhang W, et al. A longitudinal model for the Mayo Clinical Score and its sub-components in patients with ulcerative colitis. J Pharmacokinet Pharmacodyn. 2022;49:179-190.