Integrated modeling of digital-motor outcomes and clinical outcome assessments using IRT: a framework for developing better outcomes for clinical trials in rare neurological diseases
Alzahra Hamdan (1), Andreas Traschütz (2,3), Lukas Beichert (2,3), Xiaomei Chen (1), Rebecca Schüle (2,4), Andrew C. Hooker (1), PROSPAX consortium, Matthis Synofzik (2,3), Mats O. Karlsson (1)
(1) Pharmacometrics Research Group, Department of Pharmacy, Uppsala University, Sweden (2) Department of Neurodegenerative Diseases, Center for Neurology and Hertie Institute for Clinical Brain Research, University of Tübingen, Tübingen, Germany (3) German Center for Neurodegenerative Diseases (DZNE) Tübingen, 72076 Tübingen, Germany (4) Division of Neurodegenerative Diseases, Department of Neurology, Heidelberg University Hospital, Germany
Background: Effective trial planning for rare neurological diseases is often hindered by the lack of robust and sensitive endpoints. This is the case for spastic ataxias, a large group of rare neurological diseases with combined cerebellar and pyramidal features [1]. Ataxia severity is assessed using several types of outcomes, including clinician-reported outcomes, patient-reported outcomes, fluid biomarkers, and digital-motor outcomes [2]. The scale for the assessment and rating of ataxia (SARA) is the most widely used clinical outcome assessment in ataxias [3,4]. Digital-motor outcomes offer objective measures with high sensitivity even in early and pre-ataxic disease stages [5–7]. These outcomes provide a quantitative assessment of walking patterns using wearable APDM sensors [8] and of voluntary movements using Q-Motor stationary sensors [9]. In previous work, we provided evidence of the adequacy of SARA and its items using Item Response Theory (IRT) [10]. Here, we extend the SARA IRT framework to model digital-motor outcomes and evaluate their contribution to assessing ataxia severity.
Methods: Data were obtained from PROSPAX, the prospective multicenter natural history study for spastic ataxias [11] (NCT04297891). The dataset used in this analysis contains SARA sub-scores and digital-motor measures from 186 patients with 331 visits. The digital-motor outcomes comprise 15 APDM measures and 20 Q-Motor measures.
To our knowledge for the first time, a joint IRT model for continuous and categorical data was built, allowing the underlying disease severity (iLV) to be analyzed based on two types of outcomes: SARA and digital-motor outcomes. The joint model comprises (i) a SARA IRT model with logistic functions relating the probabilities of the item responses to iLV, as developed in [10], and (ii) continuous IRT functions relating the digital-motor outcomes to the same iLV scale. Both linear and non-linear models were tested for each digital outcome measure. Model selection was based on the likelihood ratio test (significance level 0.05), parameter plausibility, and model stability. Models were implemented in NONMEM version 7.5.0 [12].
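The structure of such a joint model can be illustrated with a minimal Python sketch. It is not the NONMEM implementation used in this work. It assumes a graded-response form for the ordinal SARA items and a linear link with a normally distributed residual for one continuous digital measure. All function and parameter names (`ordinal_item_prob`, `continuous_logpdf`, `joint_loglik`, `discrimination`, `thresholds`, `intercept`, `slope`, `sd`) are hypothetical.

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def ordinal_item_prob(score, psi, discrimination, thresholds):
    """P(item = score | psi) under an assumed graded-response model.

    thresholds must be increasing; an item with K thresholds has scores 0..K.
    """
    # Cumulative probabilities P(item >= k), padded with P(>= 0) = 1
    # and P(>= K + 1) = 0.
    cum = [1.0]
    cum += [logistic(discrimination * (psi - b)) for b in thresholds]
    cum += [0.0]
    return cum[score] - cum[score + 1]


def continuous_logpdf(value, psi, intercept, slope, sd):
    """Log density of a continuous measure, assuming a linear link to psi
    and an additive normal residual (an illustrative assumption)."""
    mean = intercept + slope * psi
    z = (value - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def joint_loglik(psi, item_scores, item_params, measures, measure_params):
    """Joint log-likelihood of categorical items and continuous measures
    that share the same latent variable psi."""
    ll = 0.0
    for score, (a, bs) in zip(item_scores, item_params):
        ll += math.log(ordinal_item_prob(score, psi, a, bs))
    for y, (b0, b1, sd) in zip(measures, measure_params):
        ll += continuous_logpdf(y, psi, b0, b1, sd)
    return ll
```

Because both outcome types are conditioned on the same latent variable, each additional continuous measure adds a term to the likelihood and can therefore narrow the uncertainty of the estimated latent variable.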
The digital-motor outcomes were assessed in terms of their ability to add information to SARA. This was measured as the average improvement in the uncertainty of the estimated iLV, the slope of the digital outcome vs. iLV, and the variability in the digital outcome data explained by iLV. The assessment was conducted in multiple steps:
- Evaluation of the digital outcomes individually: Each Q-Motor or APDM measure was added separately to the model with SARA. The best model describing the digital outcome data was first selected, then the digital measures were ranked based on their performance.
- Joint assessment of digital measures: The 6 top-ranked measures of each type were modeled jointly along with SARA. Residual correlations between the digital measures were estimated using full-block L2 correlations.
- Reduction of the joint models: Based on the correlations estimated in the previous step, measures with very high correlations (likely reflecting similar underlying disease aspects) were removed.
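The reduction step above can be sketched in Python as a greedy filter over the ranked measures. This is only an illustration under stated assumptions. The abstract does not state the correlation cut-off or the exact removal rule, so the `threshold` value, the function name `reduce_by_correlation`, the measure names, and the correlation values are all hypothetical.

```python
def reduce_by_correlation(ranked, corr, threshold=0.9):
    """Keep measures in rank order, dropping any whose absolute residual
    correlation with an already kept measure reaches the threshold.

    ranked: measure names, best first.
    corr: dict mapping frozenset({a, b}) to the estimated residual
          correlation; missing pairs are treated as uncorrelated.
    threshold: hypothetical cut-off for a "very high" correlation.
    """
    kept = []
    for m in ranked:
        if all(abs(corr.get(frozenset((m, k)), 0.0)) < threshold for k in kept):
            kept.append(m)
    return kept


# Hypothetical example: three of six measures are highly inter-correlated.
ranked = ["m1", "m2", "m3", "m4", "m5", "m6"]
corr = {
    frozenset(("m1", "m2")): 0.95,
    frozenset(("m1", "m4")): 0.93,
    frozenset(("m2", "m4")): 0.96,
    frozenset(("m3", "m5")): 0.40,
}
reduced = reduce_by_correlation(ranked, corr)
```

In this made-up example, the best-ranked measure of the correlated trio is retained and the other two are dropped, leaving four measures.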
Results: The IRT framework of SARA was validated in this independent ataxia population. It was then successfully extended to model continuous data, thus allowing for informed selection of the most informative digital-motor outcomes. When evaluated individually, the 6 top-ranked Q-Motor measures improved the iLV uncertainty by 17-33% (on average) compared to the IRT model for SARA only. Similarly, the top 6 APDM measures improved the uncertainty by 8-38%.
High correlations were estimated between 3 of the top 6 Q-Motor measures; hence, 2 of them were removed. The reduced joint model showed an improvement in iLV uncertainty in 97% of the cases, with an average magnitude of 40%. The joint APDM model was reduced to 3 measures, with an average improvement in iLV uncertainty of 35% and an improvement in 92% of the cases. As a reference, individual SARA items improve the iLV uncertainty by 6-23% on average compared to SARA without the respective item. This finding corroborates the relatively large contribution of the digital-motor outcomes compared to SARA.
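The two summary figures reported above (average improvement and share of cases improved) could be computed as in the following Python sketch. It assumes that uncertainty is expressed as a standard error of the estimated iLV per case and that the average is taken over all cases. Neither assumption is confirmed by the abstract, and the function name and input values are hypothetical.

```python
def uncertainty_improvement(se_reference, se_joint):
    """Return (mean relative improvement in %, share of improved cases in %).

    se_reference: standard errors of the latent variable from the
                  SARA-only model, one per case.
    se_joint: standard errors for the same cases from the joint model.
    """
    if len(se_reference) != len(se_joint) or not se_reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Positive values mean the joint model reduced the uncertainty.
    rel = [100.0 * (r - j) / r for r, j in zip(se_reference, se_joint)]
    mean_improvement = sum(rel) / len(rel)
    share_improved = 100.0 * sum(1 for x in rel if x > 0) / len(rel)
    return mean_improvement, share_improved


# Hypothetical values for four cases.
mean_imp, share = uncertainty_improvement(
    [1.0, 2.0, 0.5, 1.0], [0.5, 1.0, 0.25, 1.25]
)
```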
Conclusions: The joint IRT analysis of a clinician-reported outcome (SARA) and digital-motor outcomes provides a framework for a more robust assessment of disease severity in spastic ataxias. Such an integrated framework is particularly promising for rare diseases where trial designs with small sample sizes and short duration are inherently needed.
Acknowledgment: This work was supported by the European Joint Programme on Rare Diseases (EJP RD) Joint Transnational Call 2019 for the EJP RD WP20 Innovation Statistics consortium “EVIDENCE-RND”. Moreover, work in this project was supported by the Clinician Scientist programme "PRECISE.net" funded by the Else Kröner-Fresenius-Stiftung (to A.T., L.B., M.S. and R.S.).
References:
[1] Synofzik M, Schüle R. Overcoming the divide between ataxias and spastic paraplegias: Shared phenotypes, genes, and pathways. Mov Disord. 2017;32(3):332–45.
[2] Ilg W, Milne S, Schmitz-Hübsch T, et al. Quantitative Gait and Balance Outcomes for Ataxia Trials: Consensus Recommendations by the Ataxia Global Initiative Working Group on Digital-Motor Biomarkers. Cerebellum. 2023; Available from: https://doi.org/10.1007/s12311-023-01625-2
[3] Schmitz-Hübsch T, du Montcel ST, Baliko L, et al. Scale for the assessment and rating of ataxia: development of a new clinical scale. Neurology. 2006;66(11):1717–20.
[4] Klockgether T, Synofzik M, Alhusaini S, et al. Consensus Recommendations for Clinical Outcome Assessments and Registry Development in Ataxias: Ataxia Global Initiative (AGI) Working Group Expert Guidance. Cerebellum. 2023; Available from: https://doi.org/10.1007/s12311-023-01547-z
[5] Ilg W, Fleszar Z, Schatton C, et al. Individual changes in preclinical spinocerebellar ataxia identified via increased motor complexity. Mov Disord. 2016;31(12):1891–900.
[6] Ilg W, Müller B, Faber J, et al. Digital Gait Biomarkers Allow to Capture 1-Year Longitudinal Change in Spinocerebellar Ataxia Type 3. Mov Disord. 2022;37(11):2295–301.
[7] Rochester L, Galna B, Lord S, Mhiripiri D, Eglon G, Chinnery PF. Gait impairment precedes clinical symptoms in spinocerebellar ataxia type 6. Mov Disord. 2014;29(2):252–5.
[8] Comprehensive Gait and Balance Analysis - APDM Wearable Technologies [Internet]. APDM. 2020 [cited 2024 Mar 13]. Available from: https://apdm.com/mobility/
[9] Reilmann R, Schubert R. Chapter 18 - Motor outcome measures in Huntington disease clinical trials. In: Feigin AS, Anderson KE, editors. Handbook of Clinical Neurology. Elsevier; 2017 [cited 2024 Mar 14]. p. 209–25. (Huntington Disease; vol. 144).
[10] Hamdan A, Hooker AC, Chen X, et al. Item Response Theory Analysis of the Scale for the Assessment and Rating of Ataxia in Autosomal Recessive Cerebellar Ataxias. In PAGE. Abstr 10626. Available from: www.page-meeting.org/?abstract=10626
[11] PROSPAX [Internet]. [cited 2024 Mar 13]. Home. Available from: https://www.prospax.net/
[12] Beal SL, Sheiner LB, Boeckmann A, Bauer RJ. NONMEM user’s guides (1989–2009). Ellicott City: Icon Development Solutions; 2009.