MSToolkit – An R library for simulating and evaluating clinical trial designs and scenarios
Mike K Smith (1), Richard Pugh (2), Romain Francois (2)
(1) Pfizer; (2) Mango Solutions
Objectives: The ability to simulate clinical trial scenarios quickly and easily is vital to the modern statistician. Advances in clinical trial design mean that sample sizing and the evaluation of trial performance metrics and operating characteristics often require simulation.
Methods: MSToolkit is an R package that allows the user to quickly simulate clinical trial data and then apply analytical methods to the simulated data to evaluate trial performance. MSToolkit can simulate endpoint and longitudinal data, crossover trials, run-in periods, and interim analyses, including designs that allow stopping for futility or dropping doses. Model parameters and covariate values can be generated from distributions or sampled from external files. Data are written to CSV files, which are easily read by a number of different analytical programs. Each simulated dataset is analysed separately, and the analysis can be performed in R, SAS or any other program that can be called in batch mode, e.g. WinBUGS. MSToolkit is GRID aware: analysis can be split across multiple GRID nodes, allowing the analysis to be parallelised. Results are automatically collated and summarised.
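The generate, store, analyse and collate workflow described above can be sketched as follows. This is a minimal illustration in Python using only the standard library; it does not use MSToolkit or reproduce its R interface. The dose levels, sample size, linear dose-response model, file naming scheme and success criterion are all assumptions chosen for the example.

```python
import csv
import random
import statistics
from pathlib import Path

DOSES = [0, 10, 50]       # hypothetical dose levels (0 = placebo)
SUBJECTS_PER_ARM = 20     # hypothetical number of subjects per dose
N_REPLICATES = 50         # number of simulated trials


def generate_replicate(rng, slope=0.05, sd=1.0):
    """Simulate one trial: response = slope * dose + normal noise (assumed model)."""
    rows = []
    subject = 1
    for dose in DOSES:
        for _ in range(SUBJECTS_PER_ARM):
            rows.append({"SUBJ": subject, "DOSE": dose,
                         "RESP": slope * dose + rng.gauss(0.0, sd)})
            subject += 1
    return rows


def write_replicate(rows, path):
    """Store one simulated trial as a CSV file."""
    with open(path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=["SUBJ", "DOSE", "RESP"])
        writer.writeheader()
        writer.writerows(rows)


def analyse_replicate(path):
    """Read one CSV file and return the mean response for each dose."""
    by_dose = {}
    with open(path, newline="") as fh:
        for row in csv.DictReader(fh):
            by_dose.setdefault(float(row["DOSE"]), []).append(float(row["RESP"]))
    return {dose: statistics.mean(vals) for dose, vals in by_dose.items()}


def run_simulation(out_dir, seed=1, min_effect=1.0):
    """Generate all replicates, analyse each one, and collate the results."""
    rng = random.Random(seed)
    out_dir = Path(out_dir)
    # Step 1: generate and store every replicate.
    for i in range(1, N_REPLICATES + 1):
        write_replicate(generate_replicate(rng), out_dir / f"replicate{i:04d}.csv")
    # Step 2: analyse each stored replicate on its own.
    results = [analyse_replicate(p) for p in sorted(out_dir.glob("replicate*.csv"))]
    # Step 3: collate. Assumed success criterion: the top dose beats
    # placebo by more than min_effect in mean response.
    top, placebo = float(max(DOSES)), float(min(DOSES))
    successes = sum(r[top] - r[placebo] > min_effect for r in results)
    return successes / len(results)
```

Calling `run_simulation` on an empty directory returns the proportion of simulated trials that meet the assumed success criterion. Because step 2 reads each file independently, the per-replicate analyses could be distributed across separate workers, which is the property that makes a grid-based split of the analysis possible.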
Results: MSToolkit provides flexibility in specifying trial design components, coupled with the ability to analyse data with a variety of analytical engines and to exploit parallel processing in a GRID environment. It also provides a common framework for simulations, allowing statisticians to share code and quickly understand the mechanisms and analytical techniques used by others in their simulations. Separating the data generation and analysis steps means that different analytical techniques can be applied to the same underlying data in order to compare results and the operating characteristics of decision criteria. Users can quickly compare designs with and without interim analysis decision rules. Users specify their own functions for generating response variables, for analysing the data, and for the interim criteria used to drop doses or stop the study. The only limitation is whether these can be expressed in R code.
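The benefit of separating data generation from analysis can be illustrated with a small sketch. It is written in Python only to keep the example self-contained; in MSToolkit the corresponding user functions would be written in R, and nothing here reflects the package's actual interface. The two-arm design, the two analysis functions, the futility rule and the success cutoff are all hypothetical.

```python
import random
import statistics


def simulate_trial(rng, n_per_arm=40, effect=0.5, sd=1.0):
    """Return (placebo, active) response lists for one two-arm trial."""
    placebo = [rng.gauss(0.0, sd) for _ in range(n_per_arm)]
    active = [rng.gauss(effect, sd) for _ in range(n_per_arm)]
    return placebo, active


def mean_difference(placebo, active):
    """Analysis 1: difference in mean response."""
    return statistics.mean(active) - statistics.mean(placebo)


def median_difference(placebo, active):
    """Analysis 2: difference in median response."""
    return statistics.median(active) - statistics.median(placebo)


def futility_stop(placebo, active, n_interim=20, threshold=0.0):
    """Hypothetical interim rule: stop if the mean difference among the
    first n_interim subjects per arm does not exceed the threshold."""
    return mean_difference(placebo[:n_interim], active[:n_interim]) <= threshold


def operating_characteristics(n_trials=500, seed=42, success_cutoff=0.3):
    """Apply both analyses, with and without the interim rule, to one set of trials."""
    rng = random.Random(seed)
    # The data are generated once and reused by every analysis below.
    trials = [simulate_trial(rng) for _ in range(n_trials)]
    stopped = [futility_stop(p, a) for p, a in trials]
    summary = {}
    for name, analysis in [("mean", mean_difference), ("median", median_difference)]:
        success = [analysis(p, a) > success_cutoff for p, a in trials]
        summary[name] = {
            # Design without an interim analysis: every trial runs to completion.
            "no_interim": sum(success) / n_trials,
            # Design with the futility rule: a stopped trial cannot succeed.
            "with_interim": sum(s and not st for s, st in zip(success, stopped)) / n_trials,
        }
    return summary, sum(stopped) / n_trials
```

`operating_characteristics()` returns the success rate of each analysis under both designs, together with the proportion of trials stopped early. Since every comparison uses the same simulated trials, differences between the rates reflect the analysis method or decision rule rather than sampling variation between datasets.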
Conclusion: MSToolkit allows the statistician to concentrate on the design itself, the processes that will generate data, the analytical techniques to be used, and the decision criteria for attributing success or failure to a particular trial, rather than spending time coding.