2023 - A Coruña - Spain

PAGE 2023: Methodology - New Modelling Approaches
Sebastian Micluța-Câmpeanu

Automatic model discovery in Quantitative Systems Biology

Sebastian Micluța-Câmpeanu (1), Elisabeth Roesch (1), Paul Lang (1), Chris Rackauckas (1,2,3)

JuliaHub (1), Pumas-AI (2), Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology (3)

Introduction/Objectives: Quantitative Systems Pharmacology (QSP) models aim to offer a detailed mechanistic understanding of the biological processes underlying health, disease and treatment. However, due to the complexity of human biology, it is very challenging to design mechanistic models that capture all relevant biological processes. This often leads to unsatisfactory agreement between data and simulations. To overcome this limitation, modelers can enhance mechanistic models with small neural networks that learn the missing biology, leading to improvements in goodness of fit. However, the missing biology learned by the neural network part of the system is not directly interpretable [1, 2].

With the automated model discovery feature in our cloud compute offering, PumasQSP, we aim not only to combine neural networks with differential equations to improve goodness of fit, but also to symbolically interpret the function learned by the neural network part of the system. This symbolic interpretation is in turn intended to add structure to the system and thus increase predictive power.

Methods:

  • Starting with a simple Lotka-Volterra system, we generate a noisy synthetic dataset.
  • Using Sparse Identification of Nonlinear Dynamics (SINDy) [3] we extract symbolic equations from the data.
  • Additionally, we remove the interaction terms from the Lotka-Volterra equations to simulate partial knowledge of the system.
  • To recover missing dynamics, we add a neural network term to the system in place of the removed interaction terms, yielding a system of so-called universal differential equations (UDEs) [1].
  • We train the UDE system to match the synthetic data. Training is performed in two stages, with the ADAM optimizer finding a good starting point for the LBFGS optimizer.
  • Using SINDy we extract symbolic equations from the function the neural network part of the UDE has learned.
  • We compare the predictive power of the UDE model with the model containing the recovered symbolic terms on a hold-out dataset.
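
For reference, the setup described in the steps above can be written out as follows. This is a sketch in assumed notation: the symbols x (prey), y (predator), the rate constants α, β, γ, δ, and the neural network U_θ are not named in the abstract, and no parameter values are implied.

```latex
% Full Lotka-Volterra system (used to generate the synthetic data)
\[
\begin{aligned}
\dot{x} &= \alpha x - \beta x y, \\
\dot{y} &= -\gamma y + \delta x y .
\end{aligned}
\]

% Partial-knowledge model: the interaction terms are removed and
% replaced by the outputs of a neural network U_theta, giving a UDE
\[
\begin{aligned}
\dot{x} &= \alpha x + U_{\theta,1}(x, y), \\
\dot{y} &= -\gamma y + U_{\theta,2}(x, y).
\end{aligned}
\]

% Symbolic recovery succeeds if sparse regression on U_theta yields
\[
U_{\theta,1}(x, y) \approx -\beta x y, \qquad
U_{\theta,2}(x, y) \approx \delta x y .
\]
```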

Results:

  • Applied directly to the noisy synthetic data, SINDy was not able to recover the correct set of Lotka-Volterra equations.
  • Simulations of the trained UDE system accurately matched the data, indicating that the neural network part has learned a function that is sufficiently similar to the missing terms.
  • We demonstrate that the symbolic terms recovered by SINDy contain the missing terms from the function learned by the neural network part of the UDE.
  • Finally, after recombining the symbolic SINDy terms with the ODE part of the UDE system and re-optimizing the kinetic parameters, we correctly recover the original interaction terms.
  • Additionally, by optimizing the parameters in the recovered dynamics, the parameters corresponding to unneeded terms go to zero, while the ones corresponding to the true dynamics converge to the true values.
  • We demonstrate that if we apply SINDy to the prediction from the neural network embedded in the UDE, we can recover the correct interaction terms.
  • We observed improved predictive power on the model with the recovered interaction terms compared to the UDE model.
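
The recovery step reported above can be illustrated with a minimal, dependency-free sketch of sequentially thresholded least squares, the sparse regression used in the original SINDy paper [3]. It is not the PumasQSP implementation. The rate constants, the candidate library, the sampling range, the noise level, and the threshold are all assumptions chosen for illustration, and a noisy copy of the true interaction terms stands in for the output of a trained neural network.

```python
import random

# Hypothetical interaction-rate constants; the abstract gives no numeric values.
BETA, DELTA = 0.9, 0.8
TERMS = ["1", "x", "y", "x^2", "x*y", "y^2"]  # candidate library (an assumption)


def library(x, y):
    """Evaluate every candidate term at one state (x, y)."""
    return [1.0, x, y, x * x, x * y, y * y]


def solve_linear(A, b):
    """Solve the square system A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z


def least_squares(rows, target, active):
    """Fit target with the active library columns only, via the normal equations."""
    AtA = [[sum(r[i] * r[j] for r in rows) for j in active] for i in active]
    Atb = [sum(r[i] * t for r, t in zip(rows, target)) for i in active]
    return solve_linear(AtA, Atb)


def stlsq(rows, target, threshold=0.1, max_iter=10):
    """Fit, drop coefficients below the threshold, refit; stop when the active set is stable."""
    n_terms = len(rows[0])
    active = list(range(n_terms))
    coeffs = [0.0] * n_terms
    for _ in range(max_iter):
        coeffs = [0.0] * n_terms
        if not active:
            break
        for i, c in zip(active, least_squares(rows, target, active)):
            coeffs[i] = c
        kept = [i for i in active if abs(coeffs[i]) >= threshold]
        if kept == active:
            break
        active = kept
    return coeffs


rng = random.Random(0)
# States at which the learned function is evaluated (sampled uniformly here for simplicity).
states = [(rng.uniform(0.5, 3.0), rng.uniform(0.5, 3.0)) for _ in range(200)]
rows = [library(x, y) for x, y in states]

# Stand-in for the neural network output: true interaction terms plus small noise.
prey_target = [-BETA * x * y + rng.gauss(0.0, 0.01) for x, y in states]
pred_target = [DELTA * x * y + rng.gauss(0.0, 0.01) for x, y in states]

prey_coeffs = stlsq(rows, prey_target)
pred_coeffs = stlsq(rows, pred_target)

for name, coeffs in (("prey", prey_coeffs), ("predator", pred_coeffs)):
    print(name, {t: round(c, 3) for t, c in zip(TERMS, coeffs) if c != 0.0})
```

In an actual workflow the targets would come from evaluating the trained network at states along the fitted trajectory. Such states are more strongly correlated than uniform samples, so the threshold and the library would likely need tuning.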

Conclusion:

  • In summary, we found that scientific machine learning techniques, such as UDEs and SINDy, can be combined to automatically discover missing biology from data.
  • The above methodologies are implemented in PumasQSP to provide users with a simple interface for automated model discovery for quantitative systems pharmacology models.



References:
[1] C. Rackauckas et al., “Universal Differential Equations for Scientific Machine Learning,” arXiv:2001.04385 [cs, math, q-bio, stat], 2020

[2] R. Dandekar, C. Rackauckas, and G. Barbastathis, “A Machine Learning-Aided Global Diagnostic and Comparative Tool to Assess Effect of Quarantine Control in COVID-19 Spread,” Patterns, vol. 1, no. 9, p. 100145, 2020

[3] S. L. Brunton, J. L. Proctor, and J. N. Kutz, “Discovering governing equations from data by sparse identification of nonlinear dynamical systems,” Proceedings of the National Academy of Sciences, vol. 113, no. 15, pp. 3932–3937, 2016


Reference: PAGE 31 (2023) Abstr 10706 [www.page-meeting.org/?abstract=10706]
Poster: Methodology - New Modelling Approaches