Data-informed model reduction for inference and prediction from non-identifiable models
Matthew Simpson (Queensland University of Technology)
Many mathematical models in the field of theoretical biology involve challenges relating to parameter identifiability. Non-identifiability implies that different combinations of parameter values lead to indistinguishable solutions of the mathematical model. This means that it is difficult, and sometimes impossible, to explain the mechanistic origin of observations using a non-identifiable mathematical model. A standard approach to deal with structurally non-identifiable models is to use reparameterisation, which typically focuses on the structure of the mathematical model without accounting for the impact of noisy, finite data.
In this workshop we will explore a simple computational approach for model reduction, via likelihood reparameterisation, that can be applied to both structurally non-identifiable and practically non-identifiable problems. We will construct simplified, identifiable mathematical models that enable model-based predictions for a range of continuum models based on different classes of commonly used differential equations. Through a series of computational experiments, we will illustrate how to deal with a range of noise models that relate the solution of the mathematical model to noisy observations. A key focus is to illustrate how computationally efficient model-based predictions can be made from reduced models.
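As a minimal illustration of structural non-identifiability and reparameterisation (a sketch, not the workshop's own material), consider a hypothetical decay model dC/dt = -(k1 + k2) C with C(0) = C0. Only the sum k1 + k2 enters the solution, so k1 and k2 cannot be told apart from data, whereas the combined rate k = k1 + k2 can. The model, function names, and parameter values below are assumptions chosen purely for illustration.

```python
import math

def solution(t, k1, k2, c0=1.0):
    """Solution of dC/dt = -(k1 + k2) * C with C(0) = c0 (hypothetical toy model)."""
    return c0 * math.exp(-(k1 + k2) * t)

def reduced_solution(t, k, c0=1.0):
    """Reduced model written in terms of the identifiable combination k = k1 + k2."""
    return c0 * math.exp(-k * t)

times = [0.0, 0.5, 1.0, 2.0, 4.0]

# Two different parameter pairs that share the same sum k1 + k2 = 1.0.
curve_a = [solution(t, 0.3, 0.7) for t in times]
curve_b = [solution(t, 0.9, 0.1) for t in times]
curve_reduced = [reduced_solution(t, 1.0) for t in times]

# Largest pointwise gap between the two curves (zero up to rounding error),
# so no data set sampled from these curves can separate k1 from k2.
max_gap = max(abs(a - b) for a, b in zip(curve_a, curve_b))
```

Here the reduced, one-parameter model reproduces both curves, which is the sense in which a reparameterised model can remain useful for prediction even when the original parameters cannot be recovered.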
Surrogate Modeling for Scalable Uncertainty Quantification in ABMs of Cancer Growth
Daniel Bergman (University of Maryland), Harsh Jain (University of Minnesota Duluth)
Agent-based models (ABMs) have become an indispensable tool for studying cancer growth and treatment response, as they naturally capture the complexity, heterogeneity, and spatial interactions characteristic of tumors. However, the computational costs associated with realistic ABM simulations often make uncertainty quantification (UQ)—including global sensitivity analysis (GSA) and inverse UQ—prohibitively expensive.
To overcome these computational challenges, our working group will explore the use of surrogate modeling techniques. Surrogate models serve as computationally efficient proxies for the original ABM, enabling rapid evaluation and thus facilitating extensive uncertainty analyses. We propose specifically to evaluate and compare direct UQ approaches, machine learning-based surrogate methods, and our recently developed method, Surrogate Modeling for Recapitulating Global Sensitivity (SMoRe GloS). SMoRe GloS explicitly preserves biological interpretability while substantially reducing computational burden.
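The general surrogate workflow described above can be sketched as follows. This is a generic illustration, not SMoRe GloS and not PhysiCell: `expensive_model` is a hypothetical toy function standing in for one costly ABM run, the surrogate is a plain linear least-squares fit, and the sensitivity indices are those of the surrogate, not of the original model. All names and numbers are assumptions.

```python
import random

def expensive_model(x1, x2):
    """Stand-in for one costly ABM run (hypothetical toy response, not PhysiCell)."""
    return 2.0 + 3.0 * x1 + 1.0 * x2 + 0.5 * x1 * x2

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_linear_surrogate(inputs, outputs):
    """Least-squares fit of y ~ b0 + b1*x1 + b2*x2 via the normal equations."""
    design = [[1.0, x1, x2] for x1, x2 in inputs]
    p = 3
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(design, outputs)) for i in range(p)]
    return solve_linear_system(xtx, xty)

rng = random.Random(0)
# A small training budget: 50 "expensive" runs with inputs uniform on [0, 1].
train_inputs = [(rng.random(), rng.random()) for _ in range(50)]
train_outputs = [expensive_model(x1, x2) for x1, x2 in train_inputs]
b0, b1, b2 = fit_linear_surrogate(train_inputs, train_outputs)

# For a linear surrogate with independent inputs of equal variance,
# the first-order variance-based sensitivity indices are b_i^2 / sum_j b_j^2.
total = b1 ** 2 + b2 ** 2
sensitivity = {"x1": b1 ** 2 / total, "x2": b2 ** 2 / total}
```

Once fitted, the surrogate can be evaluated many times at negligible cost. The price is that any behaviour the surrogate cannot represent (here, the x1*x2 interaction) is absent from the resulting indices, which is one reason comparing surrogate choices matters.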
Our initial focus will be on global sensitivity analysis (GSA). However, methods developed through this working group will be broadly applicable to other areas of UQ, such as parameter estimation and inverse modeling. We will use the open-source PhysiCell platform, a flexible, C++-based modeling framework widely adopted in cancer modeling, as our primary testbed. Developed with outreach and education as core missions alongside power, extensibility, and cross-platform compatibility, PhysiCell is readily accessible even to participants new to ABMs. Furthermore, a graphical user interface (GUI) enables complex modeling without C++ coding. A PhysiCell core developer will also be on the team to help members get set up and answer questions. Additionally, we encourage participants to bring their own ABMs or biological applications to the group, enriching the discussion and providing diverse case studies for method evaluation.
Connections between identifiability and uncertainty quantification (in dynamic systems biology modeling)
Alejandro Villaverde (Universidade de Vigo)
The proposed research topic concerns the relationship between the analysis of (structural) identifiability and observability (SIO), on the one hand, and uncertainty quantification (UQ), on the other. Both concepts are related to uncertainty, but in different ways. The SIO properties are fully determined by the model equations and do not depend on the data [Wieland et al., 2021; Villaverde, 2019]. UQ, on the other hand, is strongly dependent on the available data. Some authors (e.g., [Norden et al.]) have argued that SIO is related to epistemic uncertainty and UQ to aleatoric uncertainty. SIO may be analyzed with symbolic approaches such as differential algebra [Dong et al., 2023] and differential geometry [Díaz-Seoane et al., 2023]. Another possibility is to use simulation-based numerical approaches, either via sensitivities [van Willigenburg et al., 2022; Joubert et al., 2020] or via optimization, using the profile likelihood (PL) approach [Kreutz et al., 2013].
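To make the sensitivity-based idea concrete, here is a minimal sketch on a hypothetical toy model, dx/dt = -k x with x(0) = x0 and observed output y = s x. It is a simplified pairwise check in the spirit of sensitivity analysis, not the algorithm of any of the works cited above. The model, the nominal parameter values, and all function names are assumptions.

```python
import math

def output(t, k, s, x0):
    """Observed output y = s * x(t) for dx/dt = -k * x, x(0) = x0 (hypothetical toy model)."""
    return s * x0 * math.exp(-k * t)

def sensitivity_column(times, params, name, rel_step=1e-6):
    """Central finite-difference sensitivity dy/d(param) at each time point."""
    h = rel_step * abs(params[name])
    up = dict(params, **{name: params[name] + h})
    down = dict(params, **{name: params[name] - h})
    return [(output(t, **up) - output(t, **down)) / (2.0 * h) for t in times]

def cosine(u, v):
    """Cosine of the angle between two sensitivity vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

times = [0.5, 1.0, 2.0, 3.0, 5.0]
nominal = {"k": 0.8, "s": 2.0, "x0": 5.0}

sens = {name: sensitivity_column(times, nominal, name) for name in nominal}

# |cosine| = 1 means two columns are collinear: the corresponding parameters
# cannot be separated locally from this output.
collinear_s_x0 = abs(cosine(sens["s"], sens["x0"]))
collinear_k_s = abs(cosine(sens["k"], sens["s"]))
```

The columns for s and x0 are collinear because only the product s*x0 enters the output, whereas the column for k is not. A pairwise check like this misses dependencies involving three or more parameters, for which the rank of the full sensitivity matrix would need to be examined.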
In this project we propose to explore the links between SIO and UQ in several directions.
As a modeling framework, we will consider dynamic models of deterministic systems in continuous time (and their corresponding hybrid extensions, in the case of item III). Most work in this area has been done with ordinary differential equations (ODEs), although many concepts and ideas can also be applied to systems of partial differential equations (PDEs). The range of possible biological applications is very wide, including essentially every biosystem or bioprocess that can be modeled in this way (see, e.g., the recent/current special issue of the Philos. Trans. R. Soc. A).
Quantitative Systems Pharmacology virtual populations simulations for oncology drug development: evaluating the problem space through the lens of uncertainty quantification
Blerta Shtylla (Pfizer)
Simulation and analysis of virtual populations are increasingly being used to guide decisions in drug development within the context of model-informed drug development (MIDD). While there is a multitude of algorithmic approaches for selecting virtual populations, most rely on smart sampling of model parameter values to allow a mechanistic quantitative systems pharmacology (QSP) model to match the diverse responses observed in real clinical populations [1,2]. We will review key existing methodologies and computational approaches that address specific challenges in solid-tumor oncology when matching tumor size and progression-free survival data. We will then survey in more detail a recently published case study from our group focusing on anaplastic lymphoma kinase inhibitors (ALKi), which have shown great promise in circumventing on-target resistance mutations in ALK+ non-small cell lung cancer (NSCLC) [3]. A virtual population strategy was outlined that focused on capturing the range of resistance to ALKi therapies that emerges. Comprehensive sampling of potential resistance mechanisms was essential for understanding clinical response to current therapies.
Using the case study published in [3], I will outline some open challenges for improving the efficiency and quality of virtual populations. For efficiency, we are interested in methods that extend the approaches outlined in [2], with an eye on whether computational efficiency can be increased through integrated use of parameter identifiability sampling algorithms. The goal is to improve how high-dimensional parameter spaces can be sampled, in terms of both computational efficiency and virtual patient heterogeneity. Questions regarding quality relate to, for example, how many virtual patients are needed to fully capture the potential mechanistic variability implied by the model and the data at hand. Answering these questions requires metrics that account for the interaction between prior parameter bounds, non-identifiability, and population variability within the sampling procedure.
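The basic idea of selecting a virtual population by sampling parameters and keeping those consistent with observed responses can be sketched with a simple accept/reject loop. This is a generic illustration, not the method of [2] or [3]: the one-equation tumor model, the prior bounds, and the observed range are all hypothetical and do not come from clinical data.

```python
import math
import random

def tumor_size_ratio(growth, kill, weeks=8.0):
    """Tumor size relative to baseline for dS/dt = (growth - kill) * S (toy model)."""
    return math.exp((growth - kill) * weeks)

def sample_virtual_population(n_candidates, bounds, observed_range, seed=0):
    """Draw candidates uniformly within prior bounds and keep plausible ones."""
    rng = random.Random(seed)
    low, high = observed_range
    accepted = []
    for _ in range(n_candidates):
        growth = rng.uniform(*bounds["growth"])
        kill = rng.uniform(*bounds["kill"])
        ratio = tumor_size_ratio(growth, kill)
        if low <= ratio <= high:
            accepted.append({"growth": growth, "kill": kill, "ratio": ratio})
    return accepted

# Hypothetical prior bounds (per week) and a hypothetical observed range of
# week-8 tumor size ratios; none of these numbers come from clinical data.
prior_bounds = {"growth": (0.0, 0.2), "kill": (0.0, 0.4)}
virtual_patients = sample_virtual_population(2000, prior_bounds, (0.3, 1.2))
acceptance_rate = len(virtual_patients) / 2000
```

In this toy model only the difference growth - kill affects the response, so many distinct parameter pairs yield the same virtual patient outcome. That is a small-scale version of the interplay between prior bounds, non-identifiability, and population variability raised above, and the acceptance rate gives a crude measure of sampling efficiency.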
Modeling immune mechanisms of inflammatory exacerbations
Marissa Renardy (GSK)
Chronic obstructive pulmonary disease (COPD) is a chronic condition caused by damage to the lungs, which leads to inflammation and other problems that obstruct airflow. COPD affects more than 14 million adults in the US. COPD includes both emphysema and chronic bronchitis, and people with COPD often have a mixture of both. The repeated and progressive activation of immune cells plays a pivotal role in the pathogenesis of COPD [1]. A COPD exacerbation is an acute worsening of respiratory symptoms that may last for days or weeks, often triggered by a viral or bacterial infection or exposure to pollutants, which can leave behind permanent lung damage and accelerate disease progression [2]. Thus, decreasing exacerbations is a key goal of therapeutics. Due to the heterogeneous and random nature of exacerbations, predicting exacerbation frequency and severity is difficult. Quantitative approaches include machine learning [3], regression models [4], biomechanical models [5,6], and mechanistic immunological models [7].
The goal of this project is to develop a methodology for simulating inflammatory exacerbations in COPD, to be applied to a mechanistic model of immune interactions in order to predict or understand (1) what features are predictive of exacerbation frequency/severity, and (2) what immunological mechanisms are most effective to target (and under what conditions) to reduce exacerbation frequency/severity. Uncertainty quantification approaches will be critical in capturing the impact of patient heterogeneity and randomness of exacerbations.
Uncertainty Quantification in Physiological Models
Mette Olufsen (North Carolina State University), Mitchel Colebank (University of South Carolina)
Uncertainty quantification (UQ) is an essential analysis tool for calibrating mathematical models to clinical data and applying them, yet many modeling studies do not address uncertainty in measurements, model structure, or parameter estimates. Measured data often include a combination of static, spatial, and time-series data, some of which are incomplete. Gaining more insight into how to assess uncertainty is essential for generating cardiovascular digital twins and for using models to extract biomarkers that can inform diagnosis and treatment planning. We will discuss three modeling applications: (1) closed-loop cardiovascular compartment (ODE) models informed by different data modalities, (2) distributed one-dimensional (1D) fluid dynamic network models informed in part by medical imaging geometric data, and (3) multiscale/multisystem models that combine multiple spatial or temporal scales (e.g., growth and remodeling over months) or multiple physiologic systems (e.g., autonomic control or inflammatory signaling).