Monday, January 12, 2026
Check In
11th Floor Collaborative Space
Welcome
11th Floor Lecture Hall
Brendan Hassett, ICERM/Brown University
Bayesian Varying-Effects Regression Models via Gaussian Process Priors
11th Floor Lecture Hall
Speaker
Marina Vannucci, Rice University
Session Chair
Yanxun Xu, Johns Hopkins University
Abstract
Traditional regression models typically assume linear relationships between predictors and response variables, often neglecting complex dependencies and variations across different observations or variables. However, in numerous real-world scenarios, predictors exhibit structured interactions, and their effects on an outcome variable may vary depending on additional covariates. I will introduce novel Bayesian varying-coefficients regression frameworks that employ Gaussian process priors to model non-linear predictor effects and variable selection priors to simultaneously select important predictors and modulating covariates. I will consider extensions to longitudinal data that use two-dimensional Gaussian processes to capture both time-evolving predictor effects and the influence of the covariates on these effects. Using simulation studies, I will show that integrating network information into feature selection improves the power to detect the true predictors, outperforming regularization-based approaches. I will also show applications to microbiome data.
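A schematic version of this model class (notation ours, not from the abstract): with response y_i, predictors x_{ij}, and modulating covariates z_i, a Bayesian varying-coefficient regression with Gaussian process priors and variable selection can be written as

y_i = \beta_0(z_i) + \sum_{j=1}^{p} \gamma_j \, \beta_j(z_i) \, x_{ij} + \varepsilon_i, \qquad \beta_j \sim \mathrm{GP}(0, \kappa_j), \qquad \gamma_j \in \{0, 1\},

where the binary indicators \gamma_j carry the selection prior and the GP-distributed coefficient functions \beta_j(\cdot) let each predictor's effect vary smoothly with the covariates; in the longitudinal extension sketched above, \beta_j(z) is replaced by a two-dimensional surface \beta_j(t, z) over time and covariates.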
Clustering Spatial Transcriptomics Data with Dirichlet Process Mixture of Random Spanning Trees
11th Floor Lecture Hall
Speaker
Yang Ni, Texas A&M University
Session Chair
Yanxun Xu, Johns Hopkins University
Abstract
Spatial transcriptomics has gained tremendous popularity as it allows researchers to map gene expression directly onto tissue architecture, preserving spatial context and providing high-resolution insights into cellular interactions and biological processes within their native environments. In this paper, we introduce a novel Bayesian nonparametric framework, Dirichlet process mixture of random spanning trees (DP-RST), designed to detect an unknown number of possibly non-convex clusters in possibly non-convex spatial domains. The model’s two-layer partitioning effectively addresses challenges posed by the intricate spatial organization of tissue samples, such as non-convex clusters and irregular spatial boundaries of the samples. Through simulation studies, DP-RST demonstrates superior clustering accuracy compared to existing methods. We apply DP-RST to our motivating mouse colonic dataset during healing from inflammatory damage, revealing meaningful clusters associated with different stages of tissue repair. Differential gene expression analysis highlights key genes with spatially distinct patterns, revealing the compartmentalization of immune, metabolic, and regenerative processes during mucosal healing. To demonstrate the broad applicability of DP-RST, we analyze four additional spatial transcriptomics datasets generated by the 10x Visium platform. Supplementary materials, including extended results and the code that implements our method, are available online.
Coffee Break
11th Floor Collaborative Space
Poisson process factorization for mutational signature analysis with genomic covariates
11th Floor Lecture Hall
Speaker
Jeffrey Miller, Harvard T.H. Chan School of Public Health
Session Chair
Peter Mueller, University of Texas at Austin
Abstract
Mutational signature analysis is a powerful technique for uncovering the mutational processes involved in cancer. Current approaches are based on non-negative matrix factorization (NMF); however, this ignores the non-homogeneous occurrence of mutations across the genome. We introduce a flexible new Bayesian method using Poisson point processes to model the activities of mutational signatures as they vary across the genome. Using covariate-dependent factorized intensity functions, our Poisson process factorization (PPF) model generalizes the standard NMF model to include regression coefficients that capture the effect of genomic features on the mutation rates from each latent process. Furthermore, our method employs sparsity-inducing hierarchical priors to automatically infer the number of active latent factors in the data. We present algorithms to obtain maximum a posteriori estimates and uncertainty quantification via Markov chain Monte Carlo. We demonstrate the method on simulated data and on real data from breast cancer.
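One plausible reading of the covariate-dependent factorized intensity (our notation, for illustration only): with genomic bins g, mutation categories m, signatures k, and genomic features x_g,

\lambda_{gm} = \sum_{k=1}^{K} \theta_{gk} \, \exp\!\big(x_g^{\top} \beta_k\big) \, \phi_{km}, \qquad Y_{gm} \sim \mathrm{Poisson}(\lambda_{gm}),

so the usual NMF factors (activities \theta and signature spectra \phi) are modulated by regression terms that let genomic features shift each latent process's mutation rate along the genome, while sparsity-inducing hierarchical priors on the activities prune unused signatures.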
Bridging AI and BNP for Layered Point Pattern Data Analysis
11th Floor Lecture Hall
Speaker
Qiwei Li, The University of Texas at Dallas
Session Chair
Peter Mueller, University of Texas at Austin
Abstract
Tissues are composed of cells organized into specialized compartments that, together with the extracellular matrix, form complex architectures. In the oral epithelium, basal cells divide and differentiate as they migrate toward the surface, forming stratified layers whose number and organization serve as key diagnostic markers in oral premalignant disorders (OPMD). Characterizing tissue morphological architecture through the spatial arrangement of diverse cell types provides critical insights into disease initiation, progression, and therapeutic response.
In this talk, we present a computational framework that bridges artificial intelligence (AI) and statistical modeling to infer layered cellular structures directly from oral tissue pathology images. Recent advances in AI enable automated segmentation and classification of millions of cell nuclei, producing rich spatial point pattern data at unprecedented scale. However, existing statistical tools lack a principled framework for inference on layered point patterns. To address this gap, we developed a Bayesian nonparametric (BNP) hierarchical model that formally tests the existence of a layered structure and, when present, estimates the number of layers while quantifying the associated uncertainty. Our approach employs a mixture of finite mixtures (MFM) framework to assign cells to ordered layers without predefining their number, and a generalized Beta distribution to effectively characterize cell nuclear shape.
Simulation studies demonstrate that our method surpasses existing clustering benchmarks, achieves linear scalability with respect to cell count, and maintains high accuracy even when using only 5% of nuclear pixels, highlighting its robustness and computational efficiency for large-scale pathology data. In an analysis of 128 OPMD patients from UT MDACC, we further establish a significant clinical association between the estimated epithelial layer count and dysplasia severity (p = 0.001).
Truncated Posterior Inference for Bayesian Nonparametrics
11th Floor Lecture Hall
Speaker
Trevor Campbell, The University of British Columbia
Session Chair
Igor Pruenster, Bocconi University
Abstract
Bayesian nonparametric (BNP) models involve an infinite latent collection of parameters, which enables observations to reflect a growing number of parameters as more data are collected. But while these infinitely many parameters provide significant flexibility, they result in very challenging computational problems for posterior inference methods. One approach involves approximating the nonparametric model with a parametric one (a truncation), and subsequently applying a standard inference algorithm. While this approach is practical, parametric truncation leads to unknown posterior approximation error. In this talk, I will introduce a new technique for posterior inference with general truncated completely random measure priors that includes estimates of the posterior truncation error. Applications include models where feature assignment variables are observed (e.g., edge-exchangeable network models) and unobserved (e.g., latent feature assignment models).
Coffee Break
11th Floor Collaborative Space
Scaling Up Bayesian Neural Networks with Neural Networks
11th Floor Lecture Hall
Speaker
Babak Shahbaba, University of California, Irvine
Session Chair
Noirrit Chandra, University of Texas at Dallas
Abstract
Bayesian Neural Networks (BNNs) offer a principled and natural framework for proper uncertainty quantification in the context of deep learning. They address the typical challenges associated with conventional deep learning methods, such as data insatiability, ad-hoc nature, and susceptibility to overfitting. However, their implementation typically relies either on Markov chain Monte Carlo (MCMC) methods, which are characterized by their computational intensity and inefficiency in high-dimensional spaces, or on variational inference methods, which tend to underestimate uncertainty. To address this issue, we propose a novel Calibration-Emulation-Sampling (CES) strategy to significantly enhance the computational efficiency of BNNs. In this framework, during the initial calibration stage, we collect a small set of samples from the parameter space. These samples serve as training data for the emulator, which approximates the map between parameters and posterior probability. The trained emulator is then used for sampling from the posterior distribution at substantially higher speed than standard BNN inference. Using simulated and real data, we demonstrate that our proposed method improves the computational efficiency of BNNs while maintaining similar performance in terms of prediction accuracy and uncertainty quantification.
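As a rough illustration of the three CES stages (the toy target, the quadratic emulator, and all names below are ours, not from the talk), the workflow can be mocked up in a few lines:

```python
# Minimal sketch of the Calibration-Emulation-Sampling (CES) idea on a toy 2-D
# target; the emulator here is a quadratic least-squares fit rather than the
# neural-network emulator described in the talk, and the target is synthetic.
import numpy as np

def log_post(theta):                        # stand-in for an expensive BNN log posterior
    return -0.5 * np.sum((theta - 1.0) ** 2)

rng = np.random.default_rng(0)

# 1) Calibration: a small set of parameter draws with their (expensive) log posteriors
thetas = rng.normal(size=(200, 2))
targets = np.array([log_post(t) for t in thetas])

# 2) Emulation: cheap surrogate mapping parameters -> log posterior (quadratic features)
def features(t):
    t1, t2 = t[..., 0], t[..., 1]
    return np.stack([np.ones_like(t1), t1, t2, t1 * t2, t1**2, t2**2], axis=-1)

coef, *_ = np.linalg.lstsq(features(thetas), targets, rcond=None)
emulate = lambda t: features(t) @ coef      # surrogate log posterior, evaluated instantly

# 3) Sampling: random-walk Metropolis on the emulator instead of the true posterior
theta, lp = np.zeros(2), emulate(np.zeros(2))
draws = []
for _ in range(5000):
    prop = theta + 0.5 * rng.normal(size=2)
    lp_prop = emulate(prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        theta, lp = prop, lp_prop
    draws.append(theta.copy())
print(np.mean(draws, axis=0))               # should sit near the mode at (1, 1)
```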
Exact Gibbs sampling for SDEs with unit diffusion coefficient
11th Floor Lecture Hall
Speaker
Vinayak Rao, Purdue University
Session Chair
Noirrit Chandra, University of Texas at Dallas
Abstract
Stochastic differential equations (SDEs) are an important class of time-series models, used to describe systems evolving stochastically in continuous time. Simulating paths from these processes, particularly after conditioning on noisy observations of the latent path, however, remains challenging. Existing methods often introduce bias through time-discretization, involve complicated rejection sampling schemes, or are restricted to a narrow family of diffusions (Wang et al. (2020)), constraining their applicability. In this work, we propose an exact Markov chain Monte Carlo (MCMC) sampling algorithm that broadens the applicability of Wang et al. (2020). Building on the Gibbs sampler framework from that paper, we now allow exact MCMC for diffusions belonging to the so-called classes EA2 and EA3. Our methodology is thus applicable to essentially any SDE with unit diffusion coefficient (and, through a variance-stabilizing transform, to essentially any 1-d SDE). We demonstrate how our MCMC methodology allows us to order computations so that only fairly straightforward simulation steps are needed. Our framework also allows tools from the Gaussian process literature to be applied straightforwardly. We evaluate our method on both synthetic and real datasets, demonstrating superior performance compared to a number of baselines.
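For context on the unit-diffusion restriction: the variance-stabilizing (Lamperti) transform referenced above works, for a scalar SDE dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t with smooth \sigma > 0, by setting

Y_t = \eta(X_t), \qquad \eta(x) = \int^{x} \frac{du}{\sigma(u)},

which, by Itô's formula, gives dY_t = \alpha(Y_t)\,dt + dW_t with \alpha(y) = b(x)/\sigma(x) - \sigma'(x)/2 evaluated at x = \eta^{-1}(y); this is why a method for unit-diffusion SDEs covers essentially any one-dimensional SDE.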
Reception
11th Floor Collaborative Space
Tuesday, January 13, 2026
Beyond Schrödinger Bridges for Learning Trajectories from Snapshots
11th Floor Lecture Hall
Speaker
Renato Berlinghieri, MIT
Session Chair
Igor Pruenster, Bocconi University
Abstract
Inferring and forecasting latent stochastic dynamics (trajectories) from snapshot data (measurements at different time points across a population, but only one time point per subject) is a critical challenge in areas like single-cell biology. Existing deep learning methods, often based on Schrödinger bridges (SBs), are limited: they either interpolate between only two time points, failing to capture long-term dependencies, or require a pre-set, fixed reference dynamic.
We introduce two novel frameworks to overcome these limitations. The first method successfully learns trajectories from multiple time points using only a family of reference dynamics, enhancing trajectory reconstruction. The second, SnapMMD, uses a maximum mean discrepancy (MMD) loss to directly fit the joint state-time distribution. Crucially, SnapMMD allows us to infer unknown and state-dependent volatilities from the data, which is essential for accurate forecasting beyond the observed time horizon. We demonstrate that both approaches significantly improve upon state-of-the-art methods in trajectory inference, velocity-field reconstruction, and forecasting across a variety of real and synthetic experiments.
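For reference, the maximum mean discrepancy used by SnapMMD is the standard kernel discrepancy: for a positive-definite kernel k,

\mathrm{MMD}^2(P, Q) = \mathbb{E}\,k(X, X') - 2\,\mathbb{E}\,k(X, Y) + \mathbb{E}\,k(Y, Y'), \qquad X, X' \sim P, \; Y, Y' \sim Q,

estimated from samples by the corresponding U- or V-statistic; as we read the abstract, the two distributions being compared are the simulated and observed joint state-time distributions.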
Subjective Exchangeable Partition Priors via Integer Partitions
11th Floor Lecture Hall
Speaker
David Dahl, Brigham Young University
Session Chair
Tamara Broderick, Massachusetts Institute of Technology
Abstract
We introduce a class of exchangeable random partition models that allow direct specification of prior beliefs about both the number of clusters and the distribution of cluster sizes. In contrast to random partitions induced by Bayesian nonparametric priors---such as the Dirichlet-process Chinese restaurant process and its Pitman-Yor generalization, where a small number of global parameters jointly determine both the number of clusters and the size distribution---our framework places an explicit prior on the number of clusters and a separate prior on the cluster-size profile. We use Lorenz curves and introduce a family of distributions on integer partitions that accommodate exponential- or power-law tails. Conditional on an integer partition, we place a uniform distribution over all set partitions consistent with it, preserving exchangeability and separating prior elicitation from label combinatorics. This yields exchangeable priors in which the number of clusters and the size profile are specified separately. We treat n as fixed and do not require sampling consistency across n. Computation reduces to repeated evaluation and sampling from bounded-support univariate pmfs with closed-form normalizing constants.
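To make the two-stage construction concrete (our notation): the number of set partitions of n items whose multiset of cluster sizes equals an integer partition \lambda = (\lambda_1, \dots, \lambda_k), with m_j parts of size j, is n! / (\prod_i \lambda_i! \, \prod_j m_j!). The uniform-conditional step therefore assigns to a set partition \rho with size profile \lambda(\rho) the probability

p(\rho) = p(K = k) \, p\big(\lambda(\rho) \mid K = k\big) \, \frac{\prod_i \lambda_i! \, \prod_j m_j!}{n!},

which depends on \rho only through its cluster-size profile and is therefore exchangeable by construction.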
Coffee Break
11th Floor Collaborative Space
Completely random measures on the non-negative orthant: Bayesian nonparametric priors for multiple populations
11th Floor Lecture Hall
Session Chair
Peter Orbanz, Columbia University
Abstract
Completely random measures (CRMs) have been well-studied as convenient priors for Bayesian nonparametrics (BNP). It is common to define a CRM prior on a countably infinite set of rates via a Poisson point process with rate measure on the non-negative real line. This prior is often paired with a count likelihood. We consider the case where the rate measure is instead defined over the non-negative orthant, which can be interpreted as generating a vector of rates in D dimensions. For instance, these could represent the rates of genetic variants in D populations, and might be paired with a Bernoulli likelihood for a full generative model of genetic variants. We show that, surprisingly, the choice of a rate measure that factorizes across dimensions fails to satisfy natural BNP desiderata: roughly, that each sample is finite but that there are always more features to discover in each dimension. We propose an alternative construction that satisfies these desiderata while maintaining exponential conjugacy. We develop tools to characterize the behavior of the number of observed features as the sample size grows across dimensions, and we provide conditions that dictate realistic power-law growth.
Hierarchical Random Measures without Tables
11th Floor Lecture Hall
Speaker
Marta Catalano, Luiss University
Session Chair
Peter Orbanz, Columbia University
Abstract
Bayesian multilevel models provide an effective framework to borrow information between different data sources through the sharing of common features. In a nonparametric setting, a classic example is the hierarchical Dirichlet process, whose generative model can be described through a set of latent variables, commonly referred to as tables in the popular restaurant franchise metaphor. The latent tables greatly simplify the expression of the posterior and allow for the implementation of a Gibbs sampling algorithm to approximately draw samples from it. However, managing their assignments can become computationally expensive, especially as the size of the dataset and the number of levels increase. In this talk, we identify a prior for the concentration parameter of the hierarchical Dirichlet process that (i) induces a quasi-conjugate posterior distribution, and (ii) removes the need for tables, leading to more interpretable expressions for the posterior, with both a scalable and an exact algorithm to sample from it. This construction extends beyond the Dirichlet process, leading to a new framework for defining normalized hierarchical random measures and a new class of algorithms to sample from their posteriors. This is joint work with Claudio Del Sole (Bicocca University).
Bayesian Mixture Models for Histograms with Applications to Large Datasets
11th Floor Lecture Hall
Speaker
Fernando Quintana, Pontifical Catholic University of Chile
Session Chair
Marina Vannucci, Rice University
Abstract
In many real-world scenarios, especially those involving privacy constraints or data summarization, data are available only in aggregated forms such as histograms or frequency tables. This work introduces a novel Bayesian method for inferring the underlying population distribution by fitting a mixture model to binned data. While we focus on mixtures of normal distributions, the framework is flexible and can be extended to other distributional families. We place a prior on the number of mixture components, accommodating both finite and countably infinite mixtures, and perform inference using reversible jump MCMC. The proposed approach demonstrates strong performance on large-scale data, showcasing the potential of nonparametric Bayesian modeling in practical applications. Furthermore, we extend the method to model multiple histograms simultaneously and to cluster them using the Dirichlet process. This enables information sharing across populations and provides a principled posterior probability for assessing homogeneity between groups. Some theoretical results supporting the performance of the proposed methodology are also discussed.
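A sketch of the binned-data likelihood (our notation): if bin b has boundaries (l_b, u_b] and count n_b, a K-component normal mixture with weights w_k induces bin probabilities

\pi_b = \sum_{k=1}^{K} w_k \Big\{ \Phi\big(\tfrac{u_b - \mu_k}{\sigma_k}\big) - \Phi\big(\tfrac{l_b - \mu_k}{\sigma_k}\big) \Big\}, \qquad (n_1, \dots, n_B) \sim \mathrm{Multinomial}\big(\textstyle\sum_b n_b, (\pi_1, \dots, \pi_B)\big),

and reversible jump moves update K together with the component parameters; in the multiple-histogram extension, as we read the abstract, a Dirichlet process prior over histogram-specific mixtures allows histograms assigned to the same mixture to be clustered together.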
Coffee Break
11th Floor Collaborative Space
Conformalized Bayesian Inference, with Applications to Random Partition Models
Lightning Talks - 11th Floor Lecture Hall
Speaker
Nicola Bariletto, University of Texas at Austin
Session Chair
Ramses Mena, National Autonomous University of Mexico (UNAM)
Abstract
Bayesian posterior distributions naturally represent parameter uncertainty informed by data. However, when the parameter space is complex, as in many nonparametric settings where it is infinite-dimensional or combinatorially large, standard summaries such as posterior means, credible intervals, or simple notions of multimodality are often unavailable, hindering interpretable posterior uncertainty quantification. We introduce Conformalized Bayesian Inference (CBI), a broadly applicable and computationally efficient framework for posterior inference on nonstandard parameter spaces. CBI yields a point estimate, a credible region with assumption-free posterior coverage guarantees, and a principled analysis of posterior multimodality, requiring only Monte Carlo samples from the posterior and a notion of discrepancy between parameters. The method builds a pseudo-density score for each parameter value, yielding a MAP-like point estimate and a credible region derived from conformal prediction principles. The key conceptual step underlying this construction is the reinterpretation of posterior inference as prediction on the parameter space. A final density-based clustering step identifies representative posterior modes. We investigate a number of theoretical and methodological properties of CBI and demonstrate its practicality, scalability, and versatility in simulated and real data clustering applications with random partition models.
Bayesian model criticism using uniform parametrization checks
Lightning Talks - 11th Floor Lecture Hall
Speaker
Christian Covington, Harvard University
Session Chair
Ramses Mena, National Autonomous University of Mexico (UNAM)
Abstract
Models are often misspecified in practice, making model criticism a key part of Bayesian analysis. It is important to detect not only when a model is wrong, but which aspects are wrong, and to do so in a computationally convenient and statistically rigorous way. We introduce a novel method for model criticism based on the fact that if the parameters are drawn from the prior, and the dataset is generated according to the assumed likelihood, then a sample from the posterior will be distributed according to the prior. Thus, departures from the assumed likelihood or prior can be detected by testing whether a posterior sample could plausibly have been generated by the prior. Building upon this idea, we propose to reparametrize all random elements of the likelihood and prior in terms of independent uniform random variables, or u-values. This makes it possible to aggregate across arbitrary subsets of the u-values for data points and parameters to test for model departures using classical hypothesis tests for dependence or non-uniformity. We demonstrate empirically how this method of uniform parametrization checks (UPCs) facilitates model criticism in several examples, and we develop supporting theoretical results.
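A toy version of the check, in a conjugate normal-normal model of our own choosing (purely illustrative; here only the parameter's u-value is tested, whereas the full method also maps the data's random elements to u-values):

```python
# Toy uniform parametrization check (UPC): under the assumed prior and likelihood,
# a single posterior draw is marginally distributed as the prior, so its prior-CDF
# transform should be Uniform(0,1). Model and test choices here are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mu0, tau0, sigma, n = 0.0, 1.0, 1.0, 20          # prior mean/sd, data sd, sample size

u_values = []
for _ in range(2000):
    theta = rng.normal(mu0, tau0)                # parameter drawn from the prior
    x = rng.normal(theta, sigma, size=n)         # data drawn from the assumed likelihood
    prec = 1 / tau0**2 + n / sigma**2            # exact conjugate posterior for theta
    post_mean = (mu0 / tau0**2 + x.sum() / sigma**2) / prec
    theta_post = rng.normal(post_mean, np.sqrt(1 / prec))   # one posterior draw
    u_values.append(stats.norm.cdf(theta_post, loc=mu0, scale=tau0))  # u-value

print(stats.kstest(u_values, "uniform"))         # large p-value: no evidence of misfit
```

Deliberately misspecifying the likelihood (for instance, generating x with a larger variance than the model assumes) pushes the u-values away from uniformity and makes the test reject.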
Bayesian Nonparametrics for Causal Inference under multiple treatments
Lightning Talks - 11th Floor Lecture Hall
Speaker
Sebastiano Bianchi, Università Bocconi
Session Chair
Ramses Mena, National Autonomous University of Mexico (UNAM)
Abstract
We propose a Bayesian nonparametric approach for estimating the heterogeneous treatment effects (HTE) in the context of causal inference problems under multiple treatments, in particular settings with a placebo group and K active treatments, K>=2. Upon a slight manipulation of the input data, we regress nonparametrically the outcome of interest on the covariates, the regression function being the HTE. We incorporate the available information concerning the presence of placebo subjects, for whom we expect a null HTE, into the prior elicitation by considering a Pitman-Yor process with spike-and-slab baseline measure defined as a linear combination of a point mass and a diffuse Gaussian Process (GP) on a suitable space of functions. Moreover, we enforce matching on the generalised propensity scores (GPS) directly at the level of the GP covariance function thanks to a newly-defined stationary kernel.
What’s the Pattern: A Bayesian Nonparametric Fusion of Feature Allocation and Partition Models for Joint Inference on Patient Comorbidity Dynamics and Cognitive Aging
Lightning Talks - 11th Floor Lecture Hall
Speaker
Arhit Chakrabarti, Texas A&M University
Session Chair
Ramses Mena, National Autonomous University of Mexico (UNAM)
Abstract
A critical step in cognitive aging research is the identification of subtypes of longitudinal cognitive trajectories. Characterizing such heterogeneity is essential for understanding the mechanisms underlying cognitive decline and for designing personalized strategies to delay or prevent cognitive impairment in later life. In real-world settings, individuals frequently experience multiple comorbid conditions such as cardiovascular disease, metabolic disorders, or mental health conditions that may arise at different stages across the life course. The timing of onset and progression of these comorbidities can substantially modulate patterns of cognitive change, leading to marked variability in both the rate and form of cognitive decline. Consequently, the simultaneous identification of comorbidity-related age effects and subtypes of cognitive trajectories is crucial for accurate risk stratification and targeted intervention. In this paper, we propose a novel Bayesian nonparametric approach integrating feature allocation and clustering to detect the underlying latent features related to patients’ age of onset of comorbid conditions that can possibly aid in identifying relevant patterns in the prognosis of their cognitive outcomes. We illustrate our proposed model with extensive simulations and an application to our motivating dataset. Our approach identifies seven latent comorbidity-onset features and eleven distinct subtypes of longitudinal cognitive trajectories, highlighting the model’s ability to capture complex heterogeneity in cognitive aging.
Bayesian Multiple Multivariate Density-Density Regression
Lightning Talks - 11th Floor Lecture Hall
Speaker
Khai Nguyen, University of Texas at Austin
Session Chair
Ramses Mena, National Autonomous University of Mexico (UNAM)
Abstract
We propose the first approach for multiple multivariate density–density regression (MDDR), enabling the regression of a multivariate density–valued response on multiple multivariate density–valued predictors. The core idea is to define a fitted distribution using a sliced Wasserstein barycenter (SWB) of push-forwards of the predictors and to quantify deviations from the observed response using the sliced Wasserstein (SW) distance. Regression functions, which map predictors’ supports to the response support, and barycenter weights are inferred within a generalized Bayes framework, enabling principled uncertainty quantification without requiring a fully specified likelihood. The inference process can be seen as an instance of an inverse SWB problem. We establish theoretical guarantees, including the stability of the SWB under perturbations of marginals and barycenter weights, sample complexity of the generalized likelihood, and posterior consistency. For practical inference, we introduce a differentiable approximation of the SWB and a smooth reparameterization to handle the simplex constraint on barycenter weights, allowing efficient gradient-based MCMC sampling. We demonstrate MDDR in an application to inference for population-scale single-cell data. Posterior analysis under the MDDR model in this example includes inference on communication between multiple source/sender cell types and a target/receiver cell type. The proposed approach provides accurate fits, reliable predictions, and interpretable posterior estimates of barycenter weights, which can be used to construct sparse cell-cell communication networks.
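For reference, the sliced Wasserstein distance used above averages one-dimensional Wasserstein distances over projection directions:

\mathrm{SW}_p^p(\mu, \nu) = \int_{\mathbb{S}^{d-1}} W_p^p\big(\theta_{\#}\mu, \theta_{\#}\nu\big) \, d\sigma(\theta),

where \theta_{\#}\mu is the pushforward of \mu under x \mapsto \langle \theta, x \rangle and \sigma is the uniform measure on the sphere; the SW barycenter of measures \nu_1, \dots, \nu_m with weights w_1, \dots, w_m is then the minimizer of \sum_j w_j \, \mathrm{SW}_p^p(\cdot, \nu_j).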
Bayesian Multi-View Clustering of Computer Mouse Tracking and ERP Data via a Joint Random Partition Model
Lightning Talks - 11th Floor Lecture Hall
Speaker
Ziyi Song, University of California, Irvine
Session Chair
Ramses Mena, National Autonomous University of Mexico (UNAM)
Abstract
Neurobehavioral studies increasingly collect multiple data modalities on the same subjects, such as behavioral trajectories and neurocognitive measurements, to better understand individual decision-making processes. Identifying subject subgroups from such multi-view data is challenging because clustering structures could be related across modalities but not identical. We develop a Bayesian multi-view clustering framework that jointly analyzes computer mouse-tracking trajectories and event-related potential (ERP) waveforms while allowing modality-specific clustering patterns. Our approach is based on a joint random partition prior that induces dependence between view-specific subject partitions through a penalty on their dissimilarity. This construction encourages aligned clustering across data views without forcing exact agreement, providing a flexible compromise between fully shared and fully independent partitions. The degree of cross-view dependence is governed by an interpretable penalty parameter, for which we conduct fully Bayesian inference. Posterior computation is carried out using Markov chain Monte Carlo methods incorporating split–merge updates tailored to the multi-view setting and an exchange algorithm to address the doubly intractable normalizing constants. We apply the proposed method to data from an intervention study of addicted smokers, jointly analyzing their computer mouse-tracking trajectories of movement dynamics and ERP waveforms of neuroaffective processing. The resulting clusters reveal distinct subject subgroups characterized by differing responses to cigarette-related cues, highlighting both concordance and divergence between behavioral and neural responses of cue reactivity. The proposed framework is broadly applicable to multi-modal clustering problems involving dependent but non-identical partitions across multiple data views.
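Schematically (our notation, and only one way to realize the construction described), a dissimilarity-penalized joint partition prior takes the form

p(\rho_1, \rho_2) \propto p_1(\rho_1) \, p_2(\rho_2) \, \exp\{-\lambda \, d(\rho_1, \rho_2)\},

where \rho_1 and \rho_2 are the view-specific partitions, d is a partition dissimilarity, and \lambda \ge 0 interpolates between fully independent partitions (\lambda = 0) and strongly aligned ones; the \lambda-dependent normalizing constant is what makes the model doubly intractable and motivates the exchange algorithm mentioned above.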
Wednesday, January 14, 2026
Recursive estimation for mixtures or why zooming out is a good idea
11th Floor Lecture Hall
Speaker
Bernardo Flores, The University of Texas at Austin
Session Chair
Jeffrey Miller, Harvard T.H. Chan School of Public Health
Abstract
Bayesian nonparametric mixture models provide a flexible framework for data analysis but are often hindered by the computational expense of traditional inference methods like MCMC. A fast, recursive algorithm proposed by Newton (2002) offers a practical alternative, yet its formal connection to Bayesian inference and its theoretical properties remain only partially understood. This paper reveals a new geometric interpretation of this classic method. We demonstrate that Newton's recursion is a discrete-time approximation of a gradient flow on the space of probability measures, governed by the Hellinger geometry. This perspective not only provides a principled theoretical foundation for the algorithm but also allows us to generalize it. By framing estimation as the minimization of an energy functional on a statistical manifold, we derive a new family of algorithms by modifying the underlying geometry and discretization. Applications include bootstrapping, dependent mixtures, and repulsive mixtures.
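For readers unfamiliar with it, Newton's predictive recursion updates an estimate f_i of the mixing density after each observation y_i via

f_i(\theta) = (1 - w_i) \, f_{i-1}(\theta) + w_i \, \frac{k(y_i \mid \theta) \, f_{i-1}(\theta)}{\int k(y_i \mid u) \, f_{i-1}(u) \, du},

with kernel k and weights w_i decreasing to zero; the talk's contribution is to read this update as a discrete-time step of a gradient flow on the space of probability measures under the Hellinger geometry, which then suggests new variants obtained by changing the geometry or the discretization.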
Fast non-reversible samplers for Bayesian mixture models
11th Floor Lecture Hall
Speaker
Filippo Ascolani, Duke University
Session Chair
Jeffrey Miller, Harvard T.H. Chan School of Public Health
Abstract
Finite and infinite mixtures are a cornerstone of Bayesian modelling, and it is well-known that sampling from the resulting posterior distribution can be a hard task. In particular, popular reversible Markov chain Monte Carlo schemes are often slow to converge when the number of observations is large. In this paper we introduce a novel and simple non-reversible sampling scheme for Bayesian mixture models, which is shown to drastically outperform classical samplers in many scenarios of interest, especially during the convergence phase and when components in the mixture have non-negligible overlap.
At the theoretical level, we show that the performance of the proposed non-reversible scheme cannot be worse than the standard one, in terms of asymptotic variance, by more than a constant factor, and we provide a scaling limit analysis suggesting that the non-reversible sampler can reduce the convergence time by an order of magnitude. We also discuss why the statistical features of mixture models make them an ideal case for the use of non-reversible discrete samplers.
Coffee Break
11th Floor Collaborative Space
Dynamic Random Partitions: Applications, Opportunities, and Challenges
11th Floor Lecture Hall
Speaker
Michele Guindani, University of California, Irvine
Session Chair
Trevor Campbell, The University of British Columbia
Abstract
Random partition models are a fundamental tool for Bayesian clustering and mixture modeling. Recent work has begun to treat the partition itself as a dynamic object, opening up new possibilities for modeling time-varying dependence structures in complex data, while also raising distinctive modeling, computational, and inferential challenges. In this talk, I will illustrate these opportunities and challenges through a few recent applications. Examples include local level dynamic random partition models for change point detection, which include a Markovian evolution of partitions within a state-space framework and couple it with non-marginal false discovery rate control; Bayesian temporal biclustering methods for multi-subject neuroscience studies, which jointly partition subjects and time into evolving profiles; and Bayesian semiparametric models that decode neuronal ensembles as spatially structured partitions of large populations of neurons from calcium imaging data. Together, these examples highlight both the promise of dynamic random partition models for representing evolving clustering structures and the open challenges on prior specification and scalable computation.
Building faster and more expressive BART models
11th Floor Lecture Hall
Speaker
Sameer Deshpande, University of Wisconsin-Madison
Session Chair
Trevor Campbell, The University of British Columbia
Abstract
Bayesian Additive Regression Trees (BART) is a highly effective nonparametric regression model that approximates unknown functions with a sum of axis-aligned binary regression trees (i.e., piecewise-constant step functions) that one-hot encode categorical predictors. Consequently, the original BART model is fundamentally limited in its ability to (i) "borrow strength" across multiple levels of a categorical predictor and (ii) exploit structural relationships between multiple categorical predictors (e.g., adjacency and nesting). I will introduce new decision rule priors that overcome these limitations and open the door to fitting non-linear multilevel models with regression tree ensembles. I will also describe a new software package that unifies several existing BART extensions and allows users to fit increasingly expressive BART models without having to implement bespoke samplers.
Group Photo (Immediately After Talk)
11th Floor Lecture Hall
Polymorphic Vectorization
11th Floor Lecture Hall
Speaker
Peter Orbanz, Columbia University
Session Chair
Michele Guindani, University of California, Irvine
Abstract
Vectorization in GPUs is a specific form of parallelization that, loosely speaking, executes the same code on different inputs. This generally makes it hard to use vector hardware to parallelize tasks that are polymorphic, in the sense that the required sequence of instructions differs between tasks. I will explain an approach to this problem that augments the state space of the executed program, and will sketch two applications: (i) To certain MCMC algorithms such as slice sampling or HMC-NUTS, where threads differ in the number of times an inner while loop must be executed. (ii) To mechanical design problems, where each thread must optimize a different part of a coupled mechanical system.
Joint work with Ryan P Adams, Joshua Aduol, Hugh Dance, Pierre Glaser, and Alex Guerra.
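A minimal illustration of the state-augmentation idea (a toy of ours, far simpler than the MCMC and design applications above): a batch of tasks whose inner while-loops run for different numbers of steps can be executed in lockstep by adding an 'active' flag to each task's state and masking the update.

```python
# Toy state augmentation: each "thread" halves its value until it drops below 1,
# so the number of loop iterations differs across threads. Augmenting the state
# with an `active` mask lets every thread execute the same instructions.
import numpy as np

x = np.array([100.0, 3.0, 17.0, 1.5])    # per-thread state
active = np.ones_like(x, dtype=bool)      # augmented state: which threads still loop

while active.any():                       # one shared control flow for all threads
    x = np.where(active, x / 2.0, x)      # masked update: inactive threads unchanged
    active = active & (x >= 1.0)          # threads retire as their loop condition fails

print(x)                                  # each thread has completed its own while-loop
```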
Coffee Break
11th Floor Collaborative Space
Clustering with shot-noise Cox Process Mixture Models
11th Floor Lecture Hall
Speaker
Federico Camerlenghi, University of Milano-Bicocca
Session Chair
Matthew Heiner, Brigham Young University
Abstract
The study of almost surely discrete random probability measures is an active line of research in Bayesian nonparametrics. The idea of assuming interaction among the atoms of a random probability measure has recently spurred significant interest in the context of Bayesian mixture models, allowing the definition of priors that encourage well-separated and interpretable clusters. In this talk, we provide a unified framework for the construction and Bayesian analysis of random probability measures with interacting atoms, encompassing both repulsive and attractive behaviors. We develop a full Bayesian analysis without making any assumptions about the finite point process that governs the atoms of the random measure.
We then focus on a clever choice of the underlying finite point process, leading to shot-noise Cox process mixture models. We show that assuming a shot-noise Cox process for the mixture locations yields tractable theory, efficient algorithms, and a novel notion of clusters that may consist of multiple mixture components with similar parameters.
We also demonstrate how this construction can be extended to cluster observations divided into groups, giving rise to the hierarchical shot-noise Cox process (HSNCP) mixture model. Previously proposed models allow for clustering across groups by sharing atoms in the group-specific mixing measures. However, exact atom sharing can be overly rigid when groups differ subtly, introducing a trade-off between clustering and density estimation and fragmenting across-group clusters. We show how the HSNCP overcomes this density-clustering trade-off. Simulation studies and a real data application showcase the usefulness of our proposal.
On a novel exact representation of species sampling processes
11th Floor Lecture Hall
Speaker
Ramses Mena, National Autonomous University of Mexico (UNAM)
Session Chair
Matthew Heiner, Brigham Young University
Abstract
We revisit species sampling priors from a computational perspective and show that they can be treated through simple representations that retain their full predictive structure. This enables posterior inference with off-the-shelf algorithms, yielding straightforward MCMC implementations and accessible expressions for predictive laws and partition distributions, without resorting to ad hoc numerical approximations.
Thursday, January 15, 2026
M-posteriors: frequentist guarantees and robustness properties
11th Floor Lecture Hall
Speaker
Marco Avella-Medina, Columbia University
Session Chair
Antonio Lijoi, Bocconi University
Abstract
We provide a theoretical framework for a wide class of generalized posteriors that can be viewed as the natural Bayesian posterior counterpart of the class of M-estimators in the frequentist world. We call the members of this class M-posteriors and show that they are asymptotically normally distributed under mild conditions on the M-estimation loss and the prior. In particular, an M-posterior contracts in probability around a normal distribution centered at an M-estimator, showing frequentist consistency and suggesting some degree of robustness depending on the reference M-estimator. We formalize the robustness properties of the M-posteriors by a new characterization of the posterior influence function and a novel definition of breakdown point adapted for posterior distributions. We illustrate the wide applicability of our theory in various popular models and demonstrate their empirical relevance in some numerical examples.
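One common form of such a generalized posterior (notation ours): given a loss \rho defining the M-estimator \hat\theta_n = \arg\min_\theta \sum_i \rho(x_i, \theta), the corresponding M-posterior is

\pi_n(\theta \mid x_{1:n}) \propto \pi(\theta) \, \exp\Big\{ -\sum_{i=1}^{n} \rho(x_i, \theta) \Big\},

which reduces to the ordinary Bayesian posterior when \rho is the negative log-likelihood and inherits robustness from bounded-influence choices of \rho, such as Huber-type losses.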
Anytime valid and asymptotically optimal statistical inference driven by predictive recursion
11th Floor Lecture Hall
Speaker
Vaidehi Dixit, University of Nottingham
Session Chair
Antonio Lijoi, Bocconi University
Abstract
Distinguishing two classes of candidate models is a fundamental and practically important problem in statistical inference. Error rate control is crucial to the logic but, in complex nonparametric settings, such guarantees can be difficult to achieve, especially when the stopping rule that determines the data collection process is not available. My talk is based on the construction of e-processes in Bayesian and quasi-Bayesian settings. In particular, we develop a novel e-process construction that leverages the so-called predictive recursion (PR) algorithm. The proposal is based on constructing a marginal likelihood by mixing over a specified class of distributions. Such a likelihood could be constructed in a Bayesian way by introducing a prior on the class of distributions in the alternative and finding the corresponding Bayesian marginal likelihood. But implementing a purely Bayesian strategy to account for nonparametric aspects of applications can be computationally demanding. The PR algorithm arises as an approximation to the posterior mean of the mixing distribution under the Dirichlet process prior and is hence able to rapidly and recursively fit nonparametric mixture models. The resulting PRe-process affords anytime-valid inference uniformly over stopping rules and is shown to be efficient in the sense that it achieves the maximal growth rate under the alternative relative to the mixture model being fit by PR.
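Schematically (our notation), an e-process of this kind compares one-step-ahead predictive densities under the fitted alternative to the null density p_0:

E_n = \prod_{i=1}^{n} \frac{q_{i-1}(X_i)}{p_0(X_i)},

where q_{i-1} is the predictive density for X_i given X_1, \dots, X_{i-1}; plugging in the predictive recursion fit for q_{i-1} gives the PRe-process. For a simple null, E_n is a nonnegative martingale with unit expectation under the null, so its expectation at any stopping time is at most one, which is exactly the anytime-validity property described above.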
Coffee Break
11th Floor Collaborative Space
Scalable Slice Sampling for (Hierarchical) Dirichlet Process Mixtures
11th Floor Lecture Hall
Speaker
Beatrice Franzolini, Bocconi University
Session Chair
Tamara Broderick, Massachusetts Institute of Technology
Abstract
Markov chain Monte Carlo algorithms for Dirichlet process-based models typically rely on either marginalization or truncation, with the latter yielding approximate inference and introducing non-vanishing bias in the resulting partition structure. Slice sampling provides a notable exception: by introducing auxiliary slice variables, it avoids marginalization and enables exact posterior inference. However, standard slice samplers require an unbounded number of operations per iteration, making it difficult to control the computational cost and practical scalability. To formally quantify this cost, we derive high-probability asymptotic bounds on the complexity of slice sampling in Dirichlet process mixture models, showing that, under general cluster-growth regimes, the overhead introduced by slice variables is at most an additive O_p(ln n).

Building on this result, we propose a novel, exact "hybrid" slice-sampling algorithm for posterior inference in hierarchical Dirichlet process (HDP) mixture models that combines the strengths of conditional slice samplers with marginal Chinese Restaurant Franchise representations. We also provide a theoretical analysis of the algorithm’s scalability, alongside that of competing methods. The proposed sampler dynamically instantiates only the minimal number of global atoms and tables required for exact updates, thereby enabling finite-dimensional updates without introducing systematic truncation bias. In numerical experiments, the hybrid sampler achieves substantial per-iteration cost reductions relative to existing exact HDP algorithms, and moderate reductions relative to approximate HDP procedures, while simultaneously improving mixing and inferential performance.
[This is joint work with F. Gaffi.]
Exchangeable random permutations with an application to Bayesian graph matching
11th Floor Lecture Hall
Speaker
Francesco Gaffi, University of Bergamo
Session Chair
Tamara Broderick, Massachusetts Institute of Technology
Abstract
We introduce a general Bayesian framework for graph matching grounded in a new theory of exchangeable random permutations. Leveraging the cycle representation of permutations and the literature on exchangeable random partitions, we define, characterize, and study the structural and predictive properties of these distributions. A novel sequential metaphor—the position-aware generalized Chinese restaurant process—provides a constructive foundation for this theory and supports practical algorithmic design. Exchangeable random permutations offer flexible priors for a wide range of inferential problems where the parameter of interest is a permutation, including statistical graph matching and unmatched regression. As an application, we develop a Bayesian model for graph matching that integrates a correlated stochastic block model with an edge-discrepancy likelihood. The cycle structure of the matching permutation is linked to latent node partitions that explain connectivity patterns—an assumption consistent with the homogeneity requirement underlying the graph matching task itself. This structural alignment not only grounds the model statistically but also enhances the mixing behavior of the sampling algorithm. Posterior inference is performed through a node-wise blocked Gibbs sampler directly inspired by the proposed sequential construction, allowing coherent updates in the complex permutation space.
To summarize posterior uncertainty, we introduce perSALSO, an adaptation of the SALSO algorithm to the permutation domain that provides principled point estimation and interpretable posterior summaries. Together, these contributions establish a unified probabilistic framework for modeling, inference, and uncertainty quantification over permutations.
Multivariate species sampling processes
11th Floor Lecture Hall
Speaker
Antonio Lijoi, Bocconi University
Session Chair
David Dahl, Brigham Young University
Abstract
Species sampling processes provide a cornerstone for random discrete distributions and exchangeable sequences. Yet, when analyzing data from distinct, though related, sources, a broader notion of probabilistic invariance is required, and partial exchangeability represents the natural choice. Over the past two decades many dependent nonparametric priors have been proposed in this setting, including hierarchical, nested, and additive processes. However, a unifying framework remains lacking.
We address this by introducing multivariate species sampling processes, a general class of nonparametric priors that encompasses most existing constructions. They are characterized by their partially exchangeable partition probability function, encoding the induced multivariate clustering structure. We establish their core distributional properties and analyze their dependence structure, demonstrating that borrowing of information across groups is entirely determined by shared ties. This yields new insights into their learning mechanisms, including a principled explanation of the correlation structure induced by existing models.
Besides providing a cohesive theoretical foundation, our approach serves as a constructive basis for designing new models aimed at capturing even richer dependence structures beyond the framework of multivariate species sampling processes.
Coffee Break
11th Floor Collaborative Space
Repulsive Mixture Model with Projection Determinantal Point Process
11th Floor Lecture Hall
Speaker
Mario Beraha, University of Milano Bicocca
Session Chair
Luis Nieto-Barajas, ITAM
Abstract
In many scientific domains, clustering aims to reveal interpretable latent structure that reflects relevant subpopulations or processes. Widely used Bayesian mixture models for model-based clustering often produce overlapping or redundant components because priors on cluster locations are specified independently, hindering interpretability. To mitigate this, repulsive priors have been proposed to encourage well-separated components, yet existing approaches face both computational and theoretical challenges. We introduce a fully tractable Bayesian repulsive mixture model by assigning a projection Determinantal Point Process (DPP) prior to the component locations. Projection DPPs induce strong repulsion and allow exact sampling, enabling parsimonious and interpretable posterior clustering. Leveraging their analytical tractability, we derive closed-form posterior and predictive distributions. These results, in turn, enable two efficient inference algorithms: a conditional Gibbs sampler and the first fully implementable marginal sampler for DPP-based mixtures. We also provide strong frequentist guarantees, including posterior consistency for density estimation, elimination of redundant components, and contraction of the mixing measure. Simulation studies confirm superior mixing and clustering performance compared to alternatives in misspecified settings. Finally, we demonstrate the utility of our method on event-related potential functional data, where it uncovers interpretable neuro-cognitive subgroups. Our results support the projection DPP mixtures as a theoretically sound and practically effective solution for Bayesian clustering.
Neural Network Gaussian Processes for Multiplex Networks: Joint Modeling of Dynamics and Attributes under Partial Observation
11th Floor Lecture Hall
Speaker
Sharmistha Guha, Texas A&M University
Session Chair
Luis Nieto-Barajas, ITAM
Abstract
Terrorism networks are dynamic, multiplex, and often partially observed, demanding uncertainty-aware inference. This talk presents Dynamic Joint Learner, a Bayesian framework that jointly models the co-evolution of multiplex layers and node attributes using shared, time-varying latent factors. These latent trajectories are governed by neural network Gaussian processes, combining deep-network expressiveness with principled uncertainty propagation. The method supports predictive inference on hidden links, evolving organizational attributes (size, ideology, leadership, operational capacity), and emergent communities, including friend-foe structures. Simulation studies and an application to interactions among prominent terrorist organizations show improved performance over existing approaches for link prediction, attribute forecasting, and clustering, with calibrated uncertainty. The framework offers a practical toolkit for analysts working with partially observed, co-evolving networks and is broadly applicable beyond counter-terrorism.
Friday, January 16, 2026
Data-Driven DRO and Economic Decision Theory: An Analytical Synthesis With Bayesian Nonparametric Advancements
11th Floor Lecture Hall
Speaker
Nhat Ho, University of Texas at Austin
Session Chair
Sameer Deshpande, University of Wisconsin-Madison
Abstract
We develop an analytical synthesis that bridges data-driven Distributionally Robust Optimization (DRO) and Economic Decision Theory under Ambiguity (DTA). By reinterpreting standard regularization and DRO techniques as data-driven counterparts of ambiguity-averse decision models, we provide a unified framework that clarifies their intrinsic connections. Building on this synthesis, we propose a novel DRO approach that leverages a popular DTA model of smooth ambiguity-averse preferences together with tools from Bayesian nonparametric statistics. Our baseline framework employs Dirichlet Process (DP) posteriors, which naturally extend to heterogeneous data sources via Hierarchical Dirichlet Processes (HDPs), and can be further refined to induce outlier robustness through a procedure that selectively filters poorly-fitting observations during training. Theoretical performance guarantees and convergence results, together with extensive simulations and real-data experiments, illustrate the method’s favorable performance in terms of prediction accuracy and stability.
Hierarchical Bayesian Inference with Transformers: Approximation Theory and Learned Representations
11th Floor Lecture Hall
Speaker
Sergio Bacallado, University of Cambridge
Session Chair
Sameer Deshpande, University of Wisconsin-Madison
Abstract
Transformers trained on sequential prediction tasks exhibit "in-context learning", the ability to adapt to new tasks at inference time given only a sequence of examples. While recent work suggests these models can simulate specific learning algorithms, the precise mechanisms remain opaque. In this talk, I will investigate this phenomenon in a controlled setting where the training data is generated by a Hierarchical Gaussian Process (HGP). In this regime, the ideal in-context learner is the posterior predictive functional Psi, which maps the context dataset and a query point to the predictive density.
First, I will discuss a theoretical framework for bounding the approximation error between a Transformer and the target functional Psi. I will outline how spectral properties of the kernel family and a covering number for the hyperparameter space govern the required network capacity, determined by architectural parameters such as depth, width, and attention head count. Second, I will present preliminary empirical evidence that Transformers trained on a prequential objective naturally recover structures aligned with these theoretical constructions. We analyze architectures where context inputs are pre-encoded by an MLP and evaluate the trained encoder's ability to represent the underlying kernel family over the space of hyperparameters. By minimizing the approximation error between the true kernel matrix and a linear reconstruction based on the encoded features, and evaluating a range of approximation metrics, we observe scenarios where the encoder learns a feature map capable of linearly representing the kernel uniformly over the hyperparameter space.
Coffee Break
11th Floor Collaborative Space
Quantile Slice Sampling
11th Floor Lecture Hall
Speaker
Matthew Heiner, Brigham Young University
Session Chair
Alessandro Zito, Harvard School of Public Health
Abstract
We propose and demonstrate a novel, effective approach to simple slice sampling. Using the probability integral transform, we first generalize Neal's shrinkage algorithm, standardizing the procedure to an automatic and universal starting point: the unit interval. This enables the introduction of approximate (pseudo-) targets through a factorization used in importance sampling, a technique that has popularized elliptical slice sampling. Reasonably accurate pseudo-targets can boost sampler efficiency by requiring fewer rejections and by reducing target skewness. This strategy is effective when a natural, possibly crude approximation to the target exists. Alternatively, obtaining a marginal pseudo-target from initial samples provides an intuitive and automatic tuning procedure. We consider pseudo-target specification and interpretable diagnostics. We examine performance of the proposed sampler relative to other popular, easily implemented MCMC samplers on standard targets in isolation, and as steps within a Gibbs sampler in a Bayesian modeling context. We prospectively extend to multivariate slice samplers that target large discrete spaces commonly encountered in Bayesian nonparametrics. R package qslice is available on CRAN.
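The probability integral transform step can be made explicit (our notation): if \pi is the target and \hat\pi a pseudo-target with CDF \hat F, strictly increasing on the target's support, then sampling X \sim \pi is equivalent to sampling U = \hat F(X) on the unit interval from the transformed density

h(u) \propto \frac{\pi\big(\hat F^{-1}(u)\big)}{\hat\pi\big(\hat F^{-1}(u)\big)}, \qquad u \in (0, 1),

so the closer the pseudo-target is to the target, the flatter h becomes and the fewer shrinkage rejections the slice sampler needs; a standard univariate slice update on h, mapped back through \hat F^{-1}, yields draws from \pi.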
Leveraging External Data for Testing Experimental Therapies with Biomarker Interactions in Randomized Clinical Trials
11th Floor Lecture Hall
Speaker
Lorenzo Trippa, Harvard University
Session Chair
Alessandro Zito, Harvard School of Public Health
Abstract
In oncology, the efficacy of novel therapeutics often differs across patient subgroups, and these variations are difficult to predict during the initial phases of the drug development process. The relation between the power of randomized clinical trials and heterogeneous treatment effects has been discussed by several authors. In particular, false negative results are likely to occur when the treatment effects concentrate in a subpopulation but the study design did not account for potential heterogeneous treatment effects. The use of external data from completed clinical studies and electronic health records has the potential to improve decision-making throughout the development of new therapeutics, from early-stage trials to registration. Here we discuss the use of external data to evaluate experimental treatments with potential heterogeneous treatment effects. We introduce a permutation procedure to test, at the completion of a randomized clinical trial, the null hypothesis that the experimental therapy does not improve the primary outcomes in any subpopulation. The permutation test leverages the available external data to increase power. Also, the procedure controls the false positive rate at the desired α-level without restrictive assumptions on the external data, for example, in scenarios with unmeasured confounders, different pre-treatment patient profiles in the trial population compared to the external data, and other discrepancies between the trial and the external data. We illustrate that the permutation test is optimal according to an interpretable criterion and discuss examples based on asymptotic results and simulations, followed by a retrospective analysis of individual patient-level data from a collection of glioblastoma clinical trials.