Organizing Committee
- Ian Adelstein, Yale University
- Jeffrey Brock, Yale University
- Smita Krishnaswamy, Yale University
- Bjorn Sandstede, Brown University
Abstract
The goal of this meeting is to bring together researchers using geometric and topological methods to study data. Fields of interest include manifold learning, topological data analysis, neural networks, and machine learning. While the plan is to focus on the mathematics, applications to neuroscience and quantitative biology will also be explored.

This workshop is part of the Brown Data Science Initiative project, funded by the National Science Foundation's TRIPODS program.
Confirmed Speakers & Participants
Talks will be presented virtually or in-person as indicated in the schedule below.
- Speaker
- Poster Presenter
- Attendee
- Virtual Attendee
- Ian Adelstein, Yale University
- Erik Bergland, Brown University
- Dhananjay Bhaskar, Yale University
- Jeffrey Brock, Yale University
- Rui Ding, Stony Brook University
- Thierry Emonet, Yale University
- Anna Gilbert, Yale University
- Yannis Kevrekidis, Johns Hopkins University
- Mona Khoshnevis, Brown University
- Bulent Kokluce, Brown University
- Sudipta Kolay, ICERM
- Smita Krishnaswamy, Yale University
- Alice Kwon, SUNY Maritime College
- Roy Lederman, Yale University
- Tao Liu, Brown University
- John Murray, Yale University
- Jennifer Paige, Swarthmore College
- Michael Perlmutter, University of California, Los Angeles
- Gabriel Provencher Langlois, Brown University
- Bjorn Sandstede, Brown University
- Wai Shing Tang, Brown University
- Ying Hong Tham, Albert Einstein College of Medicine
- Dawson Thomas, Yale University
- Moyi Tian, Brown University
- Alexandria Volkening, Purdue University
- Tony Wong, ICERM
- Joon-Hyeok Yim, Yale University
- Kisung You, Yale University
- William Zhang, Brown University
- Sarah Zhao, Yale University
- Wenjun Zhao, Brown University
Workshop Schedule
Thursday, December 16, 2021
-
8:45 - 9:00 am EST: Welcome (11th Floor Lecture Hall)
- Jeffrey Brock, Yale University
- Bjorn Sandstede, Brown University
-
9:00 - 9:45 am EST: Geometry of Molecular Conformations in Cryo-EM (11th Floor Lecture Hall)
- Speaker
- Roy Lederman, Yale University
- Session Chair
- Jeffrey Brock, Yale University
Abstract
Cryo-Electron Microscopy (cryo-EM) is an imaging technology that is revolutionizing structural biology. Cryo-electron microscopes produce many very noisy two-dimensional projection images of individual frozen molecules; unlike related methods, such as computed tomography (CT), the viewing direction of each particle image is unknown. The unknown directions and extreme noise make the determination of the structure of molecules challenging. While other methods for structure determination, such as x-ray crystallography and NMR, measure ensembles of molecules, cryo-electron microscopes produce images of individual particles. Therefore, cryo-EM could potentially be used to study mixtures of conformations of molecules. We will discuss a range of recent methods for analyzing the geometry of molecular conformations using cryo-EM data.
-
10:00 - 10:30 am EST: Coffee Break (11th Floor Collaborative Space)
-
11:00 - 11:45 am EST: Geometric and Topological Approaches to Representation Learning in Biomedical Data (11th Floor Lecture Hall)
- Speaker
- Smita Krishnaswamy, Yale University
- Session Chair
- Jeffrey Brock, Yale University
Abstract
High-throughput, high-dimensional data has become ubiquitous in the biomedical sciences as a result of breakthroughs in measurement technologies and data collection. While these large datasets containing millions of observations of cells, people, or brain voxels hold great potential for understanding the generative state space of the data, as well as drivers of differentiation, disease, and progression, they also pose new challenges in terms of noise, missing data, measurement artifacts, and the so-called "curse of dimensionality." In this talk, I will cover geometric and topological approaches to understanding the shape and structure of data. First, we show how diffusion geometry and deep learning can be used to obtain representations of the data that enable denoising and dimensionality reduction. Next, we show how to combine diffusion geometry with topology to extract multi-granular features from the data to assist in differential and predictive analysis. On the flip side, we also create a manifold geometry from topological descriptors and show its applications to neuroscience. Finally, we will show how to learn dynamics from static snapshot data by using a manifold-regularized neural ODE-based optimal transport. Together, these form a complete framework for exploratory and unsupervised analysis of big biomedical data.
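As a small illustration of the diffusion-geometry machinery mentioned in the abstract, here is a minimal numpy sketch of the classical diffusion-map embedding (the function name and parameters are illustrative, not from the talk):

```python
import numpy as np

def diffusion_map(X, eps=1.0, n_components=2, t=1):
    """Classical diffusion map: Gaussian affinities, row normalization to a
    Markov (diffusion) operator, then the leading non-trivial eigenvectors."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)  # pairwise squared distances
    K = np.exp(-sq / eps)                       # Gaussian kernel
    P = K / K.sum(axis=1, keepdims=True)        # row-stochastic diffusion operator
    vals, vecs = np.linalg.eig(P)               # P is non-symmetric, so take real parts
    order = np.argsort(-vals.real)
    vals, vecs = vals.real[order], vecs.real[:, order]
    # drop the trivial constant eigenvector (eigenvalue 1)
    return vecs[:, 1:n_components + 1] * (vals[1:n_components + 1] ** t)
```

The diffusion time `t` trades off local versus global geometry; larger `t` suppresses small eigenvalues and hence fine-scale structure.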
-
12:00 - 12:10 pm EST: Group Photo, immediately after talk (11th Floor Lecture Hall)
-
12:10 - 1:30 pm EST: Lunch/Free Time
-
1:30 - 2:15 pm EST: Metric Repair (11th Floor Lecture Hall)
- Speaker
- Anna Gilbert, Yale University
- Session Chair
- Bjorn Sandstede, Brown University
Abstract
Metric embeddings are key algorithmic and mathematical techniques in applied mathematics and approximation algorithms, and their adaptations are ubiquitous in machine learning. They are used to embed one metric space into another with the hope of revealing hidden structure or reducing the dimension of a data set. Examples include the random projection of a set of points in high dimensions to a lower dimension and the embedding of a graph into a tree-like structure. The fundamental limitation with the application of metric embeddings to machine learning is that their use in data analysis is predicated upon the input data coming from a metric space. Real data, however, do not necessarily conform to a metric; they are messy. The fundamental problem in our research program is metric repair: given a set of input distances, adjust them so that they conform to a metric.
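For the decrease-only variant of metric repair, shrinking distances until every triangle inequality holds is exactly an all-pairs shortest-path computation, since shortest-path distances form the largest metric that lies pointwise below the input. A minimal numpy sketch of that variant (function name illustrative, not from the talk):

```python
import numpy as np

def decrease_only_repair(D):
    """Decrease-only metric repair: lower entries until the triangle
    inequality holds. Floyd-Warshall relaxation returns the largest
    metric that is pointwise <= the input dissimilarities."""
    D = np.array(D, dtype=float)
    n = D.shape[0]
    for k in range(n):
        # relax every pair (i, j) through the intermediate point k
        D = np.minimum(D, D[:, [k]] + D[[k], :])
    return D
```

The general problem treated in the talk (allowing both increases and decreases, or minimizing the number of changed entries) is substantially harder than this special case.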
-
2:30 - 3:00 pm EST: Coffee Break (11th Floor Collaborative Space)
-
3:00 - 3:45 pm EST: From Questionnaires to PDEs: Dynamics and Emergent Models from Disorganized Data (11th Floor Lecture Hall)
- Virtual Speaker
- Yannis Kevrekidis, Johns Hopkins University
- Session Chair
- Bjorn Sandstede, Brown University
Abstract
Starting with sets of disorganized observations of spatiotemporally evolving systems obtained at different (also disorganized) sets of parameters, we demonstrate the data-driven derivation of generative, parameter-dependent, evolutionary partial differential equation models of the data. We know which observations were made at the same physical location, the same time, or the same set of parameter values - knowing neither where the physical location is, nor when the temporal moment is, nor what the parameter values are; this tensor type of data is reminiscent of shuffled (multi-)puzzle tiles.
The independent variables for the evolution equations (their "space" and "time") as well as their effective parameters are all "emergent", i.e. determined in a data-driven way from our disorganized observations of behavior in them.
We use a diffusion-map-based "questionnaire" approach to build a parametrization of our emergent space for the data. This approach iteratively processes the data by successively observing them on the "space", "time", and "parameter" axes of a tensor. Once the data are organized, we use neural-network-based learning to approximate the operators governing the evolution equations in this emergent space. Our illustrative example is based on a previously developed vertex-plus-signaling model of Drosophila embryonic development. This allows us to discuss features of the process like symmetry breaking, translational invariance of the emergent PDE model, and interpretability.
-
4:00 - 4:45 pm EST: Topological Data Analysis of Zebrafish Patterns (11th Floor Lecture Hall)
- Virtual Speaker
- Alexandria Volkening, Purdue University
- Session Chair
- Bjorn Sandstede, Brown University
Abstract
Self-organization is present at many scales in biology, and here I will focus specifically on elucidating how brightly colored cells interact to form skin patterns in zebrafish. Wild-type zebrafish are named for their dark and light stripes, but mutant zebrafish feature variable skin patterns, including spots and labyrinth curves. All of these patterns form as the fish grow due to the interactions of tens of thousands of pigment cells, making agent-based modeling a natural approach for describing pattern formation. By identifying cell interactions that may change to create mutant patterns, my long-term goal is to help link genes, cell behavior, and visible animal characteristics in fish. However, agent-based models are stochastic and have many parameters, so comparing simulated patterns and fish images is often a qualitative process. Developing analytically tractable continuum models from agent-based systems is one means of addressing these challenges and better understanding the roles of different parameters in pattern formation. Alternatively, methods from topological data analysis can be applied to cell-based systems directly. In this talk, I will give an overview of our models and present quantitative comparisons of in silico and in vivo cell-based patterns using our topological methods.
-
5:00 - 6:00 pm EST: Reception (11th Floor Collaborative Space)
Friday, December 17, 2021
-
9:00 - 9:45 am EST: Robust and Scalable Learning of Gaussian Mixture Models (11th Floor Lecture Hall)
- Speaker
- Kisung You, Yale University
- Session Chair
- Ian Adelstein, Yale University
Abstract
A Gaussian mixture model (GMM) is one of the most widely used methods in both the machine learning and statistics communities for probabilistic clustering and density estimation. Estimation of the model is usually carried out by an expectation-maximization (EM)-like algorithm. When the sample size is large, however, the EM algorithm may not be a convenient option due to the growth in computational cost. In this talk, I present a divide-and-conquer approach with minimal communication that resolves this problem by working with a Hilbertian structure on GMMs induced by kernel embedding of Gaussian measures. This is done by estimating multiple models on independent subsets of the data and aggregating them into a single GMM via the geometric median in the Hilbert space, which guarantees robustness of the estimate under mild conditions. Once the estimate is obtained, it may still contain redundant components, so that the resulting clustering is not meaningful and the interpretation of each component becomes difficult. Based on this observation, two postprocessing strategies for model reduction and clustering characterization are proposed.
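The robustness of the aggregation step comes from the geometric median. As a toy illustration, here is Weiszfeld's algorithm in plain Euclidean space (the talk works in a kernel-embedding Hilbert space of GMMs instead; the function name and parameters are illustrative):

```python
import numpy as np

def geometric_median(points, n_iter=100, tol=1e-8):
    """Weiszfeld's algorithm: an iteratively re-weighted mean that converges
    to the point minimizing the sum of Euclidean distances to the inputs."""
    y = points.mean(axis=0)                    # start at the centroid
    for _ in range(n_iter):
        d = np.linalg.norm(points - y, axis=1)
        d = np.maximum(d, tol)                 # guard against division by zero
        w = 1.0 / d
        y_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < tol:
            break
        y = y_new
    return y
```

Unlike the mean, the median stays near the bulk of the inputs even when a minority of subset estimates are badly corrupted, which is the source of the robustness guarantee mentioned above.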
-
10:00 - 10:30 am EST: Coffee Break (11th Floor Collaborative Space)
-
11:00 - 11:45 am EST: Characterizing Transitions in Developmental Biology using Topological Machine Learning (11th Floor Lecture Hall)
- Speaker
- Dhananjay Bhaskar, Yale University
- Session Chair
- Ian Adelstein, Yale University
Abstract
I will present ongoing work applying topological data analysis (TDA) and machine learning to identify transitions in cell organization and cell state within the context of developmental biology. First, using cell positions obtained from agent-based simulations of cell sorting and skin pigmentation, the complex relationship between cell-cell interactions and emergent patterns is discovered automatically via unsupervised classification of persistence images. This approach is used to analyze phase transitions in proliferating, heterogeneous populations and is found to be empirically robust to random perturbations and finite-size effects. Next, I will discuss challenges associated with TDA of high-dimensional single-cell sequencing datasets. In particular, the lack of suitable techniques for intrinsic dimension and curvature estimation limits the use of multi-parameter filtrations as a tool for understanding these data. I will briefly outline a novel approach to this problem, using graph diffusion probabilities to predict curvature on toy data consisting of points sampled from quadric surfaces.
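A persistence image, as used above for unsupervised classification, maps each (birth, death) pair of a diagram to birth-persistence coordinates, weights it, and smooths it with a Gaussian on a fixed grid, producing a vector a standard classifier can consume. A minimal numpy sketch (grid size, weighting, and names are illustrative assumptions, not the speaker's code):

```python
import numpy as np

def persistence_image(diagram, res=20, sigma=0.1, extent=(0, 1)):
    """Rasterize a persistence diagram: for each (birth, death) pair, place a
    Gaussian at (birth, persistence), weighted linearly by persistence."""
    lo, hi = extent
    xs = np.linspace(lo, hi, res)
    img = np.zeros((res, res))
    for b, d in diagram:
        p = d - b                              # persistence of the feature
        gx = np.exp(-(xs - b) ** 2 / (2 * sigma ** 2))   # birth axis (columns)
        gy = np.exp(-(xs - p) ** 2 / (2 * sigma ** 2))   # persistence axis (rows)
        img += p * np.outer(gy, gx)            # linear weighting by persistence
    return img
```

The persistence weighting downweights short-lived features near the diagonal, which are typically noise, and is what makes the representation stable.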
-
12:00 - 1:15 pm EST: Lunch/Free Time
-
1:15 - 2:00 pm EST: Geometry of Neural Representations Shapes Multi-Task Function in Neural Networks and Humans (11th Floor Lecture Hall)
- Speaker
- John Murray, Yale University
- Session Chair
- Smita Krishnaswamy, Yale University
Abstract
Flexible cognitive behavior requires the ability to learn and perform a diversity of tasks without detrimental interference. What are the geometric properties of neural representations that support multi-task learning and function? In this talk I will present recent and ongoing studies integrating computational modeling and empirical data to link the representational geometry of neural networks to cognitive function.
-
2:15 - 2:45 pm EST: Coffee Break (11th Floor Collaborative Space)
-
2:45 - 3:30 pm EST: Connecting Molecules to Individual Cell Behavior to Emergent Collective Behavior (11th Floor Lecture Hall)
- Speaker
- Thierry Emonet, Yale University
- Session Chair
- Smita Krishnaswamy, Yale University
Abstract
Cells live in communities where they interact with each other and their environment. By coordinating individuals, such interactions often result in collective behavior and function that emerge on scales larger than the individuals and are beneficial to the population. At the same time, populations of individuals, even isogenic ones, display phenotypic heterogeneity, which diversifies individual behavior and enhances the resilience of the population in unexpected situations. This raises a dilemma: although individuality provides advantages, it also tends to reduce coordination. I will discuss our experimental and theoretical efforts that use bacterial chemotaxis as a model system to understand the origin of individual cellular behavior and performance, and how populations of cells reconcile individuality with group behavior to operate robustly in multiple environments. Bacterial chemotaxis is one of the best-understood model systems in all of biology. As such, it enables us to examine both experimentally and theoretically how dynamical interactions at one scale give rise to structure and function at the next (larger) scale. It is thus a great testbed for novel mathematical methods for studying data.
-
3:45 - 4:30 pm EST: Geometric Scattering and Applications (11th Floor Lecture Hall)
- Speaker
- Michael Perlmutter, University of California, Los Angeles
- Session Chair
- Smita Krishnaswamy, Yale University
Abstract
The scattering transform is a mathematical model of convolutional neural networks (CNNs) introduced for functions defined on Euclidean space by Stéphane Mallat. It differs from traditional CNNs by using predesigned wavelet filters rather than filters learned from training data. This leads to a network that provably has stability and invariance guarantees. Moreover, in situations where the wavelets can be designed to correspond to the underlying physics, it can produce very good numerical results. The rise of geometric deep learning motivated the introduction of geometric scattering transforms for data sets modeled as graphs or manifolds. These networks use wavelets constructed from the spectral decomposition of an appropriate Laplacian operator or via polynomials of a diffusion operator. In my talk, I will discuss applications of these networks to a variety of geometric deep learning tasks and show that they have stability and invariance guarantees analogous to those of their Euclidean predecessor. I will then discuss modifications of the graph scattering transform that increase numerical performance, as well as work using the graph scattering transform as the front end of an encoder-decoder network for molecule generation.
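A first-order geometric scattering pass with diffusion wavelets W_j = P^(2^(j-1)) - P^(2^j), built from the lazy random walk P, can be sketched in a few lines. This is a minimal numpy illustration of the construction, not the speaker's implementation:

```python
import numpy as np

def graph_scattering(A, x, J=3):
    """First-order geometric scattering on a graph with adjacency matrix A:
    diffusion wavelets at dyadic scales, then modulus and node averaging."""
    d = A.sum(axis=1)
    P = 0.5 * (np.eye(len(A)) + A / d[:, None])   # lazy random walk (row-stochastic)
    feats = [x.mean()]                            # zeroth-order coefficient
    Pk = P.copy()                                 # holds P^(2^(j-1)), starts at P^1
    for j in range(1, J + 1):
        P2k = Pk @ Pk                             # P^(2^j)
        W = Pk - P2k                              # wavelet capturing scale 2^j
        feats.append(np.abs(W @ x).mean())        # |W_j x| averaged over nodes
        Pk = P2k
    return np.array(feats)
```

Since each wavelet is a difference of row-stochastic powers, constant signals are annihilated, which is one ingredient of the stability guarantees discussed in the talk.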
All event times are listed in ICERM local time in Providence, RI (Eastern Standard Time / UTC-5).