Organizing Committee
Abstract

This workshop focuses on the role of random matrices in data science, machine learning, and theoretical computer science.

Playing a significant role in modern data science, random matrices provide an elegant way to represent both (a) the data and (b) the way we process it. To give an example of (a), the classical model of high-dimensional data is a set of points drawn from a certain distribution in a high-dimensional space which can be represented as a random matrix. An even more natural example is that most data in practice is noisy, so it can be represented as a deterministic matrix plus a random noise, which is a random matrix with a non-zero mean. An example of (b) is data compression, which can be realized by applying a random matrix of smaller sizes to the data in (a), thereby reducing its dimensions. Another example is data completion, where we need to reconstruct data (in the form of a matrix) from a random sub-matrix, given that the original matrix satisfies certain structural constraints (such as being low rank).

To deal with these problems, we need to develop techniques beyond the scope of standard random matrix theory. For example, while the sensitivity of matrix spectrum to small random perturbations is extensively studied in the mean field regime, the problem in the setting of structured random matrices, which allows for more adequate modeling of real-world data, is wide open. As another example, the core mechanism of an artificial neural network is a composition of linear and nonlinear transformations, leading to nonlinear generalizations of random matrices. Our understanding of mathematical principles behind examples like this is still in its infancy.

The goal is to discuss a number of hot topics in these areas, focusing on recent theoretical progress and the potential pool of applications, in a manner that is accessible to a wide audience, including researchers in data science, electrical engineering, statistics, and numerical analysis.

Image for "Random Matrices and Applications"

Confirmed Speakers & Participants

Talks will be presented virtually or in-person as indicated in the schedule below.

  • Speaker
  • Poster Presenter
  • Attendee
  • Virtual Attendee
  • Elie Alhajjar
    RAND
  • Lucas Benigni
    Université de Montréal
  • Tatiana Brailovskaya
    Princeton University
  • Jonathan Chavez Casillas
    University of Rhode Island
  • Qianyong Chen
    UMass Amherst
  • Shabarish Chenakkod
    University of Michigan
  • Hongmei Chi
    Florida A&M University
  • Julia E Grigsby
    Boston College
  • Alperen Ergur
    UTSA
  • Zhou Fan
    Yale University
  • Yiyun He
    University of California, Irvine
  • Vishesh Jain
    University of Chicago, Illinois
  • Sungwoo Jeong
    Cornell University
  • Han Le
    University of Michigan
  • Yuqi Lei
    Brown University
  • Sivan Leviyang
    Georgetown University
  • Weiyu Li
    Harvard University
  • Shuangping Li
    Stanford University
  • Wenjian Liu
    City University of New York
  • Ji Meng Loh
    New Jersey Institute of Technology
  • Jackie Lok
    Princeton University
  • Hengrui Luo
    Lawrence Berkeley National Laboratory/ Rice University
  • renyuan ma
    Yale University
  • Benjamin McKenna
    Harvard University
  • Andrea Montanari
    Stanford University
  • Raj Nadakuditi
    University of Michigan
  • Hoi Nguyen
    Ohio State University
  • Oanh Nguyen
    Brown University
  • Mihai Nica
    University of Guelph
  • Petar Nizic-Nikolac
    ETH Zurich
  • Sean O'Rourke
    University of Colorado Boulder
  • Aaradhya Pandey
    Princeton University
  • Jing Qin
    University of Kentucky
  • Liza Rebrova
    Princeton
  • Daniel Reichman
    Worcester Polytechnic Institute, Worcester, MA
  • Elad Romanov
    Stanford University
  • Julian Sahasrabudhe
    University of Cambridge
  • Mehtaab Sawhney
    MIT
  • Calum Shearer
    University of Colorado Boulder
  • Paul Simanjuntak
    Texas A&M University
  • Youngtak Sohn
    MIT
  • Reed Spitzer
    Brown University
  • JINGYE TAN
    Cornell University
  • Konstantin Tikhomirov
    Carnegie Mellon University
  • Linh Tran
    Yale University
  • Phuc Tran
    Yale University
  • Alexander Van Werde
    Eindhoven University of Technology
  • Roman Vershynin
    University of California, Irvine
  • Van Vu
    Yale University
  • Xiong Wang
    Johns Hopkins University
  • Haixiao Wang
    University of California, San Diego
  • Ke Wang
    Hong Kong University of Science and Technology
  • Jingheng Wang
    Ohio State University
  • Zhichao Wang
    University of California San Diego
  • Lu WEI
    Texas Tech University
  • Garrett Wen
    Yale University
  • Haishen Yao
    The City University of New York, Queensborough Community College
  • Ziao Zhang
    Brown University
  • Yifan Zhang
    University of Texas at Austin
  • Yizhe Zhu
    University of California Irvine

Workshop Schedule

Monday, May 20, 2024
  • 8:50 - 9:00 am EDT
    Welcome
    11th Floor Lecture Hall
    • Session Chair
    • Brendan Hassett, ICERM/Brown University
  • 9:00 - 9:15 am EDT
    Opening Remarks
    11th Floor Lecture Hall
    • Oanh Nguyen, Brown University
    • Hoi Nguyen, Ohio State University
    • Konstantin Tikhomirov, Carnegie Mellon University
    • Roman Vershynin, University of California, Irvine
    • Van Vu, Yale University
  • 9:15 - 10:15 am EDT
    Matrix invertibility and applications.
    Survey - 11th Floor Lecture Hall
    • Speaker
    • Konstantin Tikhomirov, Carnegie Mellon University
    • Session Chair
    • Van Vu, Yale University
  • 10:15 - 10:45 am EDT
    Coffee Break
    11th Floor Collaborative Space
  • 10:45 - 11:25 am EDT
    The limiting spectral law of sparse random matrices
    11th Floor Lecture Hall
    • Speaker
    • Julian Sahasrabudhe, University of Cambridge
    • Session Chair
    • Van Vu, Yale University
    Abstract
    Let A be and n by n matrix with iid entries which take 1 with probability p and 0 otherwise. In this talk we will be interested in the distribution of the complex eigenvalues of A as n tends to infinity. Due to an impressive sequence of results, today we know that whenever pn tends to infinity and A is appropriately normalized, the limiting spectral distribution tends to the ""circular law"" and n tends to infinity. However the situation for p = d/n, d fixed, has remained somewhat more mysterious and, in fact, it was not even known if a deterministic limiting law even exists. In this talk I will discuss the recent resolution of this problem, showing that such random matrices have a deterministic limiting spectral distribution. This talk is based on joint work with Ashwin Sah and Mehtaab Sawhney.
  • 11:35 am - 12:15 pm EDT
    Improved very sparse matrix completion via an "asymmetric randomized SVD"
    11th Floor Lecture Hall
    • Speaker
    • Raj Rao, University of Michigan
    • Session Chair
    • Van Vu, Yale University
    Abstract
    We consider the matrix completion problem in the very sparse regime where, on average, a constant number of entries of the matrix are observed per row (or column). In this very sparse regime, we cannot expect to have perfect recovery and the celebrated nuclear norm based matrix completion fails because the singular value decomposition (SVD) of the underlying very sparse matrix completely breaks down. We demonstrate that it is indeed possible to reliably recover the matrix. The key idea is the use of a ""randomized asymmetric SVD"" to find informative singular vectors in this regime in a way that the SVD cannot. We provide sharp theoretical analysis of the phenomenon, the lower limits of statistical recovery and demonstrate the efficacy of the new method using simulations.
  • 12:25 - 12:30 pm EDT
    Group Photo (Immediately After Talk)
    11th Floor Lecture Hall
  • 12:30 - 2:00 pm EDT
    Lunch/Free Time
  • 2:00 - 2:40 pm EDT
    Universality of Spectral Independence with Applications to Fast Mixing in Spin Glasses
    11th Floor Lecture Hall
    • Speaker
    • Vishesh Jain, University of Chicago, Illinois
    • Session Chair
    • Roman Vershynin, University of California, Irvine
    Abstract
    We study the Glauber dynamics for sampling from distributions on the n-dimensional discrete Boolean hypercube. Techniques based on the powerful recent notion of spectral independence have successfully yielded optimal O(n) relaxation times for a host of different distributions. We show that spectral independence is universal: a relaxation time of O(n) implies spectral independence. As an application, we obtain approximate tensorization of entropy and the optimal mixing time for the classically studied mixed p-spin model at sufficiently high temperature. Joint work with Nima Anari, Frederic Koehler, Huy Tuan Pham, and Thuy-Duong Vuong.
  • 2:50 - 3:30 pm EDT
    Extreme eigenvalues of random Laplacian matrices
    11th Floor Lecture Hall
    • Virtual Speaker
    • Sean O'Rourke, University of Colorado Boulder
    • Session Chair
    • Roman Vershynin, University of California, Irvine
    Abstract
    Many properties of a graph can be deduced from the eigenvalues and eigenvectors of its Laplacian matrix. Because of this, the Laplacian matrix plays a key role in spectral graph theory and is used in algorithms for graph clustering, network control theory, and combinatorial optimization. This talk focuses on the behavior of the extreme eigenvalues of a random Laplacian matrix with Gaussian entries. We show that, after shifting and scaling, the largest eigenvalue converges to a Gumbel distribution as the size of the matrix tends to infinity. The proof uses techniques from random matrix theory and free probability. This talk is based on joint work with Andrew Campbell and Kyle Luh.
  • 3:40 - 4:10 pm EDT
    Coffee Break
    11th Floor Collaborative Space
  • 4:10 - 4:50 pm EDT
    Analysis of singular subspaces under random perturbations
    11th Floor Lecture Hall
    • Speaker
    • Ke Wang, Hong Kong University of Science and Technology
    • Session Chair
    • Roman Vershynin, University of California, Irvine
    Abstract
    Matrix perturbation theory is a foundational subject across multiple disciplines, including probability, statistics, machine learning, and applied mathematics. Perturbation bounds play a critical role in quantifying the impact of small noise on matrix spectral parameters in various applications such as matrix completion, principal component analysis, and community detection. In this talk, we focus on the additive perturbation model, where a low-rank data matrix is perturbed by Gaussian noise. We provide a comprehensive analysis of the singular subspaces, extending the classical Davis-Kahan-Wedin theorem and offering fine-grained analysis of the singular vector matrices. Also, we will show the practical implications of our perturbation results, specifically highlighting their application in Gaussian mixture models. The talk is based on joint works with Sean O'Rourke and Van Vu.
  • 5:00 - 6:30 pm EDT
    Reception
    11th Floor Collaborative Space
Tuesday, May 21, 2024
  • 9:00 - 10:00 am EDT
    A new look at some basic perturbation bounds
    Survey - 11th Floor Lecture Hall
    • Speaker
    • Van Vu, Yale University
    • Session Chair
    • Konstantin Tikhomirov, Carnegie Mellon University
    Abstract
    In applications, we often need perturbation bounds for the leading spectral parameters of a matrix, such as the leading eigenvectors, eigenvalues, eigensubspaces or low rank approximation. In this talk, we are going to discuss a new kind of perturbation bounds, which would improve significantly classical bounds such as Davis-Kahan or Weyl, given that the spectrum of the truth matrix decays in a reasonable manner. (based on joined works with S. O'rourke, K. Wang, P. Tran)
  • 10:00 - 10:30 am EDT
    Coffee Break
    11th Floor Collaborative Space
  • 10:30 - 11:10 am EDT
    Improved Principal Component Analysis for Sample Covariance Matrix
    11th Floor Lecture Hall
    • Speaker
    • Phuc Tran, Yale University
    • Session Chair
    • Konstantin Tikhomirov, Carnegie Mellon University
    Abstract
    Using the sample covariance to estimate the principal components of a hidden covariance matrix is a central problem in statistics and data science. We view this as a perturbation problem where the noise is a zero mean random matrix which is the difference between the sample covariance matrix and the truth. In certain special settings, it can also be viewed as a deformed random matrix model. In this talk, using recent progress on matrices with random perturbation, we derive new estimates, which extend and improve several well known results in the field. (join work with V. Vu)
  • 11:20 am - 12:00 pm EDT
    On a Conjecture of Spielman–Teng
    11th Floor Lecture Hall
    • Speaker
    • Mehtaab Sawhney, MIT
    • Session Chair
    • Konstantin Tikhomirov, Carnegie Mellon University
    Abstract
    Let J be a mean 0, variance 1 random variable with subGaussian tails. Let M be an n by n matrix with iid entries J and G be an n by n matrix with iid standard normal entries. We prove that for all nonnegative x that P[s_n(M)<x/n^{1/2}] = (1+o(1))P[s_n(G)<x/n^{1/2}] + O(e^{-cn}). This resolves up to a (1+o(1)) a conjecture of Spielman and Teng regarding the distribution of the least singular value. This furthermore unifies works of Tao and Vu (on the distribution of the least singular value) and Rudelson and Vershynin (which proved the conjecture up to an absolute constant). Joint with Ashwin Sah and Julian Sahasrabudhe
  • 12:15 - 2:00 pm EDT
    Lunch/Free Time
  • 2:00 - 2:40 pm EDT
    Matrix superconcentration inequalities
    11th Floor Lecture Hall
    • Speaker
    • Tatiana Brailovskaya, Princeton University
    • Session Chair
    • Hoi Nguyen, Ohio State University
    Abstract
    One way to understand the concentration of the norm of a random matrix X with Gaussian entries is to apply a standard concentration inequality, such as the one for Lipschitz functions of i.i.d. standard Gaussian variables, which yields subgaussian tail bounds on the norm of X. However, as was shown by Tracy and Widom in 1990s, when the entries of X are i.i.d. the norm of X exhibits even sharper concentration. The phenomenon of a function of many i.i.d. variables having strictly smaller tails than those predicted by classical concentration inequalities is sometimes referred to as «superconcentration», a term originally dubbed by Chatterjee. I will discuss novel results that can be interpreted as superconcentration inequalities for the norm of X, where X is a Gaussian random matrix with independent entries and an arbitrary variance profile. We can also view our results as a nonhomogeneous extension of Tracy-Widom-type upper tail estimates for the norm of X.
  • 2:50 - 3:30 pm EDT
    Kronecker-product random matrices and a matrix least squares problem
    11th Floor Lecture Hall
    • Speaker
    • Zhou Fan, Yale University
    • Session Chair
    • Hoi Nguyen, Ohio State University
    Abstract
    We study the asymptotic spectral distribution and resolvent of a n^2-by-n^2 Kronecker-product random matrix A x I + I x B + Theta x Xi, when A, B are independent Wigner matrices and Theta, Xi are deterministic and diagonal. For fixed spectral arguments z in C^+, we establish a quantitative approximation for the Stieltjes transform by that of an approximating free operator, and a diagonal deterministic equivalent approximation for the resolvent. We further perform an entrywise analysis of the resolvent, showing that off-diagonal entries fall on two differing scales n^{-1/2} and n^{-1} depending on their locations with respect to the Kronecker structure. Our study is motivated by consideration of a matrix-valued least-squares optimization problem min_{X in R^{n x n}} \|XA+BX\|_F^2 + sum_{ij} Theta_i Xi_j X_{ij}^2 subject to a linear constraint. For random instances of this problem defined by Wigner inputs A and B, our analyses imply an asymptotic characterization of the minimizer X and its associated minimum objective value as in the limit as n -> infinity. This is joint work with Renyuan (Jack) Ma.
  • 3:40 - 4:10 pm EDT
    Coffee Break
    11th Floor Collaborative Space
  • 4:10 - 5:00 pm EDT
    Solving overparametrized systems of random equations
    Survey - 11th Floor Lecture Hall
    • Speaker
    • Andrea Montanari, Stanford University
    • Session Chair
    • Hoi Nguyen, Ohio State University
    Abstract
    I consider the problem of efficiently solving a system of n non-linear equations in R^d, where each equation is random homogeneous polynomial of arbitrary degrees. In the complex case and for n=d−1, Beltran and Pardo proved the existence of an efficient randomized algorithm and Lairez recently showed it can be de-randomized to produce a deterministic efficient algorithm. Here we consider the real setting, to which previously developed methods do not apply. We describe an algorithm that efficiently finds solutions (with high probability) for n=d−O(sqrt{dlog d}). [Based on joint work with Eliran Subag]
Wednesday, May 22, 2024
  • 9:00 - 10:00 am EDT
    Recent developments in nonlinear random matrices
    Survey - 11th Floor Lecture Hall
    • Speaker
    • Yizhe Zhu, University of California Irvine
    • Session Chair
    • Oanh Nguyen, Brown University
    Abstract
    Nonlinear random matrices have emerged as an important area of interest in statistics and machine learning. In recent years, the study of nonlinear random matrix theory has proven valuable in providing rigorous explanations for empirically observed phenomena in deep learning. From a random matrix perspective, this creates new challenges in handling nonlinearity and dependence. In this talk, I will survey recent developments in the spectral analysis of nonlinear random matrices and their applications. This includes discussions on random feature models, kernel random matrices, and random geometric graphs.
  • 10:00 - 10:30 am EDT
    Coffee Break
    11th Floor Collaborative Space
  • 10:30 - 11:10 am EDT
    Algorithmic Thresholds for Ising Perceptron
    11th Floor Lecture Hall
    • Speaker
    • Shuangping Li, Stanford University
    • Session Chair
    • Oanh Nguyen, Brown University
    Abstract
    The Ising perceptron is a toy model of a single-layer neural network, which can be viewed as a random constraint satisfaction problem with dense connectivity. The problem requires finding a length-N vector of +1 or -1 entries, such that its product with a random Gaussian M by N matrix is entrywise lower bounded. We work in the regime where both M and N grow large with a fixed ratio M/N = alpha. In this talk, we present some recent findings on the algorithmic aspects of the Ising perceptron. Inspired by the rich literature on discrepancy minimization, we propose an efficient algorithm that improves on previous algorithms (works for higher values of alpha). We also show that at high enough alpha, the Ising perceptron exhibits the multi Overlap Gap Property (m-OGP), which matches our algorithmic threshold up to poly-logarithmic factors. This is based on joint work with Tselil Schramm and Kangjie Zhou.
  • 11:20 am - 12:00 pm EDT
    Neural Networks and Products of Random Matrices in the Infinite Depth-and-Width Limit
    11th Floor Lecture Hall
    • Speaker
    • Mihai Nica, University of Guelph
    • Session Chair
    • Oanh Nguyen, Brown University
    Abstract
    Neural networks have become so large that their behaviour can be well approximated by “infinite neural networks”, which are obtained by considering the limit as the number of neurons goes to infinity. However, there are many possible infinite limits one can take! In this talk, I will introduce the "infinite depth-and-width limit", where both the depth and the width are scaled to infinity simultaneously. Mathematically, this limit corresponds to a limit of products of random matrices where *both* the size of each individual matrix and the number of matrices in the product are scaled proportionally.
  • 12:10 - 2:00 pm EDT
    Lunch/Free Time
  • 2:00 - 2:40 pm EDT
    Spectrum of the Neural Tangent Kernel in a quadratic scaling
    11th Floor Lecture Hall
    • Speaker
    • Lucas Benigni, Université de Montréal
    • Session Chair
    • Hoi Nguyen, Ohio State University
    Abstract
    Despite their surplus of parameters, modern deep learning models often generalize well, a phenomenon exemplified by the "double descent curve." While this behavior is theoretically grasped for problems such as ridge regression under linear scaling of dimensions, intriguing phenomenon emerge under quadratic scaling, where sample size equals parameter count. In this presentation, we study the eigenvalues of the Neural Tangent Kernel, a matrix model pertinent to wide neural networks trained via gradient descent, within this quadratic regime.
  • 2:50 - 3:30 pm EDT
    The High-Dimensional Asymptotics of Principal Component Regression
    11th Floor Lecture Hall
    • Speaker
    • Elad Romanov, Stanford University
    • Session Chair
    • Hoi Nguyen, Ohio State University
    Abstract
    Principal component regression (PCR) is a classical two-step approach to linear regression, where one first reduces the data dimension by projecting onto its large principal components, and then performs ordinary least squares regression. We study PCR in an asymptotic high-dimensional regression setting, where the number of data points is proportional to the dimension. We derive exact limiting formulas for the estimation and prediction risks, which depend in a complicated manner on the eigenvalues of the population covariance, the alignment between the population PCs and the true signal, and the number of selected PCs. A key challenge is the fact that in this regime, the sample covariance is an inconsistent estimate of its population counterpart, hence sample PCs may fail to fully capture potential latent low-dimensional structure in the data. We demonstrate this point through several case studies, including that of a spiked covariance model. Joint work with Alden Green.
  • 3:40 - 4:10 pm EDT
    Coffee Break
    11th Floor Collaborative Space

All event times are listed in ICERM local time in Providence, RI (Eastern Daylight Time / UTC-4).

All event times are listed in .

Request Reimbursement

This section is for general purposes only and does not indicate that all attendees receive funding. Please refer to your personalized invitation to review your offer.

ORCID iD
As this program is funded by the National Science Foundation (NSF), ICERM is required to collect your ORCID iD if you are receiving funding to attend this program. Be sure to add your ORCID iD to your Cube profile as soon as possible to avoid delaying your reimbursement.
Acceptable Costs
  • 1 roundtrip between your home institute and ICERM
  • Flights in economy class to either Providence airport (PVD) or Boston airport (BOS)
  • Ground Transportation to and from airports and ICERM.
Unacceptable Costs
  • Flights on U.K. airlines
  • Seats in economy plus, business class, or first class
  • Change ticket fees of any kind
  • Multi-use bus passes
  • Meals or incidentals
Advance Approval Required
  • Personal car travel to ICERM from outside New England
  • Multiple-destination plane ticket; does not include layovers to reach ICERM
  • Arriving or departing from ICERM more than a day before or day after the program
  • Multiple trips to ICERM
  • Rental car to/from ICERM
  • Arriving or departing from airport other than PVD/BOS or home institution's local airport
  • 2 one-way plane tickets to create a roundtrip (often purchased from Expedia, Orbitz, etc.)
Travel Maximum Contributions
  • New England: $350
  • Other contiguous US: $850
  • Asia & Oceania: $2,000
  • All other locations: $1,500
  • Note these rates were updated in Spring 2023 and superseded any prior invitation rates. Any invitations without travel support will still not receive travel support.
Reimbursement Requests

Request Reimbursement with Cube

Refer to the back of your ID badge for more information. Checklists are available at the front desk and in the Reimbursement section of Cube.

Reimbursement Tips
  • Scanned original receipts are required for all expenses
  • Airfare receipt must show full itinerary and payment
  • ICERM does not offer per diem or meal reimbursement
  • Allowable mileage is reimbursed at prevailing IRS Business Rate and trip documented via pdf of Google Maps result
  • Keep all documentation until you receive your reimbursement!
Reimbursement Timing

6 - 8 weeks after all documentation is sent to ICERM. All reimbursement requests are reviewed by numerous central offices at Brown who may request additional documentation.

Reimbursement Deadline

Submissions must be received within 30 days of ICERM departure to avoid applicable taxes. Submissions after thirty days will incur applicable taxes. No submissions are accepted more than six months after the program end.