Organizing Committee
 Hoi Nguyen
Ohio State University  Oanh Nguyen
Brown University  Konstantin Tikhomirov
Carnegie Mellon University  Roman Vershynin
University of California, Irvine  Van Vu
Yale University
Abstract
This workshop focuses on the role of random matrices in data science, machine learning, and theoretical computer science.
Playing a significant role in modern data science, random matrices provide an elegant way to represent both (a) the data and (b) the way we process it. To give an example of (a), the classical model of highdimensional data is a set of points drawn from a certain distribution in a highdimensional space which can be represented as a random matrix. An even more natural example is that most data in practice is noisy, so it can be represented as a deterministic matrix plus a random noise, which is a random matrix with a nonzero mean. An example of (b) is data compression, which can be realized by applying a random matrix of smaller sizes to the data in (a), thereby reducing its dimensions. Another example is data completion, where we need to reconstruct data (in the form of a matrix) from a random submatrix, given that the original matrix satisfies certain structural constraints (such as being low rank).
To deal with these problems, we need to develop techniques beyond the scope of standard random matrix theory. For example, while the sensitivity of matrix spectrum to small random perturbations is extensively studied in the mean field regime, the problem in the setting of structured random matrices, which allows for more adequate modeling of realworld data, is wide open. As another example, the core mechanism of an artificial neural network is a composition of linear and nonlinear transformations, leading to nonlinear generalizations of random matrices. Our understanding of mathematical principles behind examples like this is still in its infancy.
The goal is to discuss a number of hot topics in these areas, focusing on recent theoretical progress and the potential pool of applications, in a manner that is accessible to a wide audience, including researchers in data science, electrical engineering, statistics, and numerical analysis.
Confirmed Speakers & Participants
Talks will be presented virtually or inperson as indicated in the schedule below.
 Speaker
 Poster Presenter
 Attendee
 Virtual Attendee

Elie Alhajjar
RAND

Lucas Benigni
Université de Montréal

Tatiana Brailovskaya
Princeton University

Jonathan Chavez Casillas
University of Rhode Island

Qianyong Chen
UMass Amherst

Shabarish Chenakkod
University of Michigan

Hongmei Chi
Florida A&M University

Julia E Grigsby
Boston College

Alperen Ergur
UTSA

Zhou Fan
Yale University

Yiyun He
University of California, Irvine

Vishesh Jain
University of Chicago, Illinois

Sungwoo Jeong
Cornell University

Han Le
University of Michigan

Yuqi Lei
Brown University

Sivan Leviyang
Georgetown University

Weiyu Li
Harvard University

Shuangping Li
Stanford University

Wenjian Liu
City University of New York

Ji Meng Loh
New Jersey Institute of Technology

Jackie Lok
Princeton University

Hengrui Luo
Lawrence Berkeley National Laboratory/ Rice University

renyuan ma
Yale University

Benjamin McKenna
Harvard University

Andrea Montanari
Stanford University

Raj Nadakuditi
University of Michigan

Hoi Nguyen
Ohio State University

Oanh Nguyen
Brown University

Mihai Nica
University of Guelph

Petar NizicNikolac
ETH Zurich

Sean O'Rourke
University of Colorado Boulder

Aaradhya Pandey
Princeton University

Jing Qin
University of Kentucky

Liza Rebrova
Princeton

Daniel Reichman
Worcester Polytechnic Institute, Worcester, MA

Elad Romanov
Stanford University

Julian Sahasrabudhe
University of Cambridge

Mehtaab Sawhney
MIT

Calum Shearer
University of Colorado Boulder

Paul Simanjuntak
Texas A&M University

Youngtak Sohn
MIT

Reed Spitzer
Brown University

JINGYE TAN
Cornell University

Konstantin Tikhomirov
Carnegie Mellon University

Linh Tran
Yale University

Phuc Tran
Yale University

Alexander Van Werde
Eindhoven University of Technology

Roman Vershynin
University of California, Irvine

Van Vu
Yale University

Xiong Wang
Johns Hopkins University

Haixiao Wang
University of California, San Diego

Ke Wang
Hong Kong University of Science and Technology

Jingheng Wang
Ohio State University

Zhichao Wang
University of California San Diego

Lu WEI
Texas Tech University

Garrett Wen
Yale University

Haishen Yao
The City University of New York, Queensborough Community College

Ziao Zhang
Brown University

Yifan Zhang
University of Texas at Austin

Yizhe Zhu
University of California Irvine
Workshop Schedule
Monday, May 20, 2024

8:50  9:00 am EDTWelcome11th Floor Lecture Hall
 Session Chair
 Brendan Hassett, ICERM/Brown University

9:00  9:15 am EDTOpening Remarks11th Floor Lecture Hall
 Oanh Nguyen, Brown University
 Hoi Nguyen, Ohio State University
 Konstantin Tikhomirov, Carnegie Mellon University
 Roman Vershynin, University of California, Irvine
 Van Vu, Yale University

9:15  10:15 am EDTMatrix invertibility and applications.Survey  11th Floor Lecture Hall
 Speaker
 Konstantin Tikhomirov, Carnegie Mellon University
 Session Chair
 Van Vu, Yale University

10:15  10:45 am EDTCoffee Break11th Floor Collaborative Space

10:45  11:25 am EDTThe limiting spectral law of sparse random matrices11th Floor Lecture Hall
 Speaker
 Julian Sahasrabudhe, University of Cambridge
 Session Chair
 Van Vu, Yale University
Abstract
Let A be and n by n matrix with iid entries which take 1 with probability p and 0 otherwise. In this talk we will be interested in the distribution of the complex eigenvalues of A as n tends to infinity. Due to an impressive sequence of results, today we know that whenever pn tends to infinity and A is appropriately normalized, the limiting spectral distribution tends to the ""circular law"" and n tends to infinity. However the situation for p = d/n, d fixed, has remained somewhat more mysterious and, in fact, it was not even known if a deterministic limiting law even exists. In this talk I will discuss the recent resolution of this problem, showing that such random matrices have a deterministic limiting spectral distribution. This talk is based on joint work with Ashwin Sah and Mehtaab Sawhney.

11:35 am  12:15 pm EDTImproved very sparse matrix completion via an "asymmetric randomized SVD"11th Floor Lecture Hall
 Speaker
 Raj Rao, University of Michigan
 Session Chair
 Van Vu, Yale University
Abstract
We consider the matrix completion problem in the very sparse regime where, on average, a constant number of entries of the matrix are observed per row (or column). In this very sparse regime, we cannot expect to have perfect recovery and the celebrated nuclear norm based matrix completion fails because the singular value decomposition (SVD) of the underlying very sparse matrix completely breaks down. We demonstrate that it is indeed possible to reliably recover the matrix. The key idea is the use of a ""randomized asymmetric SVD"" to find informative singular vectors in this regime in a way that the SVD cannot. We provide sharp theoretical analysis of the phenomenon, the lower limits of statistical recovery and demonstrate the efficacy of the new method using simulations.

12:25  12:30 pm EDTGroup Photo (Immediately After Talk)11th Floor Lecture Hall

12:30  2:00 pm EDTLunch/Free Time

2:00  2:40 pm EDTUniversality of Spectral Independence with Applications to Fast Mixing in Spin Glasses11th Floor Lecture Hall
 Speaker
 Vishesh Jain, University of Chicago, Illinois
 Session Chair
 Roman Vershynin, University of California, Irvine
Abstract
We study the Glauber dynamics for sampling from distributions on the ndimensional discrete Boolean hypercube. Techniques based on the powerful recent notion of spectral independence have successfully yielded optimal O(n) relaxation times for a host of different distributions. We show that spectral independence is universal: a relaxation time of O(n) implies spectral independence. As an application, we obtain approximate tensorization of entropy and the optimal mixing time for the classically studied mixed pspin model at sufficiently high temperature. Joint work with Nima Anari, Frederic Koehler, Huy Tuan Pham, and ThuyDuong Vuong.

2:50  3:30 pm EDTExtreme eigenvalues of random Laplacian matrices11th Floor Lecture Hall
 Virtual Speaker
 Sean O'Rourke, University of Colorado Boulder
 Session Chair
 Roman Vershynin, University of California, Irvine
Abstract
Many properties of a graph can be deduced from the eigenvalues and eigenvectors of its Laplacian matrix. Because of this, the Laplacian matrix plays a key role in spectral graph theory and is used in algorithms for graph clustering, network control theory, and combinatorial optimization. This talk focuses on the behavior of the extreme eigenvalues of a random Laplacian matrix with Gaussian entries. We show that, after shifting and scaling, the largest eigenvalue converges to a Gumbel distribution as the size of the matrix tends to infinity. The proof uses techniques from random matrix theory and free probability. This talk is based on joint work with Andrew Campbell and Kyle Luh.

3:40  4:10 pm EDTCoffee Break11th Floor Collaborative Space

4:10  4:50 pm EDTAnalysis of singular subspaces under random perturbations11th Floor Lecture Hall
 Speaker
 Ke Wang, Hong Kong University of Science and Technology
 Session Chair
 Roman Vershynin, University of California, Irvine
Abstract
Matrix perturbation theory is a foundational subject across multiple disciplines, including probability, statistics, machine learning, and applied mathematics. Perturbation bounds play a critical role in quantifying the impact of small noise on matrix spectral parameters in various applications such as matrix completion, principal component analysis, and community detection. In this talk, we focus on the additive perturbation model, where a lowrank data matrix is perturbed by Gaussian noise. We provide a comprehensive analysis of the singular subspaces, extending the classical DavisKahanWedin theorem and offering finegrained analysis of the singular vector matrices. Also, we will show the practical implications of our perturbation results, specifically highlighting their application in Gaussian mixture models. The talk is based on joint works with Sean O'Rourke and Van Vu.

5:00  6:30 pm EDTReception11th Floor Collaborative Space
Tuesday, May 21, 2024

9:00  10:00 am EDTA new look at some basic perturbation boundsSurvey  11th Floor Lecture Hall
 Speaker
 Van Vu, Yale University
 Session Chair
 Konstantin Tikhomirov, Carnegie Mellon University
Abstract
In applications, we often need perturbation bounds for the leading spectral parameters of a matrix, such as the leading eigenvectors, eigenvalues, eigensubspaces or low rank approximation. In this talk, we are going to discuss a new kind of perturbation bounds, which would improve significantly classical bounds such as DavisKahan or Weyl, given that the spectrum of the truth matrix decays in a reasonable manner. (based on joined works with S. O'rourke, K. Wang, P. Tran)

10:00  10:30 am EDTCoffee Break11th Floor Collaborative Space

10:30  11:10 am EDTImproved Principal Component Analysis for Sample Covariance Matrix11th Floor Lecture Hall
 Speaker
 Phuc Tran, Yale University
 Session Chair
 Konstantin Tikhomirov, Carnegie Mellon University
Abstract
Using the sample covariance to estimate the principal components of a hidden covariance matrix is a central problem in statistics and data science. We view this as a perturbation problem where the noise is a zero mean random matrix which is the difference between the sample covariance matrix and the truth. In certain special settings, it can also be viewed as a deformed random matrix model. In this talk, using recent progress on matrices with random perturbation, we derive new estimates, which extend and improve several well known results in the field. (join work with V. Vu)

11:20 am  12:00 pm EDTOn a Conjecture of Spielman–Teng11th Floor Lecture Hall
 Speaker
 Mehtaab Sawhney, MIT
 Session Chair
 Konstantin Tikhomirov, Carnegie Mellon University
Abstract
Let J be a mean 0, variance 1 random variable with subGaussian tails. Let M be an n by n matrix with iid entries J and G be an n by n matrix with iid standard normal entries. We prove that for all nonnegative x that P[s_n(M)<x/n^{1/2}] = (1+o(1))P[s_n(G)<x/n^{1/2}] + O(e^{cn}). This resolves up to a (1+o(1)) a conjecture of Spielman and Teng regarding the distribution of the least singular value. This furthermore unifies works of Tao and Vu (on the distribution of the least singular value) and Rudelson and Vershynin (which proved the conjecture up to an absolute constant). Joint with Ashwin Sah and Julian Sahasrabudhe

12:15  2:00 pm EDTLunch/Free Time

2:00  2:40 pm EDTMatrix superconcentration inequalities11th Floor Lecture Hall
 Speaker
 Tatiana Brailovskaya, Princeton University
 Session Chair
 Hoi Nguyen, Ohio State University
Abstract
One way to understand the concentration of the norm of a random matrix X with Gaussian entries is to apply a standard concentration inequality, such as the one for Lipschitz functions of i.i.d. standard Gaussian variables, which yields subgaussian tail bounds on the norm of X. However, as was shown by Tracy and Widom in 1990s, when the entries of X are i.i.d. the norm of X exhibits even sharper concentration. The phenomenon of a function of many i.i.d. variables having strictly smaller tails than those predicted by classical concentration inequalities is sometimes referred to as «superconcentration», a term originally dubbed by Chatterjee. I will discuss novel results that can be interpreted as superconcentration inequalities for the norm of X, where X is a Gaussian random matrix with independent entries and an arbitrary variance profile. We can also view our results as a nonhomogeneous extension of TracyWidomtype upper tail estimates for the norm of X.

2:50  3:30 pm EDTKroneckerproduct random matrices and a matrix least squares problem11th Floor Lecture Hall
 Speaker
 Zhou Fan, Yale University
 Session Chair
 Hoi Nguyen, Ohio State University
Abstract
We study the asymptotic spectral distribution and resolvent of a n^2byn^2 Kroneckerproduct random matrix A x I + I x B + Theta x Xi, when A, B are independent Wigner matrices and Theta, Xi are deterministic and diagonal. For fixed spectral arguments z in C^+, we establish a quantitative approximation for the Stieltjes transform by that of an approximating free operator, and a diagonal deterministic equivalent approximation for the resolvent. We further perform an entrywise analysis of the resolvent, showing that offdiagonal entries fall on two differing scales n^{1/2} and n^{1} depending on their locations with respect to the Kronecker structure. Our study is motivated by consideration of a matrixvalued leastsquares optimization problem min_{X in R^{n x n}} \XA+BX\_F^2 + sum_{ij} Theta_i Xi_j X_{ij}^2 subject to a linear constraint. For random instances of this problem defined by Wigner inputs A and B, our analyses imply an asymptotic characterization of the minimizer X and its associated minimum objective value as in the limit as n > infinity. This is joint work with Renyuan (Jack) Ma.

3:40  4:10 pm EDTCoffee Break11th Floor Collaborative Space

4:10  5:00 pm EDTSolving overparametrized systems of random equationsSurvey  11th Floor Lecture Hall
 Speaker
 Andrea Montanari, Stanford University
 Session Chair
 Hoi Nguyen, Ohio State University
Abstract
I consider the problem of efficiently solving a system of n nonlinear equations in R^d, where each equation is random homogeneous polynomial of arbitrary degrees. In the complex case and for n=d−1, Beltran and Pardo proved the existence of an efficient randomized algorithm and Lairez recently showed it can be derandomized to produce a deterministic efficient algorithm. Here we consider the real setting, to which previously developed methods do not apply. We describe an algorithm that efficiently finds solutions (with high probability) for n=d−O(sqrt{dlog d}). [Based on joint work with Eliran Subag]
Wednesday, May 22, 2024

9:00  10:00 am EDTRecent developments in nonlinear random matricesSurvey  11th Floor Lecture Hall
 Speaker
 Yizhe Zhu, University of California Irvine
 Session Chair
 Oanh Nguyen, Brown University
Abstract
Nonlinear random matrices have emerged as an important area of interest in statistics and machine learning. In recent years, the study of nonlinear random matrix theory has proven valuable in providing rigorous explanations for empirically observed phenomena in deep learning. From a random matrix perspective, this creates new challenges in handling nonlinearity and dependence. In this talk, I will survey recent developments in the spectral analysis of nonlinear random matrices and their applications. This includes discussions on random feature models, kernel random matrices, and random geometric graphs.

10:00  10:30 am EDTCoffee Break11th Floor Collaborative Space

10:30  11:10 am EDTAlgorithmic Thresholds for Ising Perceptron11th Floor Lecture Hall
 Speaker
 Shuangping Li, Stanford University
 Session Chair
 Oanh Nguyen, Brown University
Abstract
The Ising perceptron is a toy model of a singlelayer neural network, which can be viewed as a random constraint satisfaction problem with dense connectivity. The problem requires finding a lengthN vector of +1 or 1 entries, such that its product with a random Gaussian M by N matrix is entrywise lower bounded. We work in the regime where both M and N grow large with a fixed ratio M/N = alpha. In this talk, we present some recent findings on the algorithmic aspects of the Ising perceptron. Inspired by the rich literature on discrepancy minimization, we propose an efficient algorithm that improves on previous algorithms (works for higher values of alpha). We also show that at high enough alpha, the Ising perceptron exhibits the multi Overlap Gap Property (mOGP), which matches our algorithmic threshold up to polylogarithmic factors. This is based on joint work with Tselil Schramm and Kangjie Zhou.

11:20 am  12:00 pm EDTNeural Networks and Products of Random Matrices in the Infinite DepthandWidth Limit11th Floor Lecture Hall
 Speaker
 Mihai Nica, University of Guelph
 Session Chair
 Oanh Nguyen, Brown University
Abstract
Neural networks have become so large that their behaviour can be well approximated by “infinite neural networks”, which are obtained by considering the limit as the number of neurons goes to infinity. However, there are many possible infinite limits one can take! In this talk, I will introduce the "infinite depthandwidth limit", where both the depth and the width are scaled to infinity simultaneously. Mathematically, this limit corresponds to a limit of products of random matrices where *both* the size of each individual matrix and the number of matrices in the product are scaled proportionally.

12:10  2:00 pm EDTLunch/Free Time

2:00  2:40 pm EDTSpectrum of the Neural Tangent Kernel in a quadratic scaling11th Floor Lecture Hall
 Speaker
 Lucas Benigni, Université de Montréal
 Session Chair
 Hoi Nguyen, Ohio State University
Abstract
Despite their surplus of parameters, modern deep learning models often generalize well, a phenomenon exemplified by the "double descent curve." While this behavior is theoretically grasped for problems such as ridge regression under linear scaling of dimensions, intriguing phenomenon emerge under quadratic scaling, where sample size equals parameter count. In this presentation, we study the eigenvalues of the Neural Tangent Kernel, a matrix model pertinent to wide neural networks trained via gradient descent, within this quadratic regime.

2:50  3:30 pm EDTThe HighDimensional Asymptotics of Principal Component Regression11th Floor Lecture Hall
 Speaker
 Elad Romanov, Stanford University
 Session Chair
 Hoi Nguyen, Ohio State University
Abstract
Principal component regression (PCR) is a classical twostep approach to linear regression, where one first reduces the data dimension by projecting onto its large principal components, and then performs ordinary least squares regression. We study PCR in an asymptotic highdimensional regression setting, where the number of data points is proportional to the dimension. We derive exact limiting formulas for the estimation and prediction risks, which depend in a complicated manner on the eigenvalues of the population covariance, the alignment between the population PCs and the true signal, and the number of selected PCs. A key challenge is the fact that in this regime, the sample covariance is an inconsistent estimate of its population counterpart, hence sample PCs may fail to fully capture potential latent lowdimensional structure in the data. We demonstrate this point through several case studies, including that of a spiked covariance model. Joint work with Alden Green.

3:40  4:10 pm EDTCoffee Break11th Floor Collaborative Space
All event times are listed in ICERM local time in Providence, RI (Eastern Daylight Time / UTC4).
All event times are listed in .
ICERM local time in Providence, RI is Eastern Daylight Time (UTC4). Would you like to switch back to ICERM time or choose a different custom timezone?
Request Reimbursement
This section is for general purposes only and does not indicate that all attendees receive funding. Please refer to your personalized invitation to review your offer.
 ORCID iD
 As this program is funded by the National Science Foundation (NSF), ICERM is required to collect your ORCID iD if you are receiving funding to attend this program. Be sure to add your ORCID iD to your Cube profile as soon as possible to avoid delaying your reimbursement.
 Acceptable Costs

 1 roundtrip between your home institute and ICERM
 Flights in economy class to either Providence airport (PVD) or Boston airport (BOS)
 Ground Transportation to and from airports and ICERM.
 Unacceptable Costs

 Flights on U.K. airlines
 Seats in economy plus, business class, or first class
 Change ticket fees of any kind
 Multiuse bus passes
 Meals or incidentals
 Advance Approval Required

 Personal car travel to ICERM from outside New England
 Multipledestination plane ticket; does not include layovers to reach ICERM
 Arriving or departing from ICERM more than a day before or day after the program
 Multiple trips to ICERM
 Rental car to/from ICERM
 Arriving or departing from airport other than PVD/BOS or home institution's local airport
 2 oneway plane tickets to create a roundtrip (often purchased from Expedia, Orbitz, etc.)
 Travel Maximum Contributions

 New England: $350
 Other contiguous US: $850
 Asia & Oceania: $2,000
 All other locations: $1,500
 Note these rates were updated in Spring 2023 and superseded any prior invitation rates. Any invitations without travel support will still not receive travel support.
 Reimbursement Requests

Request Reimbursement with Cube
Refer to the back of your ID badge for more information. Checklists are available at the front desk and in the Reimbursement section of Cube.
 Reimbursement Tips

 Scanned original receipts are required for all expenses
 Airfare receipt must show full itinerary and payment
 ICERM does not offer per diem or meal reimbursement
 Allowable mileage is reimbursed at prevailing IRS Business Rate and trip documented via pdf of Google Maps result
 Keep all documentation until you receive your reimbursement!
 Reimbursement Timing

6  8 weeks after all documentation is sent to ICERM. All reimbursement requests are reviewed by numerous central offices at Brown who may request additional documentation.
 Reimbursement Deadline

Submissions must be received within 30 days of ICERM departure to avoid applicable taxes. Submissions after thirty days will incur applicable taxes. No submissions are accepted more than six months after the program end.