Organizing Committee
- Thomas Cass, Imperial College London
- Terry Lyons, University of Oxford
- Hao Ni, University College London and Alan Turing Institute
- Harald Oberhauser, University of Oxford
- Mihaela van der Schaar, University of Cambridge
Abstract
Rough path theory emerged as a branch of stochastic analysis to give an improved approach to dealing with the interactions of complex random systems. In that context, it continues to resolve important questions, but its broader theoretical footprint has been substantial. Most notable is its contribution to Hairer’s Fields-Medal-winning work on regularity structures. At the core of rough path theory is the so-called signature transform which, while being simple to define, has rich mathematical properties bringing in aspects of analysis, geometry, and algebra. Hambly and Lyons (Annals of Math, 2010) built upon earlier work of Chen, showing how the signature represents the path uniquely up to generalized reparameterizations. This turns out to have practical implications allowing one to summarise the space of functions on unparameterized paths and data streams in a very economical way.
Over the past five years, a significant strand of applied work has been undertaken to exploit the mathematical richness of this object in diverse data science challenges, from healthcare to computer vision to gesture recognition. The log signature is becoming a powerful way to summarise the fine structure of a data stream in a neural net. The emergence of neural differential equations as an important tool in data science further deepens the connections with rough paths.
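The first few terms of the signature are easy to compute for piecewise-linear data, and a short sketch can make the object concrete. The following minimal NumPy function (the name `signature_level2` and the level-2 truncation are illustrative choices, not workshop code) builds the truncated signature segment by segment via Chen's identity:

```python
import numpy as np

def signature_level2(path):
    """Truncated signature (levels 1 and 2) of a piecewise-linear path.

    path: (n, d) array of points; segments are the straight lines between them.
    Returns (s1, s2): the level-1 vector and the level-2 matrix of
    iterated integrals.
    """
    d = path.shape[1]
    s1 = np.zeros(d)        # level 1: the total increment of the path
    s2 = np.zeros((d, d))   # level 2: iterated integrals / signed areas
    for dx in np.diff(path, axis=0):
        # Chen's identity for concatenating the path so far with one segment:
        # S(x * y)_2 = S(x)_2 + S(x)_1 (outer) S(y)_1 + S(y)_2,
        # and a straight segment has S(y)_2 = dx (outer) dx / 2.
        s2 += np.outer(s1, dx) + 0.5 * np.outer(dx, dx)
        s1 += dx
    return s1, s2
```

Level one is just the total increment, and the antisymmetric part of level two is the Lévy area; inserting an extra collinear point leaves the output unchanged, a glimpse of the invariance up to reparameterization mentioned above. Production-grade implementations are available in packages such as `esig`, `iisignature`, and `signatory`.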
This four-day workshop will bring together key expertise across disciplines to advance understanding of some of the most pressing and exciting challenges. The week will start with structured, tutorial-style lectures on the foundational aspects of signatures, their use in data science, and topics of broad appeal. These will include:
- Mathematical foundations
- Neural rough differential equations
- Signature-based kernel methods
- Expected signatures
- Applications of signatures to action recognition and healthcare
The rest of the workshop will consist of technical talks and contributed talks from participants. There will be both organized discussions and opportunities for informal interaction.
To be considered for the practical portions of the workshop, please apply by June 20, 2021.

Confirmed Speakers & Participants
Talks will be presented virtually or in-person as indicated in the schedule below.
- Elie Alhajjar, US Military Academy
- Sergio Almada Monter, Morgan Stanley
- Nicholas B, GCHQ
- Fabrice Baudoin, University of Connecticut
- Daniel Bedau, Western Digital
- Ryne Beeson, CU Aerospace LLC
- Jose Blanchet, Columbia University and Stanford University
- Linus Bleistein, Inria Paris
- Robert Boissy, SeqStream PBC
- Michael Brenner, Harvard University
- Luis Carvalho, Boston University
- Thomas Cass, Imperial College London
- Ricky Tian Qi Chen, University of Toronto
- Danilo de Barros, Nomura
- Joscha Diehl, University of Greifswald
- Jing Dong, Columbia Business School
- Bruce Driver, University of California, San Diego
- Weinan E, Princeton University
- Kurusch Ebrahimi-Fard, Norwegian University of Science and Technology (NTNU)
- Adeline Fermanian, LPSM
- Emilio Ferrucci, Imperial College London
- James Foster, University of Oxford
- Peter Foster, Alan Turing Institute
- Mazyar Ghani Varzaneh, TU Berlin
- Jonathan H, Unaffiliated
- Wesley Hamilton, University of Utah
- Boumediene Hamzi, Imperial College London
- Darryl Holm, Imperial College London
- Blanka Horvath, King's College London
- Songyan Hou, ETH
- Kevin Hu, Brown University
- Xinru Hua
- Tomoyuki Ichiba, University of California, Santa Barbara
- Drago Indjic, Oxquant
- Arman Khaledian, Bank of America
- Patrick Kidger, University of Oxford
- Tom L, Alan Turing Institute
- Jeroen Lamb, Imperial College London
- Adrien Laurent, University of Geneva
- Darrick Lee, University of Pennsylvania
- Maud Lemercier, University of Warwick
- Jose Henry Leon-Janampa, JHL Quantitative Analysis Ltd.
- Terry Lyons, University of Oxford
- Firdous Mala, GDC Sopore, Bla
- Thomas Mellan, Imperial College London
- Remy Messadene, Imperial College London
- Thibault Meyers, ENSTA Paris
- Ming Min, University of California, Santa Barbara
- Swapnil Mishra, Imperial College London
- Eduardo Mojica-Nava, Universidad Nacional de Colombia
- Sam Morley, University of Oxford
- James Morrill, University of Oxford
- Hao Ni, University College London and Alan Turing Institute
- Harald Oberhauser, University of Oxford
- Yanni Papandreou, Imperial College London
- Anastasia Papavasileiou, Warwick University
- Zhijun (George) Qiao, University of Texas Rio Grande Valley
- Jeremy Reizenstein, Facebook
- Marwan Sakran, Universität Greifswald
- Cristopher Salvi, University of Oxford
- Guillermo Sapiro, Duke University
- Kevin Schlegel, The Alan Turing Institute
- Leonard Schmitz, University of Greifswald
- Anna Seigal, University of Oxford
- Yeonjong Shin, Brown University
- Farrokh Shirjian, Tarbiat Modares University
- Nikolas Tapia, Weierstrass Institute
- Csaba Toth, University of Oxford
- Trang Tran, UCSC
- William Turner, Imperial College London
- Mihaela van der Schaar, University of Cambridge
- Roberto Velho, Federal University of Rio Grande do Sul
- Bo Wang, Massachusetts General Hospital
- Niklas Weber, LMU Munich
- Stephan Wojtowytsch, Princeton University
- Yue Wu, University of Oxford
- George Wynne, Imperial College London
- Xinyi Xiang, Google Developers Group
- Wei Xiong, University of Oxford
- Masanao Yajima, Boston University
- Weixin Yang, The Alan Turing Institute
- Haiyan Yu, Penn State University
- Vasilis Zafiris, University of Houston-Downtown
- Xin (Cindy) Zhang, South China University of Technology
Workshop Schedule
Tuesday, July 6, 2021
-
8:50 - 9:00 am EDT: Welcome (Virtual)
- Kavita Ramanan, Brown University
-
9:00 - 10:00 am EDT: Tutorial: Mathematical foundations of the signature (Virtual)
- Terry Lyons, University of Oxford
- Harald Oberhauser, University of Oxford
-
10:05 - 10:30 am EDT: Tutorial: Signatures in the Wild (Virtual)
- Peter Foster, Alan Turing Institute
-
10:35 - 11:00 am EDT: ML Without Unnecessary Harm: Blind Pareto Fairness and Subgroup Robustness (Virtual)
- Guillermo Sapiro, Duke University
Abstract
With the wide adoption of machine learning algorithms across various application domains, there is a growing interest in the fairness properties of such algorithms. The vast majority of the activity in the field of group fairness addresses disparities between predefined groups based on protected features such as gender, age, and race, which need to be available at train, and often also at test, time. These approaches are static and retrospective, since algorithms designed to protect groups identified a priori cannot anticipate and protect the needs of different at-risk groups in the future. In this work we analyze the space of solutions for worst-case fairness beyond demographics, and propose Blind Pareto Fairness (BPF), a method that leverages no-regret dynamics to recover a fair minimax classifier that reduces worst-case risk of any potential subgroup of sufficient size, and guarantees that the remaining population receives the best possible level of service. BPF addresses fairness beyond demographics, that is, it does not rely on predefined notions of at-risk groups, neither at train nor at test time. Our experimental results show that the proposed framework improves worst-case risk in multiple standard datasets, while simultaneously providing better levels of service for the remaining population, in comparison to competing methods.
-
11:05 - 11:30 am EDT: Coffee Break (Virtual)
-
11:30 am - 12:00 pm EDT: Tutorial: Signatures in the Wild (Virtual)
- James Morrill, University of Oxford
-
12:00 - 12:30 pm EDT: Coffee Break (Virtual)
-
12:30 - 1:00 pm EDT: Infancy Longitudinal Structural MRI Data Analysis with Path Signature Features for Cognitive Score Prediction (Virtual)
- Xin (Cindy) Zhang, South China University of Technology
Abstract
The path signature has unique advantages for extracting high-order differential features of sequential data. Our team has been studying path signature theory and actively applying it to various applications, including infant cognitive score prediction, human motion recognition, handwritten character recognition, handwritten text line recognition, and writer identification. In this talk, I will share our most recent work on infant cognitive score prediction using learnable path signature features and simple deep learning models. The cognitive score can reveal an individual's intelligence, motor, and language abilities. Recent research has discovered that cognitive ability is closely related to an individual's cortical structure and its development. We have proposed two frameworks to predict the cognitive score with different path signature features. In the first framework, we construct the temporal path signature along age growth and extract signature features from longitudinal structural MRI data. By incorporating the cortical temporal path signature into a multi-stream deep learning model, the individual cognitive score can be predicted even in the presence of missing data. In the second framework, we propose a learnable path signature algorithm to compute developmental features, obtain the brain region-wise development graph for the first two years of infancy, and then employ a graph convolutional network for score prediction. Both frameworks have been tested on two in-house cognitive datasets and achieve state-of-the-art results.
-
1:00 - 2:30 pm EDT: Lunch/Free Time (Virtual)
-
2:30 - 3:30 pm EDT: Practical Session 1: Computing some examples (Virtual)
- Peter Foster, Alan Turing Institute
- Sam Morley, University of Oxford
Wednesday, July 7, 2021
-
9:00 - 9:30 am EDT: Tutorial: Log-signatures and Neural Rough Differential Equations (Virtual)
- James Foster, University of Oxford
-
9:35 - 10:05 am EDT
-
10:10 - 10:40 am EDT
-
10:40 - 11:10 am EDT: Coffee Break (Virtual)
-
11:10 - 11:35 am EDT: Distribution Regression for Sequential Data (Virtual)
- Maud Lemercier, University of Warwick
Abstract
Distribution regression on sequential data is the task of learning a function from a group of time series to a single scalar target. I will present a generic framework, based on the expected signature, which makes it possible to compactly summarise a cloud of time series and make decisions based on it. I will then demonstrate empirically that this framework achieves state-of-the-art performance on both synthetic and real-world examples from thermodynamics, mathematical finance, and agricultural science.
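The expected-signature idea described above admits a very small illustration: compute a truncated signature per time series, then average over the group, giving a single fixed-size vector summarising the whole cloud, on which any standard regressor can act. The sketch below (the function names and the level-2 truncation are illustrative choices, not the speaker's implementation) shows the idea:

```python
import numpy as np

def sig_level2(path):
    """Flattened level-2 truncated signature of a piecewise-linear path."""
    d = path.shape[1]
    s1, s2 = np.zeros(d), np.zeros((d, d))
    for dx in np.diff(path, axis=0):
        # Chen's identity for appending one straight segment.
        s2 += np.outer(s1, dx) + 0.5 * np.outer(dx, dx)
        s1 += dx
    return np.concatenate([s1, s2.ravel()])  # length d + d*d

def expected_signature(group):
    """Average the signatures of a group of time series: one vector per cloud."""
    return np.mean([sig_level2(p) for p in group], axis=0)
```

A distribution-regression model would then fit an ordinary regressor (ridge regression, say) from these per-group vectors to the scalar targets, one training example per cloud of time series.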
-
11:40 am - 3:30 pm EDT: Lunch/Free Time (Virtual)
-
3:30 - 5:00 pm EDT: Practical Session: Computing some examples (Virtual)
- Peter Foster, Alan Turing Institute
- Sam Morley, University of Oxford
Thursday, July 8, 2021
-
9:00 - 9:30 am EDT: Tutorial: Generative models and Signature-Based Machine Learning Models (Virtual)
- Hao Ni, University College London and Alan Turing Institute
-
9:35 - 10:05 am EDT: Tutorial: Path recovery from signature feature representation (Virtual)
- Weixin Yang, The Alan Turing Institute
Abstract
Recovering the underlying paths from a given signature representation can not only give confidence in using signatures as features in machine learning tasks, but can also be useful for data augmentation or key-frame extraction. The "Signatory" Python package allows the signature transformation to act as a layer in a trainable neural network. Inspired by this, we recover different types of underlying paths from a given signature-based representation. By visualizing the results, we discuss the effects of different hyperparameters on our signature feature set.
-
10:10 - 10:40 am EDT: Tutorial: Gaussian Processes (Virtual)
- Csaba Toth, University of Oxford
-
10:45 - 11:15 am EDT: Coffee Break (Virtual)
-
11:15 - 11:45 am EDT: Framing RNN as a kernel method: A neural ODE approach (Virtual)
- Adeline Fermanian, LPSM
Abstract
Building on the interpretation of a recurrent neural network (RNN) as a continuous-time neural differential equation, we show, under appropriate conditions, that the solution of an RNN can be viewed as a linear function of a specific feature set of the input sequence, known as the signature. This connection allows us to frame an RNN as a kernel method in a suitable reproducing kernel Hilbert space. As a consequence, we obtain theoretical guarantees on generalization and stability for a large class of recurrent networks. Our results are illustrated on simulated datasets.
-
11:45 am - 12:10 pm EDT
-
12:10 - 1:30 pm EDT: Lunch/Free Time (Virtual)
-
1:40 - 2:05 pm EDT: Machine Learning for PDEs (Virtual)
- Michael Brenner, Harvard University
Abstract
I will discuss methods for using machine learning to speed up solutions of nonlinear partial differential equations, focusing on learning discretizations for coarse-graining the numerical solutions of PDEs. I will start with examples in 1D, and then move on to the Navier-Stokes equations.
-
2:05 - 2:30 pm EDT: Data-Driven Market Simulators: Some simple applications of signature kernel methods in mathematical finance (Virtual)
- Blanka Horvath, King's College London
Abstract
Techniques that address sequential data have been a central theme in machine learning research in recent years. More recently, such considerations have entered the field of finance-related ML applications in several areas where we face inherently path-dependent problems: from (deep) pricing and hedging of path-dependent options to generative modeling of synthetic market data, which we refer to as market generation.
We revisit Deep Hedging from the perspective of the role of the data streams used for training, and highlight how this perspective motivates the use of highly accurate generative models for synthetic data generation. From this, we draw conclusions regarding the implications for risk management and model governance of these applications, in contrast to risk management in classical quantitative finance approaches.
Indeed, financial ML applications and their risk management rely heavily on a solid means of measuring and efficiently computing (similarity) metrics between datasets consisting of sample paths of stochastic processes. Stochastic processes are, at their core, random variables with values on path space. However, while the distance between two (finite-dimensional) distributions has long been well understood, the extension of this notion to the level of stochastic processes remained a challenge until recently. We discuss the effect of different choices of such metrics while revisiting some topics central to ML-augmented quantitative finance applications (such as the synthetic generation and the evaluation of similarity of data streams) from a regulatory and model-governance perspective. Finally, we discuss the effect of considering refined metrics that respect and preserve the information structure (the filtration) of the market, and the implications and relevance of such metrics for financial results.
3:35 - 4:15 pm EDT: Coffee Break (Virtual)
-
4:15 - 4:45 pm EDT: Tutorial: Action recognition from landmark data (Virtual)
- Kevin Schlegel, The Alan Turing Institute
-
4:50 - 5:20 pm EDT: Tutorial: Two transforms for signature features (Virtual)
- Yue Wu, University of Oxford
Abstract
In this tutorial, I will introduce two transforms that work with the signature transform: one designed to handle missing data, and the other designed to embed the effect of the absolute position of the data stream into signature features in a unified and efficient way.
Friday, July 9, 2021
-
9:00 - 9:25 am EDT: Signatures, tensor decompositions, and nonlinear algebra (Virtual)
- Anna Seigal, University of Oxford
Abstract
I will begin by discussing tensor rank, and the equations that arise when decomposing tensors into rank one terms. I will then consider decompositions of signature tensors, and their systems of equations. Along the way, I will mention joint work with Max Pfeffer and Bernd Sturmfels, and with Terry Lyons and Cris Salvi.
-
9:30 - 9:55 am EDT: epsilon-Strong Simulation of Stochastic Differential Equations Driven by Levy Processes (Virtual)
- Jing Dong, Columbia Business School
Abstract
Consider a stochastic differential equation dY(t) = f(X(t)) dX(t), where X(t) is a pure-jump Levy process with finite p-variation, 1 <= p < 2, and f is alpha-Lipschitz for some alpha > p. Following the geometric solution construction of Levy-driven stochastic differential equations, we develop a class of epsilon-strong simulation algorithms that allows us to construct a probability space, supporting both Y and a fully simulatable process Y_epsilon, such that Y_epsilon is within epsilon distance of Y under the Skorokhod J1 topology on compact time intervals with probability 1. Moreover, the user can adaptively refine the accuracy levels. This tolerance-enforcement feature allows us to easily combine our algorithm with the multilevel Monte Carlo method for efficient estimation of expectations, with the added benefit of a straightforward analysis of the rate of convergence.
-
10:05 - 10:30 am EDT: Stochastic gradient descent for noise with ML-type scaling (Virtual)
- Stephan Wojtowytsch, Princeton University
Abstract
There are two types of convergence results for stochastic gradient descent: (1) SGD finds minimizers of convex objective functions, and (2) SGD finds critical points of smooth objective functions. We show that, if the objective landscape and noise possess certain properties reminiscent of deep learning problems, then we can obtain global convergence guarantees of the first type under assumptions of the second type, for a fixed (small, but positive) learning rate. The convergence is exponential, but with a large random coefficient. If the learning rate exceeds a certain threshold, we discuss minimum selection by studying the invariant distribution of a continuous-time SGD model. We show that at a critical threshold, SGD prefers minimizers where the objective function is 'flat' in a precise sense.
-
10:35 am - 12:30 pm EDT: Lunch/Free Time (Virtual)
-
12:30 - 1:15 pm EDT: Discussion of Challenges (Virtual)
-
1:15 - 1:25 pm EDT: Coffee Break (Virtual)
-
1:25 - 1:40 pm EDT: Open Session (Virtual)
-
1:40 - 2:35 pm EDT: Coffee Break (Virtual)
-
2:35 - 3:00 pm EDT: Controlled Rough Paths Revisited (Virtual)
- Bruce Driver, University of California, San Diego
Abstract
In this talk, I will discuss some of the details needed for the theory of controlled rough paths in the infinite-dimensional Banach space setting. A key point will be that one may use truncated signatures of smooth curves as "generating functions" of the algebraic identities needed to make the theory work. This is a report on work in progress.
-
3:00 - 3:25 pm EDT: Exact Sampling of Stochastic Differential Equations (Virtual)
- Jose Blanchet, Columbia University and Stanford University
Abstract
We consider the problem of generating exact samples, at finitely many time points, from the solution of a generic multidimensional stochastic differential equation (SDE) driven by Brownian motion. If the SDE can be transformed into one with a constant diffusion coefficient and a gradient drift, exact samples can be obtained by sequential acceptance/rejection. In this talk, we will explain how to use the theory of rough paths to obtain such exact samples in the general case. This is the first generic algorithm for exact sampling of generic multivariate diffusions.
-
3:30 - 4:00 pm EDT: Closing Remarks (Virtual)
All event times are listed in ICERM local time in Providence, RI (Eastern Daylight Time / UTC-4).