Organizing Committee
Abstract

The goal of this workshop is to bring together mathematicians and data scientists to participate in a discussion of current methods and outstanding problems in data science. The workshop is particularly aimed at mathematicians interested in pursuing research or a career in data science who wish to gain an understanding of this rapidly evolving field and the ways in which mathematics can contribute.

Researchers currently working in data science are also encouraged to attend, to share ideas about mathematical methodologies and challenges. A number of experienced data scientists with a variety of backgrounds from academia, national laboratories, and industry (including startups) will be invited. The program will include overview and technical talks, several panels consisting of practitioners with different experience levels, and one or more poster sessions.


The image to the right illustrates the BTER (block two-level Erdős-Rényi) graph model. The nodes are color-coded: darker nodes are of higher degree. The blue edges correspond to highly connected affinity blocks, and the green edges to “random” connections. Image created by Nurcan Durak and provided courtesy of Tamara Kolda, based on work at Sandia National Laboratories*.

* Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.

Career Panelists
  • June Andrews
    LinkedIn
  • Justin Basilico
    Netflix
  • Tom LaGatta
    Splunk
  • Randall LeVeque
    University of Washington
  • Jake VanderPlas
    University of Washington
  • Bobbie-Jo Webb-Robertson
    Pacific Northwest National Laboratory
Math in Data Science Panel
  • Justin Basilico
    Netflix
  • Susan Holmes
    Stanford University
  • Xiaoming Huo
    National Science Foundation/Georgia Tech
  • Peter Jones
    Yale University
  • Tamara Kolda
    Sandia National Laboratories
  • Linda Ness
    Applied Communication Sciences
  • Randall LeVeque
    University of Washington
  • Amit Singer
    Princeton University
  • Yi-Qiao Song
    Schlumberger-Doll Research Center

Confirmed Speakers & Participants

Workshop Schedule

Tuesday, July 28, 2015
Time | Event | Location
8:30 - 8:50 | Registration | 11th Floor Collaborative Space
8:50 - 9:00 | Welcome and Introductory Remarks - ICERM Director, Program Organizers | 11th Floor Lecture Hall
9:00 - 9:35 | The multi-facets of a data science project to answer: how are organs formed? - Bin Yu, University of California, Berkeley | 11th Floor Lecture Hall
9:45 - 10:20 | 3D Structure Determination using Cryo-Electron Microscopy - Computational Challenges - Amit Singer, Princeton University | 11th Floor Lecture Hall
10:20 - 10:50 | Coffee/Tea Break | 11th Floor Collaborative Space
10:50 - 11:25 | Diamond Sampling for Approximate Maximum All-pairs Dot-product (MAD) Search (*) - Tammy Kolda, Sandia National Laboratories | 11th Floor Lecture Hall
11:35 - 12:10 | Big Data Visual Analysis - Chris Johnson, University of Utah | 11th Floor Lecture Hall
12:10 - 1:40 | Break for Lunch
1:40 - 2:15 | Product Formalisms for Measures on Spaces with Binary Tree Structures: Representation, Visualization, Inference, Decision and Application - Linda Ness, Applied Communication Sciences | 11th Floor Lecture Hall
2:25 - 3:00 | Feature Generation for Drug Discovery Learning - Anthony Bak, Ayasdi, Inc. | 11th Floor Lecture Hall
3:00 - 3:40 | Coffee/Tea Break | 11th Floor Collaborative Space
3:40 - 5:00 | Lightning Talks | 11th Floor Lecture Hall
5:00 - 6:30 | Welcome Reception | 11th Floor Collaborative Space
Wednesday, July 29, 2015
Time | Event | Location
8:50 - 9:00 | Introductory Remarks - Program Organizers | 11th Floor Lecture Hall
9:00 - 9:35 | The Challenges of Heterogeneous Data - Susan Holmes, Stanford University | 11th Floor Lecture Hall
9:45 - 10:30 | Multiscale Methods for Positive Data and Noise - Peter Jones, Yale University | 11th Floor Lecture Hall
10:30 - 10:50 | Coffee/Tea Break | 11th Floor Collaborative Space
10:50 - 11:25 | Data Science @ The New York Times - Chris Wiggins, Columbia University | 11th Floor Lecture Hall
11:35 - 12:10 | Study of diffusion dynamics from multi-point correlation functions - Yi-Qiao Song, Schlumberger-Doll Research | 11th Floor Lecture Hall
12:10 - 12:20 | Group Photo
12:20 - 1:40 | Break for Lunch
1:40 - 2:15 | The Decade of Linearity: How ax plus b transformed Search, Jobs, and Health - June Andrews, Noom | 11th Floor Lecture Hall
2:25 - 3:00 | Structured Regression in Evolving Health Networks - Zoran Obradovic, Temple University | 11th Floor Lecture Hall
3:00 - 3:30 | Coffee/Tea Break | 11th Floor Collaborative Space
3:30 - 4:05 | Lightning Talks | 11th Floor Lecture Hall
4:15 - 5:15 | Career Panel | 11th Floor Lecture Hall
5:15 - 6:30 | Poster Session | 11th Floor Lecture Hall and Collaborative Space
Thursday, July 30, 2015
Time | Event | Location
8:50 - 9:00 | Introductory Remarks - Program Organizers | 11th Floor Lecture Hall
9:00 - 9:35 | Personalized Page Generation using Data, Science, and Algorithms - Justin Basilico, Netflix | 11th Floor Lecture Hall
9:45 - 10:20 | Searching for Structure in Network Science - Blair Sullivan, North Carolina State University | 11th Floor Lecture Hall
10:20 - 10:50 | Coffee/Tea Break | 11th Floor Collaborative Space
10:50 - 11:25 | Fast Steerable Principal Component Analysis - Jane Zhao, New York University | 11th Floor Lecture Hall
11:35 - 12:10 | Mathematics in Data Science Panel | 11th Floor Lecture Hall
12:10 - 1:40 | Break for Lunch
1:40 - 2:15 | A project in the life of a data scientist - Janine Bennett, Sandia National Laboratories | 11th Floor Lecture Hall
2:25 - 3:00 | Thermostatic Controls for Noisy Gradient Systems and Applications to Machine Learning - Ben Leimkuhler, University of Edinburgh | 11th Floor Lecture Hall
3:00 - 3:30 | Coffee/Tea Break | 11th Floor Collaborative Space
3:30 - 4:05 | Scalable Bayes via Barycenter in Wasserstein Space - David Dunson, Duke University | 11th Floor Lecture Hall

Lecture Videos

Searching for Structure in Network Science

Blair Sullivan
North Carolina State University
July 30, 2015

Data Science at The New York Times

Chris Wiggins
Columbia University
July 29, 2015

The Challenges of Heterogeneous Data

Susan Holmes
Stanford University
July 29, 2015