Organizing Committee
Abstract

The goal of this workshop is to bring together mathematicians and data scientists to participate in a discussion of current methods and outstanding problems in data science. The workshop is particularly aimed at mathematicians interested in pursuing research or a career in data science who wish to gain an understanding of this rapidly evolving field and the ways in which mathematics can contribute.

Researchers currently working in data science are also encouraged to attend and to share ideas about mathematical methodologies and challenges. A number of experienced data scientists with a variety of backgrounds in academia, national laboratories, and industry (including startups) will be invited. The program will include overview and technical talks, several panels of practitioners with different levels of experience, and one or more poster sessions.

The image to the right illustrates the BTER (block two-level Erdős-Rényi) graph model. The nodes are color-coded: darker nodes have higher degree. The blue edges correspond to highly connected affinity blocks, and the green edges to “random” connections. Image created by Nurcan Durak and provided courtesy of Tamara Kolda, based on work at Sandia National Laboratories*.

* Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.

Career Panelists
  • June Andrews
    LinkedIn
  • Justin Basilico
    Netflix
  • Tom LaGatta
    Splunk
  • Randall LeVeque
    University of Washington
  • Jake VanderPlas
    University of Washington
  • Bobbie-Jo Webb-Robertson
    Pacific Northwest National Laboratory
Math in Data Science Panel
  • Justin Basilico
    Netflix
  • Susan Holmes
    Stanford University
  • Xiaoming Huo
    National Science Foundation/Georgia Tech
  • Peter Jones
    Yale University
  • Tamara Kolda
    Sandia National Laboratories
  • Linda Ness
    Applied Communication Sciences
  • Randall LeVeque
    University of Washington
  • Amit Singer
    Princeton University
  • Yi-Qiao Song
    Schlumberger-Doll Research Center

Confirmed Speakers & Participants

Talks will be presented virtually or in-person as indicated in the schedule below.

Workshop Schedule

Tuesday, July 28, 2015
Time | Event | Location
8:30 - 8:50am EDT | Registration | 11th Floor Collaborative Space
8:50 - 9:00am EDT | Welcome and Introductory Remarks - ICERM Director, Program Organizers | 11th Floor Lecture Hall
9:00 - 9:35am EDT | The multi-facets of a data science project to answer: how are organs formed? - Bin Yu, University of California, Berkeley | 11th Floor Lecture Hall
9:45 - 10:20am EDT | 3D Structure Determination using Cryo-Electron Microscopy - Computational Challenges - Amit Singer, Princeton University | 11th Floor Lecture Hall
10:20 - 10:50am EDT | Coffee/Tea Break | 11th Floor Collaborative Space
10:50 - 11:25am EDT | Diamond Sampling for Approximate Maximum All-pairs Dot-product (MAD) Search (*) - Tammy Kolda, Sandia National Laboratories | 11th Floor Lecture Hall
11:35 - 12:10pm EDT | Big Data Visual Analysis - Chris Johnson, University of Utah | 11th Floor Lecture Hall
12:10 - 1:40pm EDT | Break for Lunch
1:40 - 2:15pm EDT | Product Formalisms for Measures on Spaces with Binary Tree Structures: Representation, Visualization, Inference, Decision and Application - Linda Ness, Applied Communication Sciences | 11th Floor Lecture Hall
2:25 - 3:00pm EDT | Feature Generation for Drug Discovery Learning - Anthony Bak, Ayasdi, Inc. | 11th Floor Lecture Hall
3:00 - 3:40pm EDT | Coffee/Tea Break | 11th Floor Collaborative Space
3:40 - 5:00pm EDT | Lightning Talks | 11th Floor Lecture Hall
5:00 - 6:30pm EDT | Welcome Reception | 11th Floor Collaborative Space
Wednesday, July 29, 2015
Time | Event | Location
8:50 - 9:00am EDT | Introductory Remarks - Program Organizers | 11th Floor Lecture Hall
9:00 - 9:35am EDT | The Challenges of Heterogeneous Data - Susan Holmes, Stanford University | 11th Floor Lecture Hall
9:45 - 10:30am EDT | Multiscale Methods for Positive Data and Noise - Peter Jones, Yale University | 11th Floor Lecture Hall
10:30 - 10:50am EDT | Coffee/Tea Break | 11th Floor Collaborative Space
10:50 - 11:25am EDT | Data Science @ The New York Times - Chris Wiggins, Columbia University | 11th Floor Lecture Hall
11:35 - 12:10pm EDT | Study of diffusion dynamics from multi-point correlation functions - Yi-Qiao Song, Schlumberger-Doll Research | 11th Floor Lecture Hall
12:10 - 12:20pm EDT | Group Photo
12:20 - 1:40pm EDT | Break for Lunch
1:40 - 2:15pm EDT | The Decade of Linearity: How ax plus b transformed Search, Jobs, and Health - June Andrews, Noom | 11th Floor Lecture Hall
2:25 - 3:00pm EDT | Structured Regression in Evolving Health Networks - Zoran Obradovich, Temple University | 11th Floor Lecture Hall
3:00 - 3:30pm EDT | Coffee/Tea Break | 11th Floor Collaborative Space
3:30 - 4:05pm EDT | Lightning Talks | 11th Floor Lecture Hall
4:15 - 5:15pm EDT | Career Panel | 11th Floor Lecture Hall
5:15 - 6:30pm EDT | Poster Session | 11th Floor Lecture Hall and Collaborative Space
Thursday, July 30, 2015
Time | Event | Location
8:50 - 9:00am EDT | Introductory Remarks - Program Organizers | 11th Floor Lecture Hall
9:00 - 9:35am EDT | Personalized Page Generation using Data, Science, and Algorithms - Justin Basilico, Netflix | 11th Floor Lecture Hall
9:45 - 10:20am EDT | Searching for Structure in Network Science - Blair Sullivan, North Carolina State University | 11th Floor Lecture Hall
10:20 - 10:50am EDT | Coffee/Tea Break | 11th Floor Collaborative Space
10:50 - 11:25am EDT | Fast Steerable Principal Component Analysis - Jane Zhao, New York University | 11th Floor Lecture Hall
11:35 - 12:10pm EDT | Mathematics in Data Science Panel | 11th Floor Lecture Hall
12:10 - 1:40pm EDT | Break for Lunch
1:40 - 2:15pm EDT | A project in the life of a data scientist - Janine Bennett, Sandia National Laboratories | 11th Floor Lecture Hall
2:25 - 3:00pm EDT | Thermostatic Controls for Noisy Gradient Systems and Applications to Machine Learning - Ben Leimkuhler, University of Edinburgh | 11th Floor Lecture Hall
3:00 - 3:30pm EDT | Coffee/Tea Break | 11th Floor Collaborative Space
3:30 - 4:05pm EDT | Scalable Bayes via Barycenter in Wasserstein Space - David Dunson, Duke University | 11th Floor Lecture Hall

Lecture Videos

Searching for Structure in Network Science

Blair Sullivan
North Carolina State University
July 30, 2015

Data Science at The New York Times

Chris Wiggins
New York Times
July 29, 2015

The Challenges of Heterogeneous Data

Susan Holmes
Stanford University
July 29, 2015