Organizing Committee

Building systems that can understand visual concepts and describe them coherently in natural language is fundamental to artificial intelligence. Advances in machine learning have had a profound impact on computer vision and natural language processing. There has been interesting progress in recent years at the intersection of these two fields, producing systems that describe (e.g., caption) images and videos captured by personal cameras in ordinary scenes and street views. Much work remains on this and a host of related problems, including that of building natural language descriptions of commercial overhead imagery and videos, where automation is greatly needed: "If we were to attempt to manually exploit the commercial satellite imagery we expect to have over the next 20 years, we would need eight million imagery analysts" [Robert Cardillo, NGA Director, GEOINT Symposium 2017]. This workshop brings together researchers in machine learning, computer vision, and natural language processing to discuss best practices in machine-generated descriptions for both consumer and overhead imagery and videos. Participants will identify challenges and recommend future research topics.

Image for "Image Description for Consumer and Overhead Imagery"

Confirmed Speakers & Participants

Talks will be presented virtually or in person, as indicated in the schedule below.


Workshop Schedule

Monday, February 25, 2019
8:30 - 8:55am EST | Registration - ICERM, 121 South Main Street, Providence, RI 02903 | 11th Floor Collaborative Space
8:55 - 9:00am EST | Welcome - ICERM Director | 11th Floor Lecture Hall
9:00 - 9:45am EST | Data Programming for Imaging Problems - Leveraging Domain Expertise, Auxiliary Modalities, and Multiple Tasks - Jared Dunnmon, Stanford University | 11th Floor Lecture Hall
10:00 - 10:30am EST | Coffee/Tea Break | 11th Floor Collaborative Space
10:30 - 11:15am EST | Jointly Generating Image Captions to Aid Visual Question Answering - Raymond Mooney, The University of Texas at Austin | 11th Floor Lecture Hall
11:30am - 12:15pm EST | Uncertainty quantification in graph-based classification of high dimensional data - Andrea Bertozzi, UCLA | 11th Floor Lecture Hall
12:30 - 2:30pm EST | Break for Lunch / Free Time
2:30 - 3:15pm EST | Geometric Deep Learning for Monocular Object Orientation Estimation - Rene Vidal, Johns Hopkins University | 11th Floor Lecture Hall
3:30 - 4:00pm EST | Coffee/Tea Break | 11th Floor Collaborative Space
4:00 - 4:45pm EST | Medical Image Report Generation and Beyond - Zhiting Hu, Carnegie Mellon University | 11th Floor Lecture Hall
5:00 - 6:30pm EST | Welcome Reception | 11th Floor Collaborative Space
Tuesday, February 26, 2019
9:00 - 9:45am EST | Adding Geometry to Learning - Guillermo Sapiro, Duke University | 11th Floor Lecture Hall
10:00 - 10:30am EST | Coffee/Tea Break | 11th Floor Collaborative Space
10:30 - 11:15am EST | Say What You Want to Find - Query-based Search in Images and Video - Kate Saenko, Boston University | 11th Floor Lecture Hall
11:30 - 11:40am EST | Group Photo | 11th Floor Lecture Hall
11:40am - 1:30pm EST | Break for Lunch / Free Time
1:30 - 2:15pm EST | View-Invariant Change Captioning - Trevor Darrell, University of California, Berkeley | 11th Floor Lecture Hall
2:30 - 3:15pm EST | Understanding hyperspectral images of planetary surfaces - Mario Parente, University of Massachusetts Amherst | 11th Floor Lecture Hall
3:30 - 4:00pm EST | Coffee/Tea Break | 11th Floor Collaborative Space
4:00 - 4:45pm EST | Explainable Query Refinement and Overhead Imagery Analytics - Arslan Basharat, Kitware | 11th Floor Lecture Hall

Associated Semester Workshops

Computer Vision
Algebraic Vision Research Cluster
Computational Imaging
Introduction to the ANTs Ecosystem