September 19, 2022

Machine Learning and Data Analytics Symposium – MLDAS 2022

Date: 17-18 October, 2022

Location: Cambridge, Massachusetts at the Boeing Aerospace and Autonomy Center on the MIT Campus.

Introduction

The purpose of this symposium is to bring together researchers, practitioners, students, and industry experts in the fields of machine learning, data mining, and related areas to present recent advances, discuss open research questions, and bridge the gap between data analytics research and industry needs on certain concrete problems.

The MLDAS symposium will serve as a platform for exchange of ideas, identification of important and challenging applications, and discovery of possible synergies.

The central topics of MLDAS 2022 will be Robust Learned Models and AI Systems, AI Methods for Design, and Machine Learning for Soccer.

We will address the topics of interest through both invited and contributed talks describing (1) research ideas, (2) new challenges, (3) mature research and (4) practical results. The symposium program will consist of presentations by invited speakers from both academia and industry, and by the authors of the papers submitted to the symposium. In addition, we will have multiple panel discussions to debate important research and strategic problems and applications.

Registration

Registration for MLDAS is free, but all attendees are required to register for either in-person or virtual participation.

To register, please send an email to registration@mldas.org

The email should include your preference for in-person or virtual participation, name, affiliation (institution and address), contact email and contact telephone number, website (if available), and a brief statement of interest describing your reasons for attending the symposium. For example: “I would like to attend MLDAS because my interests are in machine learning and bioinformatics and I would like to apply machine learning techniques to the data that we have available.”

The organizers will send you a confirmation message for your registration.

Organization

MLDAS is organized by the Qatar Computing Research Institute (QCRI) and by The Boeing Company.

Chairs

MLDAS is co-chaired by Sanjay Chawla (QCRI) and Dragos Margineantu (Boeing Research & Technology).
Please click here to contact the MLDAS co-chairs.

Local Chairs

Blake Edwards, Boeing Research & Technology

Registration Chairs

Amani AlOnazi, Boeing Research & Technology

Webmaster

Any questions regarding the organization or the schedule of the symposium should be addressed to the organizers of the symposium.

Any questions regarding local matters (such as visas, lodging, or general information about Cambridge) should be addressed to the local chair.

Agenda

For more information on the agenda of MLDAS 2022, please follow this link.

Talks

  • Welcome and Opening Remarks
    • Jinnah Hosein | Boeing Senior VP Software Engineering
    • Ashraf Aboulnaga | QCRI Chief Scientist

 

  • Title and abstract following soon
    • Kilian Weinberger | Cornell University

 

  • Title and abstract following soon
    • Martin Rinard | MIT

 

  • Towards verifying neural-symbolic multi-agent systems. Abstract

    A challenge in the deployment of multi-agent systems (MAS)
    remains the inherent difficulty of predicting with confidence their
    run-time behaviour. Over the past twenty years, increasingly scalable
    verification methods, including model checking and parameterised
    verification, have enabled the validation of several classes of MAS
    against AI-based specifications, and several MAS applications in
    services, robotics, security, and beyond.

    Yet, a new class of agents is emerging in applications. Differently from
    traditional MAS, which are typically directly programmed (and less often
    purely neural), they combine both connectionist and symbolic aspects. We
    will refer to these as neural-symbolic MAS. These agents include a
    neural layer, often implementing a perception function, and symbolic or
    control-based layers, typically realising decision making and planning.
    Implementations of neural-symbolic agents permeate many present and
    forthcoming AI applications, including autonomous vehicles and robotics.
    Due to the neural layer, as well as their heterogeneity, verifying the
    behaviours of neural-symbolic MAS is particularly challenging. Yet, I
    will argue that, given the safety-critical applications they are used
    in, methods and tools to address their formal verification should be
    developed.

    In this talk I will share some of the contributions on this topic
    developed at the Verification of Autonomous Systems Lab at Imperial
    College London. I will begin by describing traditional approaches for
    the verification of symbolic MAS, and parameterised verification to
    address arbitrary collections of agents such as swarms. I will then
    summarise our present efforts on verification of neural perception
    systems, including MILP-based approaches, linear relaxations, and
    symbolic interval propagation, introduce our resulting toolkits, Venus
    and Verinet, and exemplify their use.

    This will lead us to existing methods for closed-loop, neural-symbolic
    MAS. In this context, I will share existing results that enable us to
    perform reachability analysis, and verify systems against bounded
    temporal specifications and ATL.

    I will conclude by highlighting some of the many challenges that lie ahead.

    • Alessio Lomuscio | Imperial College London

 

  • Title and abstract following soon
    • John Tylko | Aurora Flight Sciences

 

  • Deep Generative Models for Configuration Design. Abstract

    At the simplest level, the design process brings user requirements to a notion of form and function. However, the design process is filled with endless possibilities. It’s a daunting task for human designers, which often leads to hasty decisions. Optimization is tantalizing at this stage. However, premature optimization can exacerbate these hasty decisions. Introducing generative models into the design process can allow designers to further explore the design space prior to optimization and make better decisions. In this talk, I will explain the configuration design process, show how to use AI for configuration design, and show results from various domains.

    • Emilio Botero | Stanford University

 

  • Evaluating shot decision-making in soccer. Abstract

    Machine learning has had a big impact on how soccer is being played and evaluated. Perhaps the most prominent example is the Expected Goals (xG) metric, which measures the quality of shots by calculating the likelihood of each shot resulting in a goal. Off the pitch, this metric is reported in the media, mentioned by managers, and integrated into video games. On the pitch, this metric has driven a change in shooting behavior over the years: the total number of shots has decreased, and teams are passing up on long-distance shots, which are of lower quality, in the hopes of generating a higher-quality shot closer to goal later on. This change in behavior might not be ideal, however, as the reward teams get from a higher-quality shot may not outweigh the risk of losing possession along the way. Yet assessing such risk-reward trade-offs is challenging, and doing so requires a combination of learning and reasoning. First, one must learn a model of the team’s behavior from in-game data, which already poses various challenges. We propose to model the behavior of teams using a Markov Decision Process where the probabilities are learned using a combination of predictive modeling and a hierarchical Bayesian approach. Second, one needs to reason about the modeled behavior in order to assess the possible outcomes of alternative, observed decisions. We propose to use probabilistic verification as well as analytical techniques to reason about both the modeled and counterfactual behavior of teams. Using this framework, our key conclusion regarding long-distance shots is that teams are indeed overcompensating: teams would score more goals if they shot more often from outside the penalty box in a small number of team-specific locations.

    • Maaike Van Roy | KU Leuven

 

  • Edge-augmented Graph Transformers: Global Self-attention as a Replacement for Graph Convolution. Abstract

    I’ll talk about our recent work on the Edge-augmented Graph Transformer
    (EGT) model for general-purpose graph learning by adding a dedicated pathway
    for pairwise structural information, called edge channels. EGT can directly
    accept, process and output structural information of arbitrary form, which
    is important for effective learning on graph-structured data. Our model
    exclusively uses global self-attention as an aggregation mechanism rather
    than static localized convolutional aggregation. This allows for
    unconstrained long-range dynamic interactions between nodes. Moreover, the
    edge channels allow the structural information to evolve from layer to
    layer, and prediction tasks on edges/links can be performed directly from
    the output embeddings of these channels. We verify the performance of EGT in
    a wide range of graph-learning experiments on benchmark datasets, in which
    it outperforms Convolutional/Message-Passing Graph Neural Networks. EGT set
    a new state-of-the-art for the quantum-chemical regression task on the
    OGB-LSC PCQM4Mv2 dataset containing 3.8 million molecular graphs. Our
    findings indicate that global self-attention based aggregation can serve as
    a flexible, adaptive and effective replacement of graph convolution for
    general-purpose graph learning.

    • Mohammed Zaki | Rensselaer Polytechnic Institute

 

  • Title and abstract following soon
    • Elias Bareinboim | Columbia University

 

  • Title and abstract following soon
    • Sungkweon Hong | MIT

 

  • Title and abstract following soon
    • Jonathan How | MIT

 

  • Learning for multi-robot path finding. Abstract

    Using a group of robots in place of a single robot to accomplish a complex task has many benefits such as redundancy, robustness, faster completion times, and the ability to be everywhere at once. The applications of such systems are wide and varied: Imagine teams of robots containing forest fires, filling urban skies with package deliveries, or searching for survivors after a natural disaster. These applications have been motivating multirobot research for years, but why aren’t they happening yet? These missions require solutions that can manage complexity as the number of robots increases, sharing what amounts to a shrinking space.

    Finding collision-free paths for robots in shared environments is called the multi-agent path finding problem (MAPF), which is known to be NP-hard. In this talk I will present strategies my research group has developed that use machine learning tools to understand what makes some MAPF instances hard, and how we can use this information to make informed choices that can save time and energy.

    • Nora Ayanian | Brown University

 

  • Intrinsic images, equivariance, variance and relighting. Abstract

    An intrinsic image is a map of world properties that is image aligned. In recent years, the term has been used to refer to albedo maps, but originally it applied to many different types of map. I will describe a method to estimate albedo using spatial models, rather like those of Retinex. But this method has an equivariance problem: if one estimates albedo for two distinct crops of an image, the albedo estimates in overlapping regions do not agree. This is unacceptable behavior. I show that an averaging procedure will fix this failure. The result is a method that is the strongest current unsupervised method using current evaluation procedures. But there is a problem: the method reports different albedos for different lightings of the same scene. This is a more interesting failure of equivariance. I show that this problem can largely be cured by simulating lightings of the scene, and averaging over simulated lightings. I show that crop equivariance failures occur in very strong current normal and depth estimation procedures, as do relighting equivariance failures. Finally, I show that averaging over relightings improves the behavior of very strong current depth and normal estimators. All this points to a key question: How can we easily relight images of scenes without requiring inverse graphics datasets? I describe some progress toward resolving that question.

    • David Forsyth | University of Illinois, Urbana-Champaign

 

  • Learning Structure from Data. Abstract

    While deep learning has achieved huge success across different disciplines, training such models is known to require significant amounts of data. One possible reason is that structural properties of the data and problem are not modeled explicitly. Effectively exploiting the structure can help build more efficient and better-performing models. The complexity of the structure requires models with enough representation capabilities. However, increased structured model complexity usually leads to increased inference complexity. In this talk, I will present two inference techniques for more expressive structured models, i.e., models with inference procedures that can handle complex dependencies between variables efficiently. In the first part, I will discuss extending Gaussian conditional random fields, traditionally unimodal and only capturing pairwise variable interactions, to model multi-modal distributions with high-order dependencies between the output space variables, while enabling exact inference and incorporating external constraints at runtime, for the task of diverse gray-image colorization. In the second part of the talk, I will present a reinforcement learning-based method for solving inference in models with general higher-order potentials that are intractable with traditional techniques, for the task of semantic segmentation.

    • Safa Messaoud | Qatar Computing Research Institute

 

  • Pushing the limits of small object detection. Abstract

    Detecting small objects is challenging and the challenge grows bigger as the objects grow smaller. Below a certain resolution objects become effectively undetectable with standard object detectors. We are interested in pushing the limits of detectability for the smallest objects.
    When spatial resolution is low, sometimes temporal information is useful. If the camera moves, to benefit from temporal change, some alignment process is needed. We need sub-pixel alignment which has its own challenges. In this talk I will take you through the challenges and solutions towards detecting the smallest objects.

    • Amin Sadeghi | Qatar Computing Research Institute

Speakers

    Nora Ayanian

    Professor, Brown University

    Elias Bareinboim 

    Professor, Columbia University

     

    David Forsyth

    Professor, University of Illinois, Urbana-Champaign

     

     

    Jonathan How

    Professor, MIT

     

    Alessio Lomuscio

    Professor, Imperial College

     

    BIO

    Alessio Lomuscio is Professor of Safe Artificial Intelligence in the Department of Computing at Imperial College London (UK), where he leads the Verification of Autonomous Systems Lab. He serves as Deputy Director for the UKRI Doctoral Training Centre in Safe and Trusted Artificial Intelligence. He is a Distinguished ACM member, a Fellow of the European Association of Artificial Intelligence and currently holds a Royal Academy of Engineering Chair in Emerging Technologies. His research interests concern the development of formal verification methods for artificial intelligence systems. Since 2000 he has worked on the development of formal methods for the verification of autonomous systems and multi-agent systems. To this end, he has put forward several methods based on model checking and various forms of abstraction to verify AI systems, including robotic swarms, against AI-based specifications. He is presently focusing on the development of methods to ascertain the correctness of AI systems based on deep neural networks. He has published approximately 200 papers in AI conferences (including IJCAI, KR, AAAI, AAMAS and ECAI), verification and formal methods conferences (CAV, SEFM, ATVA), and international journals (AIJ, JAIR, ACM ToCL, JAAMAS, Information and Computation). He is an Editorial Board member for AIJ, JAIR, and JAAMAS, and has recently served as general co-chair for AAMAS 2021.

     

    Martin Rinard

    Professor, MIT

    Amin Sadeghi

    Principal Scientist, QCRI

     

    BIO

    Amin Sadeghi is a Scientist at Qatar Computing Research Institute. His background is in Computer Vision and Machine Learning. Prior to joining QCRI, he served as an Assistant Professor at the University of Tehran. Prior to that, he studied Computer Vision at UIUC under David Forsyth. Besides academia, Amin has served as cofounder and CTO of a few successful ML-related companies.

     

    Kilian Weinberger

    Professor, Cornell University

     

    BIO

    Kilian Weinberger is a Professor in the Department of Computer Science at Cornell University. He received his Ph.D. from the University of Pennsylvania in Machine Learning under the supervision of Lawrence Saul and his undergraduate degree in Mathematics and Computing from the University of Oxford. During his career he has won several best paper awards at ICML (2004), CVPR (2004, 2017), AISTATS (2005) and KDD (2014, runner-up award). In 2011 he was awarded the Outstanding AAAI Senior Program Chair Award and in 2012 he received an NSF CAREER award. He was elected co-Program Chair for ICML 2016 and for AAAI 2018 and became president-elect of the ICML society in 2021. In 2016 he was the recipient of the Daniel M Lazar ’29 Excellence in Teaching Award and in 2021 became a Blavatnik National Awards Finalist. Kilian Weinberger’s research focuses on Machine Learning and its applications. In particular, he focuses on learning under resource constraints, metric learning, Gaussian Processes, computer vision and deep learning. Before joining Cornell University, he was an Associate Professor at Washington University in St. Louis and before that he worked as a research scientist at Yahoo! Research in Santa Clara.

     

    John Tylko

    CIO and Co-founder, Aurora Flight Sciences

     

    Mohammed Zaki

    Professor & Chair, Rensselaer Polytechnic Institute

     

    BIO

    Mohammed J. Zaki is a Professor and Department Head of Computer Science at RPI. He received his Ph.D. degree in computer science from the University of Rochester in 1998. His research interests focus on developing novel data mining and machine learning techniques, especially for applications in text mining, bioinformatics and personal health. He has around 300 publications (and 6 patents), including the Data Mining and Machine Learning textbook (2nd Edition, Cambridge University Press, 2020). He has co-chaired all of the major conferences in data mining, and he is co-chairing CIKM’22. He is currently serving on the Board of Directors for ACM SIGKDD. He was a recipient of the National Science Foundation CAREER Award and the Department of Energy Early Career Principal Investigator Award, as well as the HP Innovation Research Award and the Google Faculty Research Award. His research is supported in part by NSF, DARPA, NIH, DOE, IBM, Google, HP, and Nvidia. He is a Fellow of the IEEE and a Fellow of the ACM.

     

    Emilio Botero

    Scientist, Stanford University

     

    BIO

    Emilio Botero is a Postdoctoral Researcher in the Aerospace Design Lab at Stanford University. He obtained his Master’s and PhD from Stanford and his BS from Embry-Riddle Aeronautical University.

     

    Safa Messaoud

    Scientist, QCRI

    BIO

    Safa is a research scientist at Qatar Computing Research Institute (QCRI). Her research is centred around reinforcement learning and computer vision. She is particularly interested in safe reinforcement learning and AI for social good. She has papers published in CVPR, ECCV and ACM SIGIR. Safa graduated with a PhD in Electrical and Computer Engineering from the University of Illinois at Urbana-Champaign under the supervision of Prof. Alexander Schwing. In her PhD thesis, she developed algorithms for more scalable learning and inference in energy-based models with applications in computer vision. Safa obtained a Master’s degree in Computer Engineering from Virginia Tech working with Prof. Sandeep Shukla. She also has Bachelor’s and Master’s degrees in Electrical Engineering and Information Technology from the Technical University of Munich (TUM) and conducted her Master’s thesis research at UC Berkeley, under the supervision of Prof. Alberto Sangiovanni Vincentelli and Prof. Andreas Herkersdorf.

     

    Sungkweon Hong

    MIT

    Maaike Van Roy

    KU Leuven, Belgium

    BIO

    Maaike is a final-year PhD student in the DTAI Sports Analytics Lab at KU Leuven in Belgium. Her research focuses on the intersection of AI and sports, where she applies learning and reasoning techniques to analyze data arising from professional football matches. In particular, she is interested in obtaining insights for providing tactical advice and aiding the in-game decision-making of teams. Her research has attracted attention from practitioners at professional clubs and has been covered by fivethirtyeight.com, the New York Times, and ESPN.com. She has recently been awarded the Routledge Young Researcher Award at the 2022 IACSS ISPAS Joint Congress. Outside of research, she can often be found on the pitch herself.