

Timezone: America/Los_Angeles

Tutorial: Toshiyuki Ohtsuka

Real-Time Optimization for Fast and Complex Control Systems

Control systems are mechanisms that enable dynamical systems, such as automobiles, robots, and manufacturing processes, to realize desirable behaviors; although invisible, they are often essential to our daily lives. Control engineering involves the analysis and design of control systems, and optimal control is one of the central problems in the field. In an optimal control problem, the control input is chosen to minimize a cost function subject to certain constraints. Even when a mathematical model of the controlled system is known, finding its optimal control input is generally difficult owing to heavy computation or data-storage requirements, and the development of efficient algorithms for optimal control problems has been an active area of research for several decades. Realizing optimal control of dynamical systems by adaptation or learning is challenging when their mathematical models are unknown; developing practical optimal control methods for unknown dynamical systems is thus a challenge for both control engineering and machine learning. Control systems therefore provide ample motivation and opportunity for machine learning.

This tutorial aims to help researchers and engineers in the field of machine learning tackle problems in control systems. An overview of the problems and concepts in control engineering is given first, and the specific benefits of control methods without learning are outlined; the primary focus is on model predictive control (MPC) based on real-time optimization, which has developed rapidly in recent years. MPC can address various control problems beyond traditional objectives such as regulation and tracking, and it is applicable to a wide class of dynamical systems whenever real-time optimization is feasible. Typical applications of MPC include mechanical systems based on detailed nonlinear models, such as drones, automobiles, and robots, with sampling periods on the order of milliseconds. Moreover, MPC enables optimal control performance and is often used as a reference for learning-based control methods. Against the backdrop of these achievements, the tutorial will discuss ideas and methodologies from control engineering that can prove beneficial to machine learning.
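For concreteness, a standard finite-horizon problem of the kind solved at every sampling instant in MPC (a textbook formulation, not necessarily the exact one used in the tutorial) is

$$\min_{u_0,\dots,u_{N-1}} \; \sum_{k=0}^{N-1} \ell(x_k, u_k) + V_f(x_N) \quad \text{subject to} \quad x_{k+1} = f(x_k, u_k), \; x_0 = x_{\mathrm{now}}, \; g(x_k, u_k) \le 0,$$

where $f$ is the system model, $\ell$ a stage cost, $V_f$ a terminal cost, and $g$ the constraints; only the first input $u_0$ is applied before the problem is re-solved at the next sampling instant with the newly measured state.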

Toshiyuki Ohtsuka received his B.Eng., M.Eng., and D.Eng. from Tokyo Metropolitan Institute of Technology, Japan, in 1990, 1992, and 1995, respectively. From 1995 to 1999, he worked as an Assistant Professor at the University of Tsukuba. In 1999, he joined Osaka University, where he was a Professor at the Graduate School of Engineering Science from 2007 to 2013. In 2013, he joined Kyoto University as a Professor at the Graduate School of Informatics. His research interests include nonlinear control theory and real-time optimization, with applications to aerospace and mechanical engineering. He is a member of SICE, ISCIE, and JSASS and a senior member of IEEE and AIAA.



Tutorial: Irina Higgins · Antonia Creswell · Sébastien Racanière

Pay Attention to What You Need: Do Structural Priors Still Matter in the Age of Billion Parameter Models?

The last few years have seen the emergence of billion-parameter models trained on 'infinite' data that achieve impressive performance on many tasks, suggesting that big data and big models may be all we need. But how far can this approach take us, particularly in domains where data is more limited? In many situations, adding structured architectural priors to models may be key to achieving faster learning, better generalisation, and learning from less data. Structure can be added at the level of perception and at the level of reasoning (the goal of GOFAI research). In this tutorial we will use the ideas of symmetries and symbolic reasoning as an overarching theoretical framework to describe many of the common structural priors that have been successful in the past for building more data-efficient and generalisable perceptual models, as well as models that support better reasoning in neuro-symbolic approaches.
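One common way to formalize the symmetry priors in question: a model $f$ is equivariant to a group $G$ acting on its inputs and outputs if

$$f(g \cdot x) = g \cdot f(x) \quad \text{for all } g \in G,$$

and invariant when the output action is trivial, $f(g \cdot x) = f(x)$. Translation equivariance of convolutional layers is the classic example: shifting an image shifts its feature maps correspondingly.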

Irina Higgins is a Staff Research Scientist at DeepMind, where she works in the Frontiers team. Her work aims to bring together insights from the fields of neuroscience and physics to advance general artificial intelligence through improved representation learning. Before joining DeepMind, Irina was a British Psychological Society Undergraduate Award winner for her achievements as an undergraduate student in Experimental Psychology at Westminster University, followed by a DPhil at the Oxford Centre for Computational Neuroscience and Artificial Intelligence, where she focused on understanding the computational principles underlying speech processing in the auditory brain. During her DPhil, Irina also worked on developing poker AI, applied machine learning in the finance sector, and worked on speech recognition at Google Research.
Antonia Creswell is a Senior Research Scientist at DeepMind in the Cognition team. Her work focuses on the learning and integration of object representations in dynamic models. She completed her PhD on representation learning at Imperial College London in the department of Bioengineering.
Sébastien Racanière is a Staff Research Engineer at DeepMind. His current interests in ML revolve around the interaction between physics and machine learning, with an emphasis on the use of symmetries. He received his PhD in pure mathematics from the Université Louis Pasteur, Strasbourg, in 2002, co-supervised by Michèle Audin (Strasbourg) and Frances Kirwan (Oxford). This was followed by a two-year Marie Curie Individual Fellowship at Imperial College London and another postdoc in Cambridge (UK). His first industry job was at the Samsung European Research Institute, investigating the use of learning algorithms in mobile phones, followed by UGS, a Cambridge-based company, where he worked on a 3D search engine. He afterwards worked for Maxeler in London, programming FPGAs. He then moved to Google, and finally DeepMind.



Tutorial: Maria Schuld · Juan Carrasquilla

Machine Learning With Quantum Computers

Quantum computing, a discipline that investigates how computing changes if we take quantum effects into account, has turned into an emerging technology that has produced the first generation of hardware prototypes. In search of applications for these new devices, researchers have turned to machine learning and found a wealth of exciting questions: Do machine learning algorithms gain a better computational complexity if we outsource parts of them to quantum computers? How does the problem of empirical risk minimisation change if our model class is made up of quantum algorithms? How does quantum hardware fit into AI pipelines? And, vice versa, can machine learning help us study the behaviour of quantum systems?

In this tutorial we want to unpack these questions and sketch the landscape of preliminary answers found so far. For example, we will look at carefully constructed learning problems for which quantum computers have a provable complexity advantage, and motivate why it is so hard to make conclusive statements about more natural problem settings. We will explore how data can be represented as physical states of quantum systems, and how manipulating these systems leads to algorithms that are just kernel methods with a special kind of Hilbert space. We will see that quantum devices can be trained like neural networks, and that existing open-source software seamlessly integrates them into deep learning pipelines. Finally, we will understand how the deep connections between neural networks and quantum wave functions allow us to use machine learning techniques to understand quantum systems themselves.
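As a minimal sketch of the "train a quantum device like a neural network" idea, here is a tiny variational circuit written with the PennyLane framework (which one of the presenters co-develops); the circuit layout and data are illustrative assumptions, not examples from the tutorial:

    # Illustrative variational circuit; structure and data are assumptions.
    import pennylane as qml
    from pennylane import numpy as np

    dev = qml.device("default.qubit", wires=2)  # a 2-qubit simulator

    @qml.qnode(dev)
    def circuit(weights, x):
        qml.RX(x[0], wires=0)            # encode classical data as rotations
        qml.RX(x[1], wires=1)
        qml.RY(weights[0], wires=0)      # trainable layer
        qml.RY(weights[1], wires=1)
        qml.CNOT(wires=[0, 1])           # entangle the qubits
        return qml.expval(qml.PauliZ(0))  # scalar model output

    weights = np.array([0.1, 0.2], requires_grad=True)
    x = np.array([0.5, -0.3], requires_grad=False)
    grad = qml.grad(circuit)(weights, x)  # gradient w.r.t. the trainable weights

Because the circuit is differentiable, it can be dropped into a gradient-based training loop like any other model component.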

The tutorial targets a broad audience, and no prior knowledge of physics is required.

Maria Schuld works as a senior researcher for the Toronto-based quantum computing startup Xanadu, as well as for the Big Data and Informatics Flagship of the University of KwaZulu-Natal in Durban, South Africa, from which she received her PhD in 2017. She co-authored the book "Supervised Learning with Quantum Computers" (Springer 2018) and is a lead developer of the PennyLane software framework for quantum differentiable programming. Besides her pioneering research at the intersection of quantum computing and machine learning, Maria has a postgraduate degree in political science and a keen interest in the interplay between data, emerging technologies, and society.
Juan Carrasquilla is a full-time researcher at the Vector Institute for Artificial Intelligence in Toronto, Canada, where he works at the intersection of condensed matter physics, quantum computing, and machine learning, for example combining quantum Monte Carlo simulations and machine learning techniques to analyze the collective behaviour of quantum many-body systems. He completed his PhD in Physics at the International School for Advanced Studies in Italy and has since held positions as a Postdoctoral Fellow at Georgetown University and the Perimeter Institute, as a Visiting Research Scholar at Penn State University, and as a Research Scientist at D-Wave Systems Inc. in Burnaby, British Columbia.



Tutorial: Vukosi Marivate · David Adelani

A Journey Through the Opportunity of Low Resourced Natural Language Processing — An African Lens

Low-resourced languages pose an interesting challenge for machine learning algorithms, representation, data collection, and the accessibility of machine learning in general. In this tutorial, we provide a journey through machine learning for low-resourced languages that covers a breadth of subtopics and depth in selected areas of focus. We do this through the lens of Natural Language Processing for African languages. We present historical context, recent advances, and current opportunities that researchers can take advantage of to do impactful research in this area. We hope this tutorial will not only shed light on the subject area, but also expand the number of practitioners who interact in a thoughtful and considerate way with the wider ML community working in these areas. We aim for the tutorial to be as interactive as possible and to provide resources for researchers to tackle these challenges.

Vukosi Marivate (https://www.vima.co.za/) holds a PhD in Computer Science (Rutgers University, as a Fulbright Science and Technology Fellow) and an MSc and BSc in Electrical Engineering (Wits University). Dr Marivate is based at the University of Pretoria as the UP ABSA Chair of Data Science. He works on developing machine learning/artificial intelligence methods to extract insights from data. A large part of his work over the last few years has been at the intersection of machine learning and Natural Language Processing (NLP), leading to research outputs focused on improving low-resource language tools, especially for African languages. This has included creating new software libraries, developing new research approaches for robust NLP, and encouraging the development of datasets for African languages. As part of his vision for the ABSA Data Science chair, Vukosi is interested in Data Science for Social Impact (https://dsfsi.github.io/), using local challenges as a springboard for research; in this area he has worked on projects in science, energy, public safety, and utilities. Vukosi is a cofounder of the Deep Learning Indaba, the largest machine learning/artificial intelligence workshop on the African continent, which aims to strengthen African machine learning.
David Adelani is a doctoral student in computer science at Saarland University, Saarbrücken, Germany. His current research focuses on the security and privacy of users’ information in dialogue systems and online social interactions. Originally from Nigeria, he is also actively involved in the development of natural language processing datasets and tools for low-resource languages, with a special focus on African languages.



Tutorial: César Lincoln Mattos · Felipe Tobar

The Art of Gaussian Processes: Classical and Contemporary

Gaussian processes (GPs) are Bayesian nonparametric models for continuous functions which allow for uncertainty quantification, interpretability, and the incorporation of expert knowledge. The theory and practice of GPs have flourished in the last decade: researchers have looked into the expressiveness and efficiency of GP-based models, and practitioners have applied them to a plethora of disciplines. This tutorial presents both the foundational theory and modern developments of data modelling using GPs, following step-by-step intuitions, illustrations, and real-world examples. The tutorial will start with the building blocks of the GP model, then move on to the choice of kernel function, cost-effective training strategies, and non-Gaussian extensions. The second part will showcase more recent advances, such as latent variable models, deep GPs, current trends in kernel design, and connections between GPs and deep neural networks. We hope that this exhibition, featuring classic and contemporary GP works, inspires attendees to incorporate GPs into their applications and motivates them to continue learning and contributing to current developments in the field.
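The building blocks referred to above reduce, for standard GP regression, to a pair of closed-form equations. Given training inputs $X$, noisy targets $y$ with noise variance $\sigma^2$, a kernel $k$ with Gram matrix $K_{XX}$, and test inputs $X_*$, the posterior predictive is Gaussian with

$$\mu_* = K_{*X}\,(K_{XX} + \sigma^2 I)^{-1} y, \qquad \Sigma_* = K_{**} - K_{*X}\,(K_{XX} + \sigma^2 I)^{-1} K_{X*},$$

so the same computation delivers both point predictions and calibrated uncertainty.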

César Lincoln Cavalcante Mattos is an associate professor at the Department of Computer Science at the Federal University of Ceará (UFC), Brazil. He is also an associate researcher at the Logics and Artificial Intelligence Group (LOGIA). His research interests span the broad fields of machine learning and probabilistic modeling, including Gaussian processes, deep (probabilistic) learning, approximate inference, and system identification. He has applied learning methods in several research and development collaborations in areas such as dynamical system modeling, health risk analysis, software repository mining, and anomaly detection.
Felipe Tobar is an Assistant Professor at the Data & AI Initiative at Universidad de Chile. He holds Researcher positions at the Center for Mathematical Modeling and the Advanced Center for Electrical Engineering. Felipe received the BSc/MSc degrees in Electrical Engineering (U. de Chile, 2010) and a PhD in Signal Processing (Imperial College London, 2014), and he was an Associate Researcher in Machine Learning at the University of Cambridge (2014-2015). Felipe teaches Statistics and Machine Learning courses at undergraduate, graduate and professional levels. His research interests lie in the interface between Machine Learning and Statistical Signal Processing, including Gaussian processes, spectral estimation, approximate inference, Bayesian nonparametrics, and optimal transport.



Tutorial: Shirley Ho · Miles Cranmer

ML for Physics and Physics for ML

Physics research and deep learning have a symbiotic relationship, and this bond has become stronger over the past several years. In this tutorial, we will present both sides of this story. How has deep learning benefited from concepts in physics and other sciences? How have different subfields of physics research capitalized on deep learning? What are some as-yet-unexplored problems in physics that could strongly benefit from machine learning? We will discuss the past and present of this intersection, and then theorize possible directions for its future. In the second part of the tutorial, we will outline some existing deep learning techniques that have exploited ideas from physics, and point out some intriguing new directions in this area.

Shirley Ho is a group leader at the Flatiron Institute of the Simons Foundation, a research professor of physics, and an affiliated faculty member at the Center for Data Science at New York University. She is currently most excited about building generalist AI models for science with her team members at the Polymathic AI initiative (https://polymathic-ai.org/)!
Miles Cranmer is an Assistant Professor at the University of Cambridge, working at the intersection of physics and AI.



Tutorial: Timnit Gebru · Emily Denton

Beyond Fairness in Machine Learning

The machine learning community is seeing an increased focus on fairness-oriented methods of model and dataset development. However, much of this work is constrained by a purely technical understanding of fairness, one that has come to mean parity of model performance across sociodemographic groups, and that offers a narrow way of understanding how machine learning technologies intersect with the systems of oppression that structure their development and use in the real world. In contrast to this approach, we believe it is essential to approach machine learning technologies from a sociotechnical lens, examining how marginalized communities are excluded from their development and impacted by their deployment.

Our tutorial will center the perspectives and stories of communities who have been harmed by machine learning technologies and the dominant logics operative within this field. We believe it is important to host these conversations within the NeurIPS venue so that researchers and practitioners in the machine learning field can engage with these perspectives and understand the lived realities of marginalized communities impacted by the outputs of the field. In doing so, we hope to shift the focus away from singular technical understandings of fairness and towards justice, equity, and accountability. We believe this is a critical moment for machine learning practitioners, and for the field as a whole, to come together and reimagine what this field might look like. We have great faith in the machine learning community and hope that our tutorial will foster the difficult conversations and meaningful reflection upon the state of the field that are essential to begin constructing a different mode of operating.

Our tutorial will highlight research on uncovering and mitigating issues of unfair bias and historical discrimination that machine learning systems learn to mimic and propagate, as well as the lived realities of marginalized communities impacted by machine learning technologies. We will provide participants with tools and frameworks to incorporate into their own research practice that will facilitate socially aware work and help mitigate the harmful impacts of their research.

Timnit Gebru was recently fired by Google for raising issues of discrimination in the workplace. Prior to that she was a co-lead of the Ethical AI research team at Google Brain. She received her PhD from the Stanford Artificial Intelligence Laboratory, studying computer vision under Fei-Fei Li, and did a postdoc at Microsoft Research, New York City in the FATE (Fairness Accountability Transparency and Ethics in AI) group, where she studied algorithmic bias and the ethical implications underlying projects aiming to gain insights from data. Timnit also co-founded Black in AI, a nonprofit that works to increase the presence, inclusion, visibility and health of Black people in the field of AI.
Emily Denton is a Research Scientist at Google where they examine the societal impacts of AI technology. Their recent research centers on critically examining the norms, values, and work practices that structure the development and use of machine learning datasets. Prior to joining Google, Emily received their PhD in machine learning from the Courant Institute of Mathematical Sciences at New York University, where they focused on unsupervised learning and generative modeling of images and video.



Tutorial: Karen A McKinnon · Andrew N Poppick

Machine Learning and Statistics for Climate Science

The assessment of climate variability and change is enriched by novel applications of statistics and machine learning methodologies. This tutorial will introduce some of the common statistical and machine learning problems that arise in climate science. The goal is to give attendees a sense of the intersections between the fields and to help promote future interdisciplinary collaborations. We will introduce different climate data sources (e.g., in situ measurements, satellite data, and climate model output) and discuss problems including: characterizing changes in extreme events like heatwaves or extreme precipitation, summarizing high-dimensional spatiotemporal climate data, and using statistical methods to predict climate variability and potentially improve future projections. The focus will be on methodological applications; we will discuss both core methodologies and recent innovations. Prior knowledge of climate science is not assumed, and we will emphasize the value of engaging substantively with domain experts.
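As a small, concrete taste of the extreme-event problem mentioned above, annual maxima are often modelled with a generalized extreme value (GEV) distribution; the sketch below uses synthetic data in place of real measurements:

    # Fit a GEV distribution to annual maxima and estimate a return level.
    # Synthetic data stands in for real station measurements.
    import numpy as np
    from scipy.stats import genextreme

    rng = np.random.default_rng(0)
    annual_max = rng.gumbel(loc=35.0, scale=2.0, size=60)  # 60 years of maxima (deg C)

    shape, loc, scale = genextreme.fit(annual_max)  # scipy's shape convention

    # 50-year return level: the value exceeded with probability 1/50 in any year.
    level_50yr = genextreme.ppf(1 - 1 / 50, shape, loc=loc, scale=scale)
    print(f"50-year return level: {level_50yr:.1f} deg C")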

Karen McKinnon is an Assistant Professor of Statistics and the Environment at UCLA. She received her PhD in Earth and Planetary Sciences at Harvard University in 2015. Before joining UCLA in 2018, she was an Advanced Study Postdoc at the National Center for Atmospheric Research and an Applied Scientist at Descartes Labs. Her research focuses on large-scale climate variability and change, with a particular interest in extreme events, and draws upon a diverse toolbox that spans climate dynamics, statistics, and machine learning.
Andrew Poppick is an Assistant Professor of Statistics at Carleton College. He received his Ph.D. in Statistics in 2016 from the University of Chicago. His research is primarily in statistical applications to climate, especially characterizing temporal dependence and nonstationarity in climate processes.



Tutorial: Wee Sun Lee

Message Passing In Machine Learning

Message passing algorithms are distributed algorithms that operate on graphs, where each node uses only information present locally at the node and its incident edges, and sends information only to its neighbouring nodes. They are often highly effective in machine learning and are relatively easy to parallelise. Examples include approximate inference algorithms on probabilistic graphical models, the value iteration algorithm for Markov decision processes, graph neural networks, and attention networks.

This tutorial presents commonly used approximate inference algorithms for probabilistic graphical models and the value iteration algorithm for Markov decision processes, focusing on understanding the objectives that the algorithms optimise. We then consider more flexible but less interpretable message passing algorithms, including graph neural networks and attention networks. We discuss how these more flexible networks can simulate the more interpretable algorithms, providing some understanding of the inductive biases of these networks through algorithmic alignment and allowing that understanding to be used for network design.
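To make the message-passing view concrete, here is a minimal tabular value iteration sketch: at every sweep, each state aggregates the values "sent" by its successor states through a Bellman backup. The toy random MDP is an illustrative assumption, not an example from the tutorial:

    # Value iteration on a small random MDP (illustrative toy model).
    import numpy as np

    n_states, n_actions, gamma = 4, 2, 0.9
    rng = np.random.default_rng(0)

    # P[s, a, s'] = transition probability; R[s, a] = expected reward.
    P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
    R = rng.normal(size=(n_states, n_actions))

    V = np.zeros(n_states)
    for _ in range(1000):
        # Bellman backup: Q(s, a) = R(s, a) + gamma * sum_s' P(s, a, s') V(s')
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < 1e-8:  # converged to a fixed point
            break
        V = V_new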

Wee Sun Lee is a professor in the Department of Computer Science, National University of Singapore. He obtained his B.Eng. from the University of Queensland in 1992 and his Ph.D. from the Australian National University in 1996. He has been a research fellow at the Australian Defence Force Academy, a fellow of the Singapore-MIT Alliance, and a visiting scientist at MIT. His research interests include machine learning, planning under uncertainty, and approximate inference. His work has won the Test of Time Award at Robotics: Science and Systems (RSS) 2021, the RoboCup Best Paper Award at the International Conference on Intelligent Robots and Systems (IROS) 2015, and the Google Best Student Paper Award at Uncertainty in AI (UAI) 2014 (as faculty co-author), as well as several competitions and challenges. He has been an area chair for machine learning and AI conferences such as Neural Information Processing Systems (NeurIPS), the International Conference on Machine Learning (ICML), the AAAI Conference on Artificial Intelligence (AAAI), and the International Joint Conference on Artificial Intelligence (IJCAI). He was a program, conference, and journal track co-chair for the Asian Conference on Machine Learning (ACML), and he is currently co-chair of the ACML steering committee.



Tutorial: Lilian Weng · Jong Wook Kim

Self-Supervised Learning: Self-Prediction and Contrastive Learning

Self-supervised learning is a powerful way to extract training signals from massive amounts of unlabelled data and to learn good representations that facilitate downstream tasks for which it is expensive to collect task-specific labels. This tutorial will focus on two major approaches to self-supervised learning: self-prediction and contrastive learning. Self-prediction refers to self-supervised training tasks where the model learns to predict a portion of the available data from the rest. Contrastive learning learns a representation space in which similar data samples stay close to each other while dissimilar ones are far apart, by constructing similar and dissimilar pairs from the dataset. The tutorial will cover methods in both areas and across various applications, including vision, language, video, multimodal learning, and reinforcement learning.
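A common objective underlying many contrastive methods is the InfoNCE loss (a standard formulation, not necessarily the tutorial's exact notation): for an anchor representation $z_i$ with positive pair $z_j$ and the other samples in the batch serving as negatives,

$$\mathcal{L}_i = -\log \frac{\exp(\mathrm{sim}(z_i, z_j)/\tau)}{\sum_{k \neq i} \exp(\mathrm{sim}(z_i, z_k)/\tau)},$$

where $\mathrm{sim}$ is typically cosine similarity and $\tau$ is a temperature; minimizing it pulls positive pairs together and pushes negatives apart.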

Lilian Weng works at OpenAI on a variety of research and applied projects. In the Robotics team, she worked on several challenging robotic manipulation tasks, including solving a fully scrambled Rubik's cube with a single robot hand via deep reinforcement learning and sim2real transfer techniques. She currently leads the Applied AI Research team, using powerful language models to solve real-world applications. Her research interests are quite broad, and she writes about various topics in deep learning on her highly viewed ML blog: https://lilianweng.github.io/lil-log/.
Jong Wook Kim is a member of technical staff at OpenAI, where he has worked on GPT-2 output detection, Jukebox, and CLIP. His research interests include representation learning and generative modeling of audio and music, as well as their applications to multimodal deep learning. Prior to OpenAI, he completed a Ph.D. in music technology at NYU, focusing on automatic music transcription. He has also worked as a research scientist intern at Pandora and Spotify, and as a software engineer at Kakao and NCSOFT.



Joint Affinity Poster Session Mon 6 Dec 09:00 p.m.  

Follow the GatherTown link to visit the Joint Poster Session.


Panel: The Consequences of Massive Scaling in Machine Learning Mon 6 Dec 11:00 p.m.  

Noah Goodman · Melanie Mitchell · Joelle Pineau · Oriol Vinyals · Jared Kaplan

Machine learning research has always prized algorithmic contributions. However, many recent big breakthroughs have been driven by scaling up the same basic algorithms and architectures. The most recent example is OpenAI’s massive language model GPT-3, which won a best paper award at NeurIPS in 2020. GPT-3 was based on the same Transformer architecture as its predecessors, but when scaled up, it resulted in remarkable unexpected behaviors, which had a massive impact on the way we think about language models. As more progress becomes driven by scaling, how should we adapt as a community? Should it affect what problems are considered interesting? Should publication norms take scale into account, or de-emphasize algorithmic contributions? How do we ensure that smaller institutions or academic labs can meaningfully research and audit large-scale systems? From a safety perspective, if behaviors appear emergently at scale, how can we ensure that systems behave as intended? In this panel, we will explore these critical questions so that the NeurIPS community at large can continue to make fundamental advances in the era of massive scaling.