NeurIPS 2020 Monday 12/7

Timezone: America/Vancouver


Tutorial: Pascale N Fung · Yun-Nung (Vivian) Chen · Zhaojiang Lin · Andrea Madotto

Conversational AI systems interact with human users while completing user requests or simply engaging in chit-chat. These systems have applications ranging from personal assistance and health assistance to customer service. In this three-part tutorial, we will first give an overview of the state-of-the-art modularized conversational AI approaches that are commonly adopted by task-oriented dialog systems. We will then give an overview of the current sequence-to-sequence, generation-based conversational AI approaches. We will discuss the challenges and shortcomings of vanilla generation-based models, such as the lack of knowledge, consistency, empathy, controllability, and versatility. We will then highlight current work in addressing these challenges and in improving the depth of generation-based ConvAI. In the final part of the tutorial, we will point out remaining challenges of conversational AI and possible directions for future research, including lifelong learning and how to mitigate inappropriate responses. We will also present an overview of shared tasks and publicly available resources for both modularized and generation-based conversational AI.

Bios:

Pascale N Fung

Pascale Fung (馮雁) (born 1966 in Shanghai, China) is a professor in the Department of Electronic & Computer Engineering and the Department of Computer Science & Engineering at the Hong Kong University of Science & Technology (HKUST). She is the director of the newly established, multidisciplinary Centre for AI Research (CAiRE) at HKUST. She is an elected Fellow of the Institute of Electrical and Electronics Engineers (IEEE) for her “contributions to human-machine interactions” and an elected Fellow of the International Speech Communication Association for “fundamental contributions to the interdisciplinary area of spoken language human-machine interactions”.

Yun-Nung (Vivian) Chen is currently an associate professor in the Department of Computer Science & Information Engineering at National Taiwan University. She earned her Ph.D. degree from Carnegie Mellon University, where her research interests focus on spoken dialogue systems, language understanding, natural language processing, and multimodality. She received Google Faculty Research Awards, Amazon AWS Machine Learning Research Awards, MOST Young Scholar Fellowship, and FAOS Young Scholar Innovation Award for her research contributions. Prior to joining National Taiwan University, she worked in the Deep Learning Technology Center at Microsoft Research Redmond. (http://vivianchen.idv.tw/)

Zhaojiang Lin is a Ph.D. candidate in Electronic and Computer Engineering at The Hong Kong University of Science and Technology and the Centre for Artificial Intelligence Research (CAiRE). He completed his bachelor's degree in Electronic Engineering at the University of Electronic Science and Technology of China. His research interests lie in dialogue systems, meta-learning, affective computing, natural language understanding, and multilinguality. He received Best Paper Awards from RepL4NLP@ACL 2019 and ConvAI@NeurIPS 2019. He serves on the program committee for several major machine learning and natural language processing conferences: NeurIPS, ICLR, AAAI, and NAACL.

Andrea Madotto is a PhD candidate in Electronic & Computer Engineering at The Hong Kong University of Science and Technology and part of the Centre for Artificial Intelligence Research (CAiRE). His research focuses on conversational modelling, controllable language generation, and meta/continual learning. He received the Outstanding Paper Award from ACL 2019 and the best paper award from the ConvAI workshop at NeurIPS 2019, and his work has been featured in MIT Technology Review and VentureBeat. He serves as a program committee member and reviewer for various machine learning and natural language processing conferences such as ACL, EMNLP, NeurIPS, ICLR, and AAAI, and for journals such as Journal of Natural Language Engineering and Computer Speech and Language.

Expo Demonstration: GAN Applications in Fashion Article Design and Outfit Rendering Mon 7 Dec 12:00 a.m.

Nana Yamazaki · Gökhan Yildirim · Nikolay Jetchev

Advances in deep learning have enabled sampling realistic images via generative modeling. This leads to new avenues in visual design and content creation, e.g. in fashion, where visualization is a key component. GANs can be used to create personalized visual content - e.g. rendering an outfit on a human body and creating unique designs - which can enrich the shopping experience on e-commerce platforms. We will demo two projects in which we used GANs to create fashion images and enable novel applications:

Fashion Outfit Renderer

We work on generating high-resolution images of fashion models wearing desired outfits and standing in different poses. At Zalando, we provide quality photographs of fashion models wearing the articles in our online selection. These photographs help customers visualise the garments they browse and enhance the shopping experience. But what if our customers wish to visualise an individually created outfit? Zalando has a large and evolving assortment of garments, which makes it infeasible to photograph all outfit combinations. To solve this challenge, we work on a “Fashion Renderer”, which creates a computer-generated image of a fashion model wearing an input outfit for an input body pose.

Generative fashion design and search

Fashion customers often have a visual idea of what they would like to buy. However, finding the right article can be a time-consuming process, as people need to convert their visual ideas into accurate linguistic search terms, and search engines need to correctly interpret customers’ search queries and retrieve relevant results. We enable search in a visual-only space by allowing customers to generate and breed different dress designs using a style-based GAN. Created designs are used as a visual query to retrieve existing dresses in real time. This approach attempts to eliminate representation and interpretation problems in word-based search and provides a novel way of searching for fashion items.

Tutorial: Jelani Nelson

(Track1) Sketching and Streaming Algorithms

A “sketch” is a data structure supporting some pre-specified set of queries and updates to a database while consuming space substantially (often exponentially) less than the information theoretic minimum required to store everything seen, and thus can also be seen as some form of functional compression. The advantages of sketching include less memory consumption, faster algorithms, and reduced bandwidth requirements in distributed computing environments. A "streaming" algorithm is one that dynamically updates a sketch as data is updated. In this tutorial we sketch (pun intended) a suite of tools from the sketching literature for counting problems, graph problems, finding frequent items, dimensionality reduction, and computational linear algebra, together with a discussion of lower bounds.
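As a rough illustration of the idea of a sketch for the frequent-items problem, here is a minimal Count-Min sketch in Python. The abstract does not say which data structures the tutorial covers, so the choice of Count-Min, the class name `CountMinSketch`, the table sizes, and the salted SHA-256 hashing are all assumptions made for this example (the formal guarantees of Count-Min are usually stated for pairwise-independent hash families, which this sketch does not attempt to reproduce).

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counter backed by a small fixed-size table."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One hash per row, obtained by salting the item with the row number.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def update(self, item, count=1):
        # Streaming update: touch one counter per row, never store the item.
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions can only inflate a counter, so the minimum over rows
        # never underestimates the true count.
        return min(
            self.table[row][self._index(item, row)] for row in range(self.depth)
        )


sketch = CountMinSketch()
stream = ["a"] * 50 + ["b"] * 20 + ["c"] * 5
for token in stream:
    sketch.update(token)
```

The memory used is `width * depth` counters regardless of how many distinct items the stream contains, which is the space saving the abstract refers to; the price is that `estimate` may overcount.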

Bio:

Jelani Nelson

Jelani Nelson is a Professor of Electrical Engineering and Computer Sciences at UC Berkeley, and also a Research Scientist at Google (part-time). He is interested in randomized algorithms, sketching and streaming algorithms, dimensionality reduction, and differential privacy. He is a recipient of the ACM Eugene L. Lawler Award for Humanitarian Contributions within Computer Science, a Presidential Early Career Award for Scientist and Engineers (PECASE), and a Sloan Research Fellowship. He is also Founder and President of AddisCoder, Inc., a nonprofit that provides algorithms training to high school students in Ethiopia and Jamaica.

Tutorial: Risi Kondor · Taco Cohen

(Track2) Equivariant Networks

There is great interest in generalizing deep learning to more exotic types of data, such as graphs, chemical structures, volumetric images, omnidirectional images, etc. In each case, the data has nontrivial structure and symmetries and the challenge is to find the right generalization of classical neural network layers like convolution to reflect this. It has become clear that in all of these cases and more, equivariance to symmetry transformations is the key principle that points us to an effective generalization.

New architectures inspired by this principle have already proved their effectiveness in multiple domains. However, some of the underlying ideas are still foreign to much of the community, partly because of the mathematics involved. The purpose of this tutorial is to bridge this gap by giving a very accessible introduction to this emerging area with many practical examples and details of how to implement equivariant architectures in existing deep learning frameworks.
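To make the notion of equivariance concrete, the following small Python check uses the simplest case: a 1D circular convolution layer and the group of cyclic shifts. This is only a toy sketch and not taken from the tutorial; the helper names `shift` and `circular_correlate` and the example values are hypothetical.

```python
def shift(signal, k):
    """Cyclically shift a list to the right by k positions."""
    n = len(signal)
    return [signal[(i - k) % n] for i in range(n)]


def circular_correlate(signal, kernel):
    """Circular cross-correlation, the operation usually called
    'convolution' in deep learning frameworks."""
    n = len(signal)
    return [
        sum(kernel[j] * signal[(i + j) % n] for j in range(len(kernel)))
        for i in range(n)
    ]


signal = [1.0, 4.0, 2.0, 0.0, 3.0, 5.0]
kernel = [0.5, -1.0, 2.0]

# Equivariance: transforming the input and then applying the layer gives the
# same result as applying the layer and then transforming the output.
shift_then_layer = circular_correlate(shift(signal, 2), kernel)
layer_then_shift = shift(circular_correlate(signal, kernel), 2)
is_equivariant = all(
    abs(a - b) < 1e-12 for a, b in zip(shift_then_layer, layer_then_shift)
)
```

Generalizing beyond shifts means replacing `shift` with the action of another symmetry group (rotations, permutations, and so on) and designing the layer so that the same commutation property holds.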

Timetable:

Part I (Taco Cohen)
0:00 - Introduction to equivariant networks
39:00 - Examples and applications
51:00 - Equivariant convolutions

Part II (Risi Kondor)
0:00 - Introduction
7:50 - Group representations
27:35 - Designing equivariant neurons
45:30 - Fourier theory
56:25 - Implementation

Bios:

Risi Kondor joined the Flatiron Institute in 2019 as a Senior Research Scientist with the Center for Computational Mathematics. Previously, Kondor was an Associate Professor in the Department of Computer Science, Statistics, and the Computational and Applied Mathematics Initiative at the University of Chicago. His research interests include computational harmonic analysis and machine learning. Kondor holds a Ph.D. in Computer Science from Columbia University, an MS in Knowledge Discovery and Data Mining from Carnegie Mellon University, and a BA in Mathematics from the University of Cambridge. He also holds a diploma in Computational Fluid Dynamics from the Von Karman Institute for Fluid Dynamics and a diploma in Physics from Eötvös Loránd University in Budapest.

Taco Cohen is a machine learning research scientist at Meta / FAIR. He received a PhD in machine learning from the University of Amsterdam, advised by Prof. Max Welling, where he developed the first equivariant deep networks. He was a co-founder of Scyfer, a company focussed on active deep learning, acquired by Qualcomm in 2017. At Qualcomm he led the generative models and data compression team. His current research is focussed on RL for code generation. He has done internships at Google DeepMind (working with Geoff Hinton) and OpenAI. He received the 2014 University of Amsterdam thesis prize, a Google PhD Fellowship, the ICLR 2018 best paper award for “Spherical CNNs”, the Schouhamer-Immink prize for his PhD thesis, and was named one of 35 innovators under 35 in Europe by MIT in 2018.

Tutorial: Marc Deisenroth · Cheng Soon Ong

(Track1) There and Back Again: A Tale of Slopes and Expectations

Integration and differentiation play key roles in machine learning.

We take a tour of some old and new results on methods and algorithms for integration and differentiation, in particular, for calculating expectations and slopes. We review numerical and Monte-Carlo integration for calculating expectations. We discuss the change-of-variables method leading to normalizing flows and discuss inference in time series to get "there". To get "back again", we review gradients for calculating slopes by the chain rule and automatic differentiation, the basis for backpropagation in neural networks. We discuss backpropagation in three settings: in probabilistic graphical models, through an equality constraint, and with an inequality constraint.

To complete the round-trip, we explore algorithms for calculating gradients of expectations, the basis of methods for variational inference, reinforcement learning, and experimental design.
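As a small worked example of a gradient of an expectation, the sketch below estimates d/dmu E[x^2] for x ~ N(mu, 1), whose exact value is 2*mu, in two ways: a pathwise (reparameterization) estimator and a score-function estimator. The abstract does not name specific estimators, so these two choices, the test function x^2, and the function names are assumptions made for illustration.

```python
import random


def grad_pathwise(mu, n, rng):
    """Reparameterize x = mu + eps with eps ~ N(0, 1), then average
    d/dmu (mu + eps)^2 = 2 * (mu + eps)."""
    total = 0.0
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        total += 2.0 * (mu + eps)
    return total / n


def grad_score_function(mu, n, rng):
    """Average f(x) * d/dmu log N(x; mu, 1) = x^2 * (x - mu)."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(mu, 1.0)
        total += (x * x) * (x - mu)
    return total / n


rng = random.Random(0)
mu = 1.5
exact = 2.0 * mu
pathwise = grad_pathwise(mu, 20000, rng)
score = grad_score_function(mu, 20000, rng)
```

Both estimators are unbiased for this problem, but the score-function estimate fluctuates noticeably more at the same sample size, which is one reason variance reduction matters in variational inference and reinforcement learning.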

Bios:

Marc Deisenroth

Professor Marc Deisenroth is the DeepMind Chair in Artificial Intelligence at University College London and the Deputy Director of UCL's Centre for Artificial Intelligence. He also holds visiting faculty positions at the University of Johannesburg and Imperial College London. Marc's research interests center around data-efficient machine learning, probabilistic modeling and autonomous decision making. Marc was Program Chair of EWRL 2012, Workshops Chair of RSS 2013, Expo Co-Chair of ICML 2020, and Tutorials Co-Chair of NeurIPS 2021. In 2019, Marc co-organized the Machine Learning Summer School in London. He received Paper Awards at ICRA 2014, ICCAS 2016, and ICML 2020. He is co-author of the book [Mathematics for Machine Learning](https://mml-book.github.io) published by Cambridge University Press (2020).

Cheng Soon Ong is a principal research scientist at the Machine Learning Research Group, Data61, CSIRO, and is the director of the machine learning and artificial intelligence future science platform at CSIRO. He is also an adjunct associate professor at the Australian National University. He is interested in enabling scientific discovery by extending statistical machine learning methods.

Tutorial: Marta Garnelo · David Balduzzi · Wojciech Czarnecki

(Track3) Designing Learning Dynamics

In recent years, machine learning research has been dominated by optimisation-based learning methods (take gradient descent, for example, which is ubiquitous in deep learning). However, while tools that operate under this paradigm have proven to be very powerful, they are often not well suited for tackling complex challenges such as highly non-stationary targets or explicit multi-agent systems. In an attempt to overcome such limitations, some researchers are instead turning towards open-ended methods, and considering how to design the underlying learning dynamics. This tutorial discusses how different tools can be applied to construct and combine adaptive objectives for populations of learners. We begin by providing background on the problem setting, basic tools and philosophy. In the second part, we dive into the basics of evolutionary computation. In particular, we frame the development of evolutionary methods as a focus shift away from gradient-free optimisers in search of more generic and powerful tools for designing learning dynamics. Finally, we provide a more detailed overview of techniques and research around training and evaluating populations of agents.


Affinity Workshop: New In ML Mon 7 Dec 03:00 a.m.

Zhen Xu · Vanya Cohen · Shruti Mishra · MingYu Lu

Is this your first time submitting to a top conference? Have you ever wanted your work recognized by a large and active community? Do you want to improve your paper writing, experiments, ideas, etc? Then, this workshop is exactly for you!
Our workshop welcomes contributors new to machine learning research. We have invited top NeurIPS reviewers to review your work and share their experiences with you in poster sessions and mentoring sessions. Our mission is to help you publish papers at next year’s NeurIPS conference, and generally provide you with the guidance you need to contribute to ML research fully and effectively!

Tutorial: Jane Wang · Kevin Miller · Adam Marblestone

(Track1) Where Neuroscience meets AI (And What’s in Store for the Future)

The brain remains the only known example of a truly general-purpose intelligent system. The study of human and animal cognition has revealed key insights, such as the ideas of parallel distributed processing, biological vision, and learning from reward signals, that have heavily influenced the design of artificial learning systems. Many AI researchers continue to look to neuroscience as a source of inspiration and insight. A key difficulty is that neuroscience is a vast and heterogeneous area of study, encompassing a bewildering array of subfields. In this tutorial, we will seek to provide both a broad overview of neuroscience as a whole and a focused look at two areas -- computational cognitive neuroscience and the neuroscience of learning in circuits -- that we believe are particularly relevant for AI researchers today. We will conclude by highlighting several ongoing lines of work that seek to import insights from these areas of neuroscience into AI, and vice versa.

Bios:

Jane Wang is a research scientist at DeepMind on the neuroscience team, working on meta-reinforcement learning and neuroscience-inspired artificial agents. Her background is in physics, complex systems, and computational and cognitive neuroscience.

Kevin Miller is a research scientist on the Neuroscience Team at DeepMind and a postdoc at University College London. He is currently working on understanding structured reinforcement learning in mice and machines.

Adam Marblestone is a Schmidt Futures innovation fellow. He was previously a research scientist at DeepMind, and earlier completed a PhD in biophysics and worked at a brain-computer interface company.

Tutorial: Praveen Chandar · Fernando Diaz · Brian St. Thomas

(Track2) Beyond Accuracy: Grounding Evaluation Metrics for Human-Machine Learning Systems

The evaluation and optimization of machine learning systems have largely adopted well-known performance metrics like accuracy (for classification) or squared error (for regression). While these metrics are reusable across a variety of machine learning tasks, they make strong assumptions often not observed when situated in a broader technical or sociotechnical system. This is especially true in systems that interact with large populations of humans attempting to complete a goal or satisfy a need (e.g. search, recommendation, game-playing). In this tutorial, we will present methods for developing evaluation metrics grounded in what users expect of the system and how they respond to system decisions. The goal of this tutorial is both to share methods for designing user-based quantitative metrics and to motivate new research into optimizing for these more structured metrics.

Bios:

Praveen Chandar is a Senior Research Scientist at Spotify working on search and recommendations. His research interests are in machine learning, information retrieval, and recommendation systems with a focus on experimentation and evaluation. Praveen received his Ph.D. from the University of Delaware, working on novelty and diversity aspects of search evaluation. He was previously a Research Staff Member at IBM Research. He has published papers at top conferences including SIGIR, KDD, WSDM, WWW, CIKM, CHI, and UAI.

Fernando Diaz is a research scientist at Google Brain Montréal. His research focuses on the design of information access systems, including search engines, music recommendation services and crisis response platforms. He is particularly interested in understanding and addressing the societal implications of artificial intelligence more generally. Previously, Fernando was the assistant managing director of Microsoft Research Montréal and a director of research at Spotify, where he helped establish its research organization on recommendation, search, and personalization. Fernando’s work has received awards at SIGIR, WSDM, ISCRAM, and ECIR. He is the recipient of the 2017 British Computer Society Karen Spärck Jones Award. Fernando has co-organized workshops and tutorials at SIGIR, WSDM, and WWW. He has also co-organized several NIST TREC initiatives, WSDM (2013), Strategic Workshop on Information Retrieval (2018), FAT* (2019), SIGIR (2021), and the CIFAR Workshop on Artificial Intelligence and the Curation of Culture (2019).

Brian St. Thomas is a Senior Data Scientist at Spotify researching online experimentation methods and metric development. His research interests are in the development and evaluation of personalized recommendation and search systems, with a focus on statistical aspects of these problems. Brian received his Ph.D. from Duke University, and was previously a Data Scientist with TiVo's Search and Recommendations division. Brian has published research in JASA, SIGIR, CHI, WWW and co-organized a tutorial at RecSys.

Affinity Workshop: Black in AI Mon 7 Dec 06:00 a.m.

Victor Silva · Flora Ponjou Tasse · Krystal Maughan · Eric Maigua · Charles Earl · Nwamaka (Amaka) Okafor · Ignatius Ezeani · Oloruntobiloba Olatunji · Foutse Yuehgoh · Salomey Osei · Ezinne Nwankwo · Joyce D. Williams

Black in AI exists to create a space for sharing ideas, fostering collaborations, and discussing initiatives to increase the presence of Black individuals in the field of AI. To this end, we hold an annual technical workshop series, run mentoring programs, and maintain various fora for fostering partnerships and collaborations with and among Black AI researchers. The 4th Black in AI workshop and 1st virtual Black in AI workshop will consist of selected oral presentations, invited keynote speakers, a joint poster session with other affinity groups, sponsorship sessions, and socials. Our workshop exists to amplify the voices of Black researchers at NeurIPS.

Tutorial: Dustin Tran · Balaji Lakshminarayanan · Jasper Snoek

(Track2) Practical Uncertainty Estimation and Out-of-Distribution Robustness in Deep Learning

Deep learning models are bad at signalling failure: they tend to make predictions with high confidence, and this is problematic in real-world applications such as healthcare, self-driving cars, and natural language systems, where there are considerable safety implications, or where there are discrepancies between the training data and data that the model makes predictions on. There is a pressing need both for understanding when models should not make predictions and for improving model robustness to natural changes in the data. This tutorial will give an overview of the landscape of uncertainty and robustness in deep learning. Namely, we examine calibration and out-of-distribution generalization as key tasks. We then take a deep dive into promising avenues. This includes methods which average over multiple neural network predictions such as Bayesian neural nets, ensembles, and Gaussian processes; methods on the frontier of scale in terms of their overall parameter or prediction-time efficiency; and methods which encourage key inductive biases such as data augmentation. We ground these ideas in both empirical understanding and theory, and we provide practical recommendations with baselines and tips & tricks. Finally, we highlight open challenges in the field.
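One common way to quantify the calibration problem described above is the expected calibration error (ECE): predictions are grouped into confidence bins, and the gap between average confidence and accuracy is averaged across bins. The abstract does not state which metrics the tutorial uses, so the choice of ECE, the equal-width binning, and the function name below are assumptions for illustration only.

```python
def expected_calibration_error(confidences, correct, num_bins=10):
    """Weighted average gap between confidence and accuracy over
    equal-width confidence bins."""
    n = len(confidences)
    bins = [[] for _ in range(num_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the last bin.
        index = min(int(conf * num_bins), num_bins - 1)
        bins[index].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


# A model that reports 95% confidence but is right only 60% of the time.
overconfident = expected_calibration_error([0.95] * 10, [1] * 6 + [0] * 4)

# A model that reports 75% confidence and is right 75% of the time.
calibrated = expected_calibration_error([0.75] * 4, [1, 1, 1, 0])
```

In this toy data the overconfident model has a gap of about 0.35 while the calibrated one has a gap of zero, even though neither number says anything about which model is more accurate.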

Bios:

Dustin Tran is a research scientist at Google Brain. His research contributions examine the intersection of probability and deep learning, particularly in the areas of probabilistic programming, variational inference, giant models, and Bayesian neural networks. He completed his Ph.D. at Columbia under David Blei. He’s received awards such as the John M. Chambers Statistical Software award and the Google Ph.D. Fellowship in Machine Learning. He served as Area Chair at NeurIPS, ICML, ICLR, IJCAI, and AISTATS and organized "Approximate Inference" and "Uncertainty & Robustness" workshops at NeurIPS and UAI.

Balaji Lakshminarayanan is a research scientist at Google Brain. Prior to that, he was a research scientist at DeepMind. He received his PhD from the Gatsby Unit, University College London where he worked with Yee Whye Teh. His recent research has focused on probabilistic deep learning, specifically, uncertainty estimation, out-of-distribution robustness and deep generative models. Notable contributions relevant to the tutorial include developing state-of-the-art methods for calibration under dataset shift (such as deep ensembles and AugMix) and showing that deep generative models do not always know what they don't know. He has co-organized several workshops on "Uncertainty and Robustness in deep learning" and served as Area Chair for NeurIPS, ICML, ICLR and AISTATS.

Jasper Snoek is a research scientist at Google Brain. His research has touched a variety of topics at the intersection of Bayesian methods and deep learning. He completed his PhD in machine learning at the University of Toronto. He subsequently held postdoctoral fellowships at the University of Toronto, under Geoffrey Hinton and Ruslan Salakhutdinov, and at the Harvard Center for Research on Computation and Society, under Ryan Adams. Jasper co-founded a Bayesian optimization focused startup, Whetlab, which was acquired by Twitter. He has served as an Area Chair for NeurIPS, ICML, AISTATS and ICLR, and organized a variety of workshops at ICML and NeurIPS.

Tutorial: Sergey Levine · Aviral Kumar

(Track3) Offline Reinforcement Learning: From Algorithm Design to Practical Applications

Reinforcement learning (RL) provides a mathematical formalism for learning-based control that allows for acquisition of near-optimal behaviors by optimizing user-specified reward functions. While RL methods have received considerable attention recently due to impressive applications in many areas, the fact that RL requires a fundamentally online learning paradigm is one of the biggest obstacles to its widespread adoption. Online interaction is often impractical, because data collection is expensive (e.g., in robotics, or educational agents) or dangerous (e.g., in autonomous driving, or healthcare). An alternate approach is to utilize RL algorithms that effectively leverage previously collected experience without requiring online interaction. This has been referred to as batch RL, offline RL, or data-driven RL. Such algorithms hold tremendous promise for making it possible to turn datasets into powerful decision-making engines, similarly to how datasets have proven key to the success of supervised learning in vision and NLP. In this tutorial, we aim to provide the audience with the conceptual tools needed to both utilize offline RL as a tool, and to conduct research in this exciting area. We aim to provide an understanding of the challenges in offline RL, particularly in the context of modern deep RL methods, and describe some potential solutions that have been explored in recent work, along with applications. We will present classic and recent methods in a way that is accessible for practitioners, and also discuss the theoretical foundations for conducting research in this field. We will conclude with a discussion of open problems.

Bios:

Sergey Levine

Sergey Levine received a BS and MS in Computer Science from Stanford University in 2009, and a Ph.D. in Computer Science from Stanford University in 2014. He joined the faculty of the Department of Electrical Engineering and Computer Sciences at UC Berkeley in fall 2016. His work focuses on machine learning for decision making and control, with an emphasis on deep learning and reinforcement learning algorithms. Applications of his work include autonomous robots and vehicles, as well as applications in other decision-making domains. His research includes developing algorithms for end-to-end training of deep neural network policies that combine perception and control, scalable algorithms for inverse reinforcement learning, deep reinforcement learning algorithms, and more.

Affinity Workshop: LXAI Research @ NeurIPS 2020 Mon 7 Dec 08:00 a.m.

Maria Luisa Santiago · Laura Montoya · Pedro Braga · Karla Caballero Barajas · Sergio H Garrido Mejia · Eduardo Moya · Vinicius Caridá · Ariel Ruiz-Garcia · Ivan Arraut · Juan Banda · Josue Caro · Gissella Bejarano Nicho · Fabian Latorre · Carlos Miranda · Ignacio Lopez-Francos

The workshop is a one-day event with invited speakers, oral presentations, and posters. The event brings together faculty, graduate students, research scientists, and engineers for an opportunity to connect and exchange ideas. There will be a panel discussion and a mentoring session to discuss current research trends and career choices in artificial intelligence and machine learning. While all presenters will identify primarily as Latinx, all are invited to attend.

Tutorial: Yingzhen Li · Cheng Zhang

(Track1) Advances in Approximate Inference

Bayesian probabilistic modelling provides a principled framework for coherent inference and prediction under uncertainty. Approximate inference addresses the key challenge of Bayesian computation, that is, the computation of the intractable posterior distribution and related quantities such as the Bayesian predictive distribution. Significant progress has been made in this field during the past 10 years, which enables a wide application of Bayesian modelling techniques to machine learning tasks in computer vision, natural language processing, reinforcement learning etc.

This tutorial offers a coherent summary of the recent advances in approximate inference. We will start the tutorial with an introduction to the concept of approximate inference and the basics of variational inference. Then we will describe the fundamental aspects of modern approximate inference, including scalable inference, Monte Carlo techniques, amortized inference, approximate posterior design, and optimisation objectives. The connections between these recent advances will also be discussed. Lastly, we will provide application examples of advanced approximate inference techniques to downstream uncertainty estimation and decision-making tasks and conclude with a discussion on future research directions.

Timetable:

Tutorial part 1: basics of approximate inference (approx. 30min)
Coffee break & live Q&A 1 (approx. 10min)
Tutorial part 2: advances 1 (approx. 30min)
Coffee break & live Q&A 2 (approx. 10min)
Tutorial part 3: advances 2 (approx. 30min)
Coffee break & live Q&A 3 (approx. 10min)
Tutorial part 3: applications (approx. 30min)

Bios:

Yingzhen Li is a senior researcher at Microsoft Research Cambridge. She received her PhD from the University of Cambridge, and previously interned at Disney Research. She is passionate about building reliable machine learning systems, and her approach combines both Bayesian statistics and deep learning. Her contributions to the approximate inference field include: (1) algorithmic advances, such as variational inference with different divergences, combining variational inference with MCMC and approximate inference with implicit distributions; (2) applications of approximate inference, such as uncertainty estimation in Bayesian neural networks and algorithms to train deep generative models. She has served as an area chair at NeurIPS/ICML/ICLR/AISTATS on related research topics, and she is a co-organizer of the AABI2020 symposium, a flagship event of approximate inference.

Cheng Zhang is a principal researcher at Microsoft Research Cambridge, UK. She leads the Data Efficient Decision Making (Project Azua) team at Microsoft. Before joining Microsoft, she was with the statistical machine learning group of Disney Research Pittsburgh, located at Carnegie Mellon University. She received her Ph.D. from the KTH Royal Institute of Technology. She is interested in advancing machine learning methods, including variational inference, deep generative models, and sequential decision-making under uncertainty, and in adapting machine learning to socially impactful applications such as education and healthcare. She co-organized the Symposium on Advances in Approximate Bayesian Inference from 2017 to 2019.

Tutorial: Francois Chollet · Melanie Mitchell · Christian Szegedy

(Track1) Abstraction & Reasoning in AI systems: Modern Perspectives

In this tutorial, we will provide modern perspectives on abstraction and reasoning in AI systems. Traditionally, symbolic and probabilistic methods have dominated the domains of concept formation, abstraction, and automated reasoning. More recently, deep learning-based approaches have led to breakthroughs in some domains, like tackling hard search problems such as games and combinatorial search tasks. However, the resulting systems are still limited in scope and capabilities, especially in producing interpretable results and verifiable abstractions. Here, we will address a set of questions: Why is an ability for conceptual abstraction essential for intelligence, in both humans and machines? How can we get machines to learn flexible and extendable concepts that can transfer between domains? What do we understand by "strong reasoning capabilities" and how do we measure these capabilities in AI systems? How do deep learning-based methods change the landscape of computer-assisted reasoning? What are the failure modes of such methods and possible solutions to these issues?

Schedule 7:00pm - 7:40pm UTC Speaker: Francois Chollet Title: Why abstraction is the key, and what we're still missing

7:40pm - 7:50pm UTC Questions

7:50pm - 8:30pm UTC Speaker: Melanie Mitchell Title: Mechanisms of abstraction and analogy in natural and artificial intelligence

8:30pm - 8:40pm UTC Questions

8:40pm - 9:20pm UTC Speaker: Christian Szegedy Title: Deep learning for mathematical reasoning

9:20pm - 9:30pm UTC Questions

Bios:

Francois Chollet is a software engineer at Google, where he leads the team that makes Keras, a major deep learning framework. He is the author of numerous publications in the field of deep learning, including a best-selling textbook. His current research focuses on abstraction generation, analogical reasoning, and how to achieve greater generality in artificial intelligence.

Melanie Mitchell

Melanie Mitchell is a professor at the Santa Fe Institute. Her current research focuses on conceptual abstraction, analogy-making, and visual recognition in artificial intelligence systems. Melanie is the author or editor of six books and numerous scholarly papers in the fields of artificial intelligence, cognitive science, and complex systems. Her latest book is Artificial Intelligence: A Guide for Thinking Humans (Farrar, Straus, and Giroux).

Christian Szegedy is a Machine Learning scientist at Google Research. He has a PhD in Mathematics from the University of Bonn, Germany. His most influential past works include the discovery of adversarial examples and various computer vision architectures for image recognition and object detection. He is a co-inventor of batch normalization. He is currently working on automated theorem proving and auto-formalization of mathematics via deep learning.

Tutorial: Sham M Kakade · Martha White · Nicolas Le Roux

(Track3) Policy Optimization in Reinforcement Learning

This tutorial will cover policy gradient methods in reinforcement learning, with a focus on understanding foundational ideas from an optimization perspective. We will discuss the policy objective in terms of two properties that are critical for convergence rates when using stochastic gradient approaches: variance and curvature. We will explain how the policy objective can be a particularly difficult optimization problem, as it can have large flat regions and stochastic samples of the gradient can have very high variance. We will first explain how to use standard tools from optimization to reduce the variance of the gradient estimate, as well as techniques to mitigate curvature issues. We will then discuss optimization improvements that leverage more knowledge about the objective, including the Markov property and how to modify the state distribution for more coverage. We will discuss how standard Actor-Critic methods with (off-policy) data re-use provide RL-specific variance reduction approaches. We will then conclude with an overview of what is known theoretically about the policy objective, where we discuss the role of entropy-regularization and exploration for mitigating curvature issues.
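As a rough illustration of the variance issue mentioned in the abstract, the sketch below compares score-function (REINFORCE-style) gradient samples with and without a baseline on a two-armed bandit. It is not taken from the tutorial: the bandit, its rewards, and the helper names (`softmax`, `gradient_samples`, `mean`, `variance`) are assumptions made for illustration only.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gradient_samples(logits, rewards, baseline, n, rng):
    """Draw n score-function estimates of d E[reward] / d logits[0]."""
    probs = softmax(logits)
    actions = range(len(probs))
    samples = []
    for _ in range(n):
        a = rng.choices(actions, weights=probs)[0]
        # For a softmax policy, d log pi(a) / d logits[0] = 1[a == 0] - probs[0].
        score = (1.0 if a == 0 else 0.0) - probs[0]
        # Subtracting a constant baseline keeps the estimate unbiased,
        # because the score has zero mean under the policy.
        samples.append((rewards[a] - baseline) * score)
    return samples


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


# Hypothetical bandit: two arms with fixed rewards and a uniform policy.
rng = random.Random(0)
logits = [0.0, 0.0]
rewards = [10.0, 11.0]
expected_reward = sum(p * r for p, r in zip(softmax(logits), rewards))

plain = gradient_samples(logits, rewards, 0.0, 10000, rng)
centered = gradient_samples(logits, rewards, expected_reward, 10000, rng)

print("no baseline:   mean", mean(plain), "variance", variance(plain))
print("with baseline: mean", mean(centered), "variance", variance(centered))
```

Both estimators target the same gradient, but the samples without a baseline are spread far more widely. In this deliberately simple setup the baseline removes the variance entirely; in general a baseline reduces variance rather than eliminating it.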

The tutorial website is

Bios:

Tutorial: David W Hogg · Kate Storey-Fisher

(Track2) Machine Learning for Astrophysics and Astrophysics Problems for Machine Learning

The field of astrophysics has been an avid consumer—and also a developer—of new methods in data science (maybe even dating back to the invention of Bayesian inference). With constantly growing data volumes, increasingly complex and costly physical models, and demand for extremely precise measurements, astrophysics presents opportunities for innovation in machine learning (ML) methods.

In this tutorial, we will give a sense of the myriad connections between astrophysics and ML, and demonstrate that astrophysics is an ideal sandbox for developing and testing ML applications and innovations. We will also discuss areas where vanilla ML methods fail or require extension or elaboration to be competitive with traditional astronomy techniques.

Astronomical data falls into four broad types: imaging, spectroscopy, time series, and catalogs. We will discuss the scientific understandings and precise measurements that we hope to obtain from these data sets, the challenges specific to each of them, and the successes and opportunities for ML applications in these domains. We will demonstrate how to obtain and start working with current leading-edge public data sets of each type. Participants should expect to do hands-on work with the data during the tutorial (we’ll demo with Python and Jupyter, but any platform can play). By the end, we hope that participants will be able to download, visualize, and apply ML algorithms to astronomical data, in ways relevant to current research directions in astrophysics. DWH and KSF thank the members of the Astronomical Data Group at the Flatiron Institute for support with the ideas, code, and content in this tutorial.

Bios:

David W Hogg is Professor of Physics and Data Science at New York University and Group Leader of the Astronomical Data Group at the Flatiron Institute. His work is on computational data analysis in all areas of astrophysics, from extra-solar planet discovery to mapping the dark matter to measuring the expansion history of the Universe.

Kate Storey-Fisher is a PhD candidate in Physics at New York University and a NASA FINESST Fellow. Her research is on the large-scale structure of the universe, focusing on statistical and data-science methods for observational cosmology.

Affinity Poster Session: Joint Affinity Groups Poster Session Mon 7 Dec 12:30 p.m.

The Joint Affinity Groups poster session is a collaborative event between Black in AI, Indigenous in AI, LatinX in AI, Queer in AI, and Women in Machine Learning. This joint poster session will feature 190 posters on a variety of topics in machine learning. Please join us in Gather.Town!

Program book

Tutorial: David Duvenaud · J. Zico Kolter · Matthew Johnson

(Track3) Deep Implicit Layers: Neural ODEs, Equilibrium Models, and Differentiable Optimization

Virtually all deep learning is built upon the notion of explicit computation: layers of a network are written in terms of their explicit step-by-step computations used to map inputs to outputs. But a rising trend in deep learning takes a different approach: implicit layers, where one instead specifies the conditions for a layer’s output to satisfy. Such architectures date back to early work on recurrent networks but have recently gained a great deal of attention as the approach behind Neural ODEs, Deep Equilibrium Models (DEQs), FFJORD, optimization layers, SVAEs, implicit meta-learning, and many other approaches. These methods can have substantial conceptual, computational, and modeling benefits: they often make it much easier to specify simple-yet-powerful architectures, can vastly reduce the memory consumption of deep networks, and allow more natural modeling of e.g. continuous-time phenomena.

This tutorial will provide a unified perspective on implicit layers, illustrating how the implicit modeling framework encompasses all the models discussed above, and providing a practical view of how to integrate such approaches into modern deep learning systems. We will cover the history and motivation of implicit layers, discuss how to solve the resulting "forward" inference problem, and then highlight how to compute gradients through such layers in the backward pass, via implicit differentiation. Throughout, we will highlight several applications of these methods in Neural ODEs, DEQs, and other settings. The tutorial will be accompanied by an interactive monograph on implicit layers: a set of interactive Colab notebooks with code in both the JAX and PyTorch libraries.
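To make the forward solve and the implicit backward pass concrete, here is a minimal scalar sketch. It is not drawn from the tutorial's notebooks: the layer `z = tanh(w * z + x)`, the function names, and the parameter values are assumptions chosen so that plain fixed-point iteration converges.

```python
import math


def solve_fixed_point(w, x, tol=1e-12, max_iter=1000):
    """Forward pass: find z such that z = tanh(w * z + x) by simple iteration."""
    z = 0.0
    for _ in range(max_iter):
        z_next = math.tanh(w * z + x)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z


def implicit_grad_x(w, x):
    """Backward pass: dz/dx at the fixed point via implicit differentiation.

    Differentiating z = tanh(w * z + x) gives dz = s * (w * dz + dx),
    with s = 1 - z**2, so dz/dx = s / (1 - w * s).
    """
    z = solve_fixed_point(w, x)
    s = 1.0 - z * z  # derivative of tanh evaluated at the fixed point
    return s / (1.0 - w * s)


# Hypothetical values; |w| < 1 keeps the iteration a contraction.
w, x = 0.5, 1.0
z_star = solve_fixed_point(w, x)
grad = implicit_grad_x(w, x)

# Finite-difference check of the implicit gradient.
eps = 1e-6
fd = (solve_fixed_point(w, x + eps) - solve_fixed_point(w, x - eps)) / (2 * eps)

print("fixed point:", z_star)
print("implicit gradient:", grad, "finite difference:", fd)
```

The gradient here depends only on the converged value, not on the iterates that produced it, which is one way to read the abstract's point about memory: the backward pass does not need to store the intermediate steps of the solver.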

Bios:

David Duvenaud is an assistant professor in computer science at the University of Toronto. His research focuses on continuous-time models, latent-variable models, and deep learning. He completed his postdoc at Harvard University and his Ph.D. at the University of Cambridge. David also co-founded Invenia, an energy forecasting and trading company.

Zico Kolter is an Assistant Professor in the School of Computer Science at Carnegie Mellon University, and also serves as Chief Scientist of AI Research for the Bosch Center for Artificial Intelligence. His work focuses on the intersection of machine learning and optimization, with a large focus on developing more robust, explainable, and rigorous methods in deep learning. In addition, he has worked on a number of application areas, highlighted by work on sustainability and smart energy systems. He is the recipient of the DARPA Young Faculty Award, and best paper awards at KDD, IJCAI, and PESGM.

Matt Johnson is a research scientist at Google Brain interested in software systems powering machine learning research. He is the tech lead for JAX, a system for composable function transformations in Python. He was a postdoc at Harvard University with Ryan Adams, working on composing graphical models with neural networks and applications in neurobiology. His Ph.D. is from MIT, where he worked with Alan Willsky on Bayesian nonparametrics, time series models, and scalable inference.

Tutorial: Brendan McMahan · Virginia Smith · Peter Kairouz

(Track1) Federated Learning and Analytics: Industry Meets Academia

Federated learning (FL) is a machine learning setting where many clients (e.g. mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g. service provider), while keeping the training data decentralized. Similarly, federated analytics (FA) allows data scientists to generate analytical insight from the combined information in distributed datasets without requiring data centralization. Federated approaches embody the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs resulting from traditional, centralized machine learning and data science approaches.

Motivated by the explosive growth in federated learning and analytics research, this tutorial will provide a gentle introduction to the area. The focus will be on cross-device federated learning, including deep dives on federated optimization and differential privacy, but federated analytics and cross-silo federated learning will also be discussed. In addition to optimization and privacy, we will also introduce personalization, robustness, fairness, and systems challenges in the federated setting with an emphasis on open problems.
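The setting described above, in which clients train locally and a server aggregates the results without seeing raw data, can be sketched with federated averaging, one common federated optimization algorithm that the abstract does not name explicitly. The one-parameter linear model, the toy client datasets, and the function names below are assumptions for illustration, and the sketch omits privacy mechanisms, client sampling, and communication concerns.

```python
def local_update(w, data, lr=0.1, steps=5):
    """Run a few gradient steps on one client's data for the model y = w * x."""
    for _ in range(steps):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_averaging(clients, rounds=50, w=0.0):
    """Server loop: broadcast w, collect local models, average by data size."""
    total = sum(len(data) for data in clients)
    for _ in range(rounds):
        # Only model parameters leave each client; raw examples stay local.
        updates = [(local_update(w, data), len(data)) for data in clients]
        w = sum(w_k * n_k for w_k, n_k in updates) / total
    return w


# Hypothetical clients whose private data all follow y = 2 * x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(0.5, 1.0), (1.5, 3.0), (3.0, 6.0)],
]

w_global = federated_averaging(clients)
print("learned weight:", w_global)
```

Weighting each client's model by its number of examples is a design choice that makes the average reflect the pooled dataset; other weightings are possible when clients differ in reliability or when fairness across clients matters.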

Bios:

Peter Kairouz is a Google Research Scientist working on decentralized, privacy-preserving, and robust machine learning algorithms. Prior to Google, his research largely focused on building decentralized technologies for anonymous broadcasting over complex networks, understanding the fundamental trade-off between differential privacy and utility of learning algorithms, and leveraging state-of-the-art deep generative models for data-driven privacy and fairness.

Tutorial: Himabindu Lakkaraju · Julius Adebayo · Sameer Singh

(Track2) Explaining Machine Learning Predictions: State-of-the-art, Challenges, and Opportunities

As machine learning is deployed in all aspects of society, it has become increasingly important to ensure stakeholders understand and trust these models. Decision makers must have a clear understanding of the model behavior so they can diagnose errors and potential biases in these models, and decide when and how to employ them. However, most accurate models that are deployed in practice are not interpretable, making it difficult for users to understand where the predictions are coming from, and thus, difficult to trust.

Recent work on explanation techniques in machine learning offers an attractive solution: they provide intuitive explanations for “any” machine learning model by approximating complex machine learning models with simpler ones.

In this tutorial, we will discuss several post hoc explanation methods, and focus on their advantages and shortcomings. We will cover three families of techniques: (a) single instance gradient-based attribution methods (saliency maps), (b) model agnostic explanations via perturbations, such as LIME/SHAP and counterfactual explanations, and (c) surrogate modeling for global interpretability, such as MUSE. For each of these approaches, we will provide their problem setup, prominent methods, example applications, and finally, discuss their vulnerabilities and shortcomings. We will conclude the tutorial with an overview of future directions and a discussion on open research problems. We hope to provide a practical and insightful introduction to explainability in machine learning.

Bios:

Hima Lakkaraju is an Assistant Professor at Harvard University focusing on explainability, fairness, and robustness of machine learning models. She has also been working with various domain experts in criminal justice and healthcare to understand the real world implications of explainable and fair ML. Hima has recently been named one of the 35 innovators under 35 by MIT Tech Review, and has received best paper awards at SIAM International Conference on Data Mining (SDM) and INFORMS. She has given invited workshop talks at ICML, NeurIPS, AAAI, and CVPR, and her research has also been covered by various popular media outlets including the New York Times, MIT Tech Review, TIME, and Forbes. For more information, please visit: https://himalakkaraju.github.io/

Julius Adebayo is a Ph.D. student at MIT working on developing and understanding approaches that seek to make machine learning-based systems reliable when deployed. More broadly, he is interested in rigorous approaches to help develop models that are robust to spurious associations and distribution shifts, and that align with 'human' values. Website: https://juliusadebayo.com/

Sameer Singh is an Assistant Professor at UC Irvine working on robustness and interpretability of machine learning. Sameer has presented tutorials and invited workshop talks at EMNLP, NeurIPS, NAACL, WSDM, ICLR, ACL, and AAAI, and received paper awards at KDD 2016, ACL 2018, EMNLP 2019, AKBC 2020, and ACL 2020. Website: http://sameersingh.org/

Invited Talk: Charles Isbell

You Can’t Escape Hyperparameters and Latent Variables: Machine Learning as a Software Engineering Enterprise

Successful technological fields have a moment when they become pervasive, important, and noticed. They are deployed into the world and, inevitably, something goes wrong. A badly designed interface leads to an aircraft disaster. A buggy controller delivers a lethal dose of radiation to a cancer patient. The field must then choose to mature and take responsibility for avoiding the harms associated with what it is producing. Machine learning has reached this moment.

In this talk, I will argue that the community needs to adopt systematic approaches for creating robust artifacts that contribute to larger systems that impact the real human world. I will share perspectives from multiple researchers in machine learning, theory, computer perception, and education; discuss with them approaches that might help us to develop more robust machine-learning systems; and explore scientifically interesting problems that result from moving beyond narrow machine-learning algorithms to complete machine-learning systems.

Bio:

Charles Isbell

Dr. Charles Isbell received his bachelor's in Information and Computer Science from Georgia Tech, and his MS and PhD at MIT's AI Lab. Upon graduation, he worked at AT&T Labs/Research until 2002, when he returned to Georgia Tech to join the faculty as an Assistant Professor. He has served in many roles since returning and is now The John P. Imlay Jr. Dean of the College of Computing. Charles’s research interests are varied but the unifying theme of his work has been using machine learning to build autonomous agents who engage directly with humans. His work has been featured in the popular press, in congressional testimony, and in several technical collections. In parallel, Charles has also pursued reform in computing education. He was a chief architect of Threads, Georgia Tech’s structuring principle for computing curricula. Charles was also an architect for Georgia Tech’s first-of-its-kind MOOC-supported MS in Computer Science. Both efforts have received international attention, and been presented in the academic and popular press. In all his roles, he has continued to focus on issues of broadening participation in computing, and is the founding Executive Director for the Constellations Center for Equity in Computing. He is an AAAI Fellow and a Fellow of the ACM. Appropriately, his citation for ACM Fellow reads “for contributions to interactive machine learning; and for contributions to increasing access and diversity in computing”.

Orals & Spotlights Track 03: Language/Audio Applications Mon 7 Dec 06:00 p.m.

Anshumali Shrivastava · Dilek Hakkani-Tur

Language Models are Few-Shot Learners

Tom B Brown · Benjamin Mann · Nick Ryder · Melanie Subbiah · Jared Kaplan · Prafulla Dhariwal · Arvind Neelakantan · Pranav Shyam · Girish Sastry · Amanda Askell · Sandhini Agarwal · Ariel Herbert-Voss · Gretchen M Krueger · Tom Henighan · Rewon Child · Aditya Ramesh · Daniel Ziegler · Jeffrey Wu · Clemens Winter · Chris Hesse · Mark Chen · Eric Sigler · Mateusz Litwin · Scott Gray · Benjamin Chess · Jack Clark · Christopher Berner · Sam McCandlish · Alec Radford · Ilya Sutskever · Dario Amodei

[ Orals & Spotlights: Language/Audio Applications ]

Expo Demonstration: GAN Applications in Fashion Article Design and Outfit Rendering Mon 7 Dec 12:00 a.m.

Tutorial: Jelani Nelson

Tutorial: Risi Kondor · Taco Cohen

Tutorial: Marc Deisenroth · Cheng Soon Ong

Tutorial: Marta Garnelo · David Balduzzi · Wojciech Czarnecki

Affinity Workshop: New In ML Mon 7 Dec 03:00 a.m.

Tutorial: Jane Wang · Kevin Miller · Adam Marblestone

Tutorial: Praveen Chandar · Fernando Diaz · Brian St. Thomas

Affinity Workshop: Black in AI Mon 7 Dec 06:00 a.m.

Tutorial: Dustin Tran · Balaji Lakshminarayanan · Jasper Snoek

Tutorial: Sergey Levine · Aviral Kumar

Affinity Workshop: LXAI Research @ NeurIPS 2020 Mon 7 Dec 08:00 a.m.

Orals & Spotlights Track 01: Representation/Relational Mon 7 Dec 06:00 p.m.

Orals & Spotlights Track 02: COVID/Health/Bio Applications Mon 7 Dec 06:00 p.m.

Orals & Spotlights Track 04: Reinforcement Learning Mon 7 Dec 06:00 p.m.

Poster Session 1 Mon 7 Dec 09:00 p.m.