ELLIS Scientific Symposium
Location: MPI-IS Tübingen, Room N0.002
All current employees of the Max Planck Institute for Intelligent Systems and partner institutions (such as the AI Center at the University of Tübingen) are welcome to attend this event.
If you have any questions, please contact Carmela Rianna, ELLIS Research Coordinator, at carmela.rianna@tuebingen.mpg.de

Friday March 31st

16:30-17:15
Christian Schroeder de Witt (University of Oxford)
Bringing Multi-Agent Learning to Societal Impact: From Steganography to Solar Geoengineering
Chair: Bernhard Schölkopf

Abstract
In recent years, significant progress has been made in the field of deep multi-agent learning, particularly in recreational games like Go, Dota 2, or StarCraft. In this talk, however, I aim to focus on its potential for generating societal impact. I will present a range of applications, from my recent breakthrough research that yielded the world's first perfectly secure generative steganography algorithm, to the robustification of human-AI systems against illusory attacks, to the optimization of solar geoengineering. Through these examples, I will demonstrate how multi-agent learning can raise new questions and provide answers to some of the biggest societal challenges we face today.

Biography
Dr. Christian Schroeder de Witt is an artificial intelligence researcher specialising in fundamental research on multi-agent control in high-dimensional settings. He has authored a variety of highly influential research works and is pioneering numerous real-world applications of deep multi-agent reinforcement learning, ranging from steganography to climate-economics models and solar geoengineering. As part of various industry collaborations, he has previously worked on AI for autonomous drone control as well as on automated cybersecurity defence systems. Christian currently holds a postdoctoral researcher position at the University of Oxford, UK.

Monday April 3rd

09:15-09:30
Welcome by Bernhard Schölkopf

09:30-10:15
Antonio Orvieto (ETH Zürich)
Optimization Challenges in Modern Deep Models for Long-range Reasoning
Chair: Bernhard Schölkopf

Abstract
State-of-the-art architectures for sequence generation and understanding, based on attention or recurrent units, are often hard to optimize and tune to optimal performance. In this talk, we focus on understanding these challenges in the recently proposed S4 model: a successful deep transformer-like architecture introduced in 2022 that utilizes linear continuous-time dynamical systems as token-mixing components. S4 achieves state-of-the-art performance on Long Range Arena, a Google benchmark for sequence classification, surpassing attention models by a large margin. However, despite being motivated by the theory of optimal polynomial projections, S4's superior performance compared to simpler deep recurrent models, such as deep LSTMs, has perplexed researchers. Drawing on insights from optimization theory and Koopman operators, we are able to identify the primary sources of S4's success and to recover its performance with a much simpler architecture, which we call the Linear Recurrent Unit (LRU). The insights derived in our work provide a set of best practices for the initialization and parametrization of fast and accurate modular architectures for the understanding and generation of long sequence data (e.g., for NLP, genomics, music generation, etc.). To conclude, we connect the optimization problems faced in S4 to those of other modern networks (e.g., large language models) and outline a few directions for future investigations. (https://arxiv.org/abs/2303.06349)

Biography
My name is Antonio, and I come from the beautiful city of Venice, Italy. I am a final-year Ph.D. student at ETH Zurich, supervised by Thomas Hofmann. My research focuses on the intersection between mathematical optimization and large-scale machine learning. Specifically, my work centers around the theory of accelerated optimization, the dynamics and generalization properties of stochastic gradient descent in high-dimensional non-convex models, and the interaction between adaptive optimizers and the structure/initialization of deep neural networks. I am highly excited by the impact of deep learning on the future of science and technology. My goal is to assist scientists and engineers by contributing to the development of a deep-learning toolbox that is theoretically sound, clearly outlining architectural choices and the best practices for training. To this end, my research experiences at DeepMind, Meta, MILA, and INRIA Paris allowed me to gain valuable insights into the practical challenges associated with training modern deep networks. In the future, I aim to further combine my passion for applications with my theoretical background by developing innovative, architecture-aware optimizers that can enhance generalization and accelerate training in various classes of deep learning models, with a particular focus on sequence modeling and understanding.
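
To make the idea of linear recurrences as token-mixing components concrete, here is a minimal, illustrative sketch of an LRU-style diagonal linear recurrence in NumPy. It follows the description in the abstract and the linked paper only loosely; parameter names, shapes, and initialization ranges are simplifications chosen for readability, not the authors' reference implementation.

```python
import numpy as np

def lru_scan(u, nu, theta, B, C, D):
    """Minimal diagonal linear recurrence in the spirit of the LRU (illustrative only).

    u:         (L, H) real-valued input sequence
    nu, theta: (N,)   log-magnitude and phase parameters of the diagonal transition
    B:         (N, H) complex input projection
    C:         (H, N) complex output projection
    D:         (H, H) real skip connection
    """
    lam = np.exp(-np.exp(nu) + 1j * theta)   # eigenvalues strictly inside the unit disk
    x = np.zeros(B.shape[0], dtype=np.complex128)
    ys = []
    for u_k in u:                            # x_k = diag(lam) x_{k-1} + B u_k
        x = lam * x + B @ u_k
        ys.append((C @ x).real + D @ u_k)    # y_k = Re(C x_k) + D u_k
    return np.stack(ys)

# Tiny usage example with random (hypothetical) parameters.
rng = np.random.default_rng(0)
L, H, N = 16, 4, 8
y = lru_scan(
    rng.normal(size=(L, H)),
    nu=rng.uniform(-1.0, 0.0, N),
    theta=rng.uniform(0.0, 2 * np.pi, N),
    B=(rng.normal(size=(N, H)) + 1j * rng.normal(size=(N, H))) / np.sqrt(2 * H),
    C=(rng.normal(size=(H, N)) + 1j * rng.normal(size=(H, N))) / np.sqrt(N),
    D=rng.normal(size=(H, H)),
)
print(y.shape)  # (16, 4)
```

In practice such recurrences are computed with parallel scans and wrapped in the usual normalization and MLP blocks; the sketch only shows the token-mixing core discussed in the abstract.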

10:15-11:00
Celestine Mendler-Dünner (Max Planck Institute for Intelligent Systems)
Machine Learning at Societal Scale: On the Role of Performativity, Power and Collective Action
Chair: Bernhard Schölkopf

Abstract
Machine learning powers services, platforms, infrastructure, and markets at societal scale. This poses new challenges for the design and study of machine learning systems. In this talk, I will start by focusing on the social consequences of prediction at scale. I will introduce a risk minimization framework, called performative prediction, that conceptualizes the impact of predictive models on populations. Building on this technical tool, I will discuss two optimization results that emerge from the dynamic perspective on the learning problem. In the second part of the talk, I will discuss the role of economic power in machine learning and how it changes the meaning of optimization targets. I will propose a tool to measure power in digital economies, and discuss algorithmic collective action as an avenue for the population to counter power imbalances. Finally, I will conclude by discussing my broader ambition to build theory and tools that contribute towards a healthy digital ecosystem.

Biography
Celestine Mendler-Dünner is a research group leader at the Max Planck Institute for Intelligent Systems in Tübingen. Prior to joining MPI, she spent two years as an SNSF postdoctoral fellow at the University of California, Berkeley, hosted by Moritz Hardt. She conducted her PhD in Computer Science at ETH Zurich, co-affiliated with IBM Research and advised by Thomas Hofmann. Her research interests revolve around machine learning at societal scale, both from the perspective of building scalable systems and of understanding their impact on society. She has been a lead developer of the IBM Snap ML library, and her dissertation was awarded the ETH medal and the Fritz Kutter Prize.
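
For readers who want a pointer to the formalism mentioned above, a standard way to summarize performative prediction is via the performative risk and the repeated risk minimization dynamic; the notation below is a generic summary, not taken verbatim from the talk.

```latex
% Performative risk: the deployed model \theta shifts the data distribution D(\theta).
\mathrm{PR}(\theta) \;=\; \mathbb{E}_{z \sim D(\theta)}\left[\ell(z;\theta)\right]

% Repeated risk minimization: retrain on the distribution induced by the previous model.
\theta_{t+1} \;=\; \arg\min_{\theta}\; \mathbb{E}_{z \sim D(\theta_t)}\left[\ell(z;\theta)\right]
```

Roughly speaking, under suitable sensitivity and convexity assumptions this iteration converges to a performatively stable point, which need not minimize the performative risk itself; results of this flavor are presumably what the dynamic perspective in the abstract refers to.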

11:00-11:30
Break

11:30-12:15
Mirella Lapata (University of Edinburgh)
Conditional Generation with a Question-Answering Blueprint
Chair: Moritz Hardt

Abstract
The ability to convey relevant and faithful information is critical for many tasks in conditional generation, yet it remains elusive for neural sequence-to-sequence models, whose outputs often reveal hallucinations and fail to correctly cover important details. In this work, we advocate planning as a useful intermediate representation for rendering conditional generation less opaque and more grounded. Our work proposes a new conceptualization of text plans as a sequence of question-answer (QA) pairs. We enhance existing datasets (e.g., for summarization) with a QA blueprint operating as a proxy for both content selection (i.e., what to say) and planning (i.e., in what order). We obtain blueprints automatically by exploiting state-of-the-art question generation technology and convert input-output pairs into input-blueprint-output tuples. We develop Transformer-based models, each varying in how they incorporate the blueprint in the generated output (e.g., as a global plan or iteratively). Evaluation across metrics and datasets demonstrates that blueprint models are more factual than alternatives which do not resort to planning and allow tighter control of the generation output.

Biography
Mirella Lapata is Professor of Natural Language Processing in the School of Informatics at the University of Edinburgh. Her research focuses on getting computers to understand, reason with, and generate natural language. She is the first recipient (2009) of the British Computer Society and Information Retrieval Specialist Group (BCS/IRSG) Karen Sparck Jones Award and a Fellow of the Royal Society of Edinburgh, the ACL, and Academia Europaea.
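
To illustrate the data structure described above, here is a small, entirely hypothetical input-blueprint-output tuple; the text and field names are invented for illustration and do not come from the datasets used in this work.

```python
# Hypothetical input-blueprint-output tuple for blueprint-based summarization.
example = {
    "input": (
        "The city council voted 7-2 on Monday to expand the bike-lane network by 12 km, "
        "funded by a regional mobility grant of 3.4 million euros."
    ),
    "blueprint": [  # QA pairs acting as content selection (what to say) and plan (in what order)
        ("What did the council decide?", "to expand the bike-lane network by 12 km"),
        ("How was the vote split?", "7-2"),
        ("How is the expansion funded?", "by a 3.4 million euro regional mobility grant"),
    ],
    "output": (
        "The council approved a 12 km bike-lane expansion by a 7-2 vote, "
        "paid for with a 3.4 million euro mobility grant."
    ),
}
```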

12:15-13:00
Francesco Locatello (Amazon AWS Tübingen Lablet)
Causal Representation Learning
Chair: Moritz Hardt

Abstract
Causal representation learning tackles the problem of discovering high-level variables and their relations from low-level observations, and aligns with the general goal of learning meaningful data representations that are also robust, explainable, and fair. In this talk, I will discuss opportunities and challenges in discovering latent structure and causal relations from data. First, I will discuss the identifiability of causal and disentangled representations. Second, I will question whether neural networks can represent abstract causal variables and introduce Slot Attention, an architectural interface between distributed representations and sets of high-level variables. Third, I will explore how advances in machine learning enable a new generation of causal discovery algorithms. Finally, I will present my future plans for causal representation learning and, more generally, for broadening the applicability of causal models in machine learning.

Biography
Dr. Francesco Locatello is a Senior Applied Scientist at Amazon AWS, where he leads the Causal Representation Learning research team. He obtained his Ph.D. at ETH Zurich in 2020, supervised by Gunnar Rätsch (ETH Zurich) and Bernhard Schölkopf (Max Planck Institute for Intelligent Systems). He held doctoral fellowships at the Max Planck ETH Center for Learning Systems and at ELLIS, and received the Google Ph.D. Fellowship in Machine Learning in 2019. His research has won several awards, including the best paper award at ICML 2019, the ETH medal for outstanding doctoral dissertation, and the 2023 Hector-Stiftung prize.
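
Since the abstract introduces Slot Attention only by name, the following simplified, single-head PyTorch sketch shows the mechanism it refers to: a fixed set of slots iteratively competes for the input features via attention normalized over the slots. It is an illustrative reconstruction, not the original implementation, and omits details such as the learned initialization noise and the MLP refinement step.

```python
import torch
import torch.nn as nn

class SlotAttentionSketch(nn.Module):
    """Simplified, single-head Slot Attention step (illustrative sketch only)."""

    def __init__(self, dim, num_slots, iters=3):
        super().__init__()
        self.iters = iters
        self.scale = dim ** -0.5
        self.slot_init = nn.Parameter(torch.randn(1, num_slots, dim))
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)
        self.gru = nn.GRUCell(dim, dim)
        self.norm_inputs = nn.LayerNorm(dim)
        self.norm_slots = nn.LayerNorm(dim)

    def forward(self, inputs):                       # inputs: (B, N, dim) distributed features
        B, _, dim = inputs.shape
        inputs = self.norm_inputs(inputs)
        k, v = self.to_k(inputs), self.to_v(inputs)
        slots = self.slot_init.expand(B, -1, -1)     # shared initial slots
        for _ in range(self.iters):
            q = self.to_q(self.norm_slots(slots))
            logits = torch.einsum('bnd,bkd->bnk', k, q) * self.scale
            attn = logits.softmax(dim=-1)            # softmax over slots: slots compete for inputs
            attn = attn / attn.sum(dim=1, keepdim=True).clamp(min=1e-8)
            updates = torch.einsum('bnk,bnd->bkd', attn, v)
            slots = self.gru(updates.reshape(-1, dim),
                             slots.reshape(-1, dim)).reshape(B, -1, dim)
        return slots                                 # (B, K, dim): candidate high-level variables

# Usage: map 100 feature vectors to 5 slot vectors.
slots = SlotAttentionSketch(dim=64, num_slots=5)(torch.randn(2, 100, 64))
print(slots.shape)  # torch.Size([2, 5, 64])
```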

13:00-13:45
Break

13:45-14:30
Johannes Kirschner (University of Alberta)
Optimal Decision-Making with Information-Directed Sampling
Chair: Matthias Hein

Abstract
Sequential decision-making is a machine-learning paradigm in which a learning system interacts with an environment by taking actions and receiving feedback. This abstraction applies to a wide range of practical problems of significant interest, e.g. in the control of physical systems, data-driven optimization, and machine teaching. A key aspect of sequential decision-making is to optimally balance the cost of acquiring data and the value it provides for future decisions. Information-directed sampling (IDS) is a versatile algorithm for sequential decision-making that directly optimizes this information trade-off. In this talk, I introduce the frequentist IDS framework and review recent advances in the theory of information-directed sampling. By establishing a connection to primal-dual optimization, I formally show that IDS is the first approach that achieves both instance-dependent and worst-case optimality in the linear bandit model. Moreover, IDS naturally extends to more general feedback models, thereby covering a wide range of standard settings (e.g. ranking feedback) as special cases, all with a single, practical algorithm and a unified analysis. I will finish the talk by outlining my future plans in this emerging area of research.

Biography
Johannes Kirschner is a postdoctoral fellow with Prof. Csaba Szepesvári at the University of Alberta, supported by an Early Postdoc.Mobility fellowship of the Swiss National Science Foundation. Before joining the University of Alberta, Johannes obtained his PhD at ETH Zurich with Prof. Andreas Krause. Johannes' research focuses on reinforcement learning algorithms, experimental design, and data-driven decision-making. His work is widely recognized at top venues (COLT, NeurIPS, ICML, ...) and spans from theoretical foundations to challenging real-world applications. Johannes actively serves as a reviewer at major machine learning conferences and journals, and he acted as an associate chair for ICML 2022.
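
As a schematic summary of the trade-off mentioned above, a common form of the (frequentist) IDS selection rule chooses the action distribution minimizing the ratio of squared estimated regret to information gain; the exact surrogates for the regret and information terms vary by setting, so the formula below is only indicative.

```latex
% Information-directed sampling: trade off estimated regret against information gain.
\pi_t \;\in\; \arg\min_{\pi \in \Delta(\mathcal{A})}
  \frac{\Big(\sum_{a \in \mathcal{A}} \pi(a)\,\hat{\Delta}_t(a)\Big)^{2}}
       {\sum_{a \in \mathcal{A}} \pi(a)\, I_t(a)}
```

Here \hat{\Delta}_t(a) denotes an estimate (or upper bound) of the regret of action a, and I_t(a) an information-gain measure.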

14:30-15:15
Jonas Geiping (University of Maryland)
Why and When is Machine Learning Unsafe?
Chair: Matthias Hein

Abstract
This talk will give an overview of three topics in the security and safety of ML. I first want to discuss security issues during model training, such as in the popular decentralized training paradigm of federated learning. I will discuss user-centric privacy in federated learning and show attacks that breach privacy and recover user data.

Biography
Jonas works as a postdoctoral researcher in computer science at the University of Maryland, College Park. His background is in mathematics, more specifically in mathematical optimization, and he is now interested not only in research that intersects current deep learning and mathematical optimization in general, but especially in the implications of optimization in machine learning for the design of secure and private ML systems.
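
The attacks mentioned above, which recover user data from shared gradients in federated learning, are typically posed as an optimization problem: find dummy inputs whose gradients match the observed update. The PyTorch snippet below is a generic, simplified sketch of this idea with hypothetical model and loss placeholders, not the speaker's actual attack code.

```python
import torch

def gradient_inversion(model, loss_fn, observed_grads, x_shape, y, steps=300, lr=0.1):
    """Optimize a dummy input so that its gradients match an observed (shared) gradient.

    observed_grads: list of tensors, the parameter gradients reported by a client
    x_shape:        shape of the dummy input to reconstruct
    y:              assumed (or separately recovered) label tensor
    """
    x_dummy = torch.randn(x_shape, requires_grad=True)
    opt = torch.optim.Adam([x_dummy], lr=lr)
    params = [p for p in model.parameters() if p.requires_grad]

    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(x_dummy), y)
        grads = torch.autograd.grad(loss, params, create_graph=True)
        # Squared gradient-matching objective; cosine distance is a common alternative.
        mismatch = sum(((g - og) ** 2).sum() for g, og in zip(grads, observed_grads))
        mismatch.backward()
        opt.step()
    return x_dummy.detach()
```

How well such reconstructions work in practice depends strongly on batch size, model architecture, and any defenses applied to the update.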

15:15-15:45
Break

15:45-16:30
Frank Hutter (University of Freiburg)
Deep Learning 2.0: Meta-Learning the Next Generation of Learning Methods
Chair: Katherine J. Kuchenbecker

Abstract
Throughout the history of AI, there is a clear pattern that manual elements of AI methods are eventually replaced by better-performing, automatically-found ones; for example, deep learning (DL) replaced manual feature engineering with learned representations. The logical next step in representation learning is to also (meta-)learn the best architectures for these representations, as well as the best algorithms for learning them. In this talk, I will discuss several works along these lines from the field of automated machine learning (AutoML). Specifically, I will discuss the efficiency of AutoML, its relationship to foundation models, its ability to democratize machine learning, and its extension to optimizing various dimensions of trustworthiness (such as algorithmic fairness, robustness, and uncertainty calibration). Finally, taking the idea of meta-learning to the extreme, I will take a deep dive into a novel approach that learns an entire classification algorithm for small tabular datasets, achieving a new state of the art at the cost of a single forward pass.

Biography
Frank Hutter is a Full Professor for Machine Learning at the University of Freiburg (Germany), as well as Chief Expert AutoML at the Bosch Center for Artificial Intelligence. Frank holds a PhD from the University of British Columbia (UBC, 2009). He received the 2010 CAIAC doctoral dissertation award for the best thesis in AI in Canada, as well as several best paper awards and prizes in international ML competitions. He is a Fellow of ELLIS and EurAI, Director of the ELLIS unit Freiburg, and the recipient of three ERC grants. Frank is best known for his research on automated machine learning (AutoML), including neural architecture search, efficient hyperparameter optimization, and meta-learning. He co-authored the first book on AutoML and the prominent AutoML tools Auto-WEKA, Auto-sklearn, and Auto-PyTorch, won the first two AutoML challenges with his team, co-organized fifteen AutoML-related workshops at ICML, NeurIPS, and ICLR, and founded the AutoML conference as general chair in 2022.

16:30-17:15
Georgios Pavlakos (UC Berkeley)
Perceiving Humans in 4D
Chair: Katherine J. Kuchenbecker

Abstract
From the moment we open our eyes, we are surrounded by people. By observing the people around us, we learn how to interact with them and the world. To create intelligent agents with similar capabilities, it is crucial to endow them with a perceptual system that can interpret and understand human behavior from visual observations. These observations are streams of two-dimensional images; however, the actual underlying state of humans is 4D: they have 3D bodies that move over time. In this talk, I will present my work on perceiving humans in 4D from video. This includes estimating their articulated 3D body pose, tracking them over time, and recovering a 4D reconstruction that is consistent with their spatial environment. I will highlight the limitations of systems that operate only in the space of image pixels and showcase the benefits of reasoning in 4D.

Biography
Georgios Pavlakos is a Postdoctoral Scholar at UC Berkeley, advised by Angjoo Kanazawa and Jitendra Malik. His research interests include computer vision, machine learning, and robotics. He completed his PhD in Computer Science at the University of Pennsylvania with his advisor, Kostas Daniilidis. He has spent time at the Max Planck Institute for Intelligent Systems with Michael Black and at Facebook Reality Labs. His PhD dissertation received the Morris and Dorothy Rubinoff Award for the Best Computer Science Dissertation at UPenn.

Tuesday April 4th

09:15-10:00
Markus Wulfmeier (Google DeepMind)
Robot Learning in the Age of Foundation Models
Chair: Bernhard Schölkopf

Abstract
The recent, vast progress in artificial intelligence, in particular in deep learning for vision and language processing, has not been fully matched in the control of real-world systems such as autonomous vehicles and other robotic platforms. While we have made strides on the perception side based on knowledge extracted from immense, web-based datasets, the decision-making problem remains a challenge. This talk focuses on a critical limitation: the availability of diverse and relevant data for control. I will discuss methods that improve the acquisition of data and the transfer of data-driven knowledge to address this problem. We will conclude with questions about the role of robot learning at a time of ever-growing datasets and machine learning models.

Biography
Markus Wulfmeier is a researcher at Google DeepMind working on machine learning for robotics, with a focus on data-driven knowledge transfer. His work aims at efficiently scalable algorithms applicable across a variety of robotic platforms, including quadrupeds, bipeds, and different configurations of robotic arms. He has been a postdoctoral research scientist at the Oxford Robotics Institute as well as a member of Oxford University’s New College, where he completed his PhD. He previously held visiting scholar positions with UC Berkeley, the Massachusetts Institute of Technology, and the Swiss Federal Institute of Technology. His work has received multiple awards, including the Best Student Paper award at IROS 2016.

Organizers
Tübingen
+49 7071 601 551