
Combinatorial Optimization as a Layer / Blackbox Differentiation

Schema depicting an architecture with an embedded solver (top). Thanks to our surrogate gradient, training can proceed as usual with stochastic gradient descent. The bottom row illustrates the applications we have tackled so far: keypoint correspondence in pairs of images, directly optimizing rank-based losses for improved image retrieval, and embedded planning for stronger generalization in reinforcement learning.
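The surrogate gradient behind this line of work (Vlastelica et al., ICLR 2020) is simple to state: call the solver a second time on a perturbed input and take the scaled difference of the two solutions. A minimal NumPy sketch with a toy solver; the top-k solver, λ, and loss here are illustrative, not taken from the papers:

```python
import numpy as np

def solver(w):
    # Toy combinatorial solver: select the k = 2 items with the lowest
    # cost, i.e. a minimizer of <w, y> over all 2-hot indicator vectors.
    k = 2
    y = np.zeros_like(w)
    y[np.argsort(w)[:k]] = 1.0
    return y

def blackbox_grad(w, grad_output, lam=10.0):
    # Surrogate gradient: perturb the solver input by lam * dL/dy,
    # re-solve, and return the scaled difference of the two solutions.
    y = solver(w)
    y_perturbed = solver(w + lam * grad_output)
    return -(y - y_perturbed) / lam

# Illustrative loss L(y) = -y[0] (we would like item 0 selected),
# so dL/dy = [-1, 0, 0, 0].
w = np.array([3.0, 1.0, 2.0, 0.5])
grad_output = np.array([-1.0, 0.0, 0.0, 0.0])
g = blackbox_grad(w, grad_output)
# Descending along g lowers the cost w[0], pushing item 0 into the solution.
```

Note that both passes only require calling the solver as a blackbox; this is what allows unmodified, heavily optimized solvers to sit inside the network.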

Publications

Autonomous Learning Conference Paper CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints Paulus, A., Rolínek, M., Musil, V., Amos, B., Martius, G. In Proceedings of the 38th International Conference on Machine Learning, 139:8443-8453, Proceedings of Machine Learning Research, (Editors: Meila, Marina and Zhang, Tong), PMLR, The Thirty-eighth International Conference on Machine Learning (ICML), July 2021
Bridging logical and algorithmic reasoning with modern machine learning techniques is a fundamental challenge with potentially transformative impact. On the algorithmic side, many NP-hard problems can be expressed as integer programs, in which the constraints play the role of their "combinatorial specification." In this work, we aim to integrate integer programming solvers into neural network architectures as layers capable of learning both the cost terms and the constraints. The resulting end-to-end trainable architectures jointly extract features from raw data and solve a suitable (learned) combinatorial problem with state-of-the-art integer programming solvers. We demonstrate the potential of such layers with an extensive performance analysis on synthetic data and with a demonstration on a competitive computer vision keypoint matching benchmark.
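As a minimal illustration of such a layer's forward pass (a sketch only: the cost and constraint below are hypothetical stand-ins for learned quantities, and the paper uses state-of-the-art ILP solvers rather than brute force):

```python
import itertools
import numpy as np

def ilp_layer_forward(c, A, b, lo=0, hi=2):
    # Minimize c @ y subject to A @ y <= b over integer vectors
    # y in {lo, ..., hi}^n. A real layer would call a dedicated ILP
    # solver; we brute-force the small box purely for illustration.
    n = c.shape[0]
    best, best_val = None, np.inf
    for cand in itertools.product(range(lo, hi + 1), repeat=n):
        y = np.array(cand, dtype=float)
        if np.all(A @ y <= b) and c @ y < best_val:
            best, best_val = y, c @ y
    return best

# Hypothetical learned cost and constraint (illustrative numbers):
c = np.array([-1.0, -1.0])   # maximize y0 + y1 ...
A = np.array([[1.0, 1.0]])   # ... subject to y0 + y1 <= 2
b = np.array([2.0])
y = ilp_layer_forward(c, A, b)
```

In the paper, gradients flow back not only into the cost c but also into the constraint data (A, b), which is what lets the network learn the "combinatorial specification" itself.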

Autonomous Learning Conference Paper Neuro-algorithmic Policies Enable Fast Combinatorial Generalization Vlastelica, M., Rolinek, M., Martius, G. In Proceedings of the 2021 International Conference on Machine Learning (ICML), The Thirty-eighth International Conference on Machine Learning (ICML), July 2021
Although model-based and model-free approaches to learning the control of systems have achieved impressive results on standard benchmarks, generalization to task variations is still lacking. Recent results suggest that generalization for standard architectures improves only after obtaining exhaustive amounts of data. We give evidence that generalization capabilities are in many cases bottlenecked by the inability to generalize on the combinatorial aspects of the problem. We show that, for a certain subclass of the MDP framework, this can be alleviated by a neuro-algorithmic policy architecture that embeds a time-dependent shortest path solver in a deep neural network. Trained end-to-end via blackbox-differentiation, this method leads to considerable improvement in generalization capabilities in the low-data regime.
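A toy version of the embedded planner, under stated simplifications: the paper embeds a time-dependent shortest-path solver, whereas here a plain down/right grid shortest path (dynamic programming) stands in, and the cost map is hand-written rather than predicted by a network:

```python
import numpy as np

def shortest_path_grid(costs):
    # Cheapest top-left -> bottom-right path with down/right moves,
    # paying costs[i, j] on entering cell (i, j); returns the path as
    # an indicator matrix, the natural output format for a solver layer.
    h, w = costs.shape
    dist = np.full((h, w), np.inf)
    dist[0, 0] = costs[0, 0]
    for i in range(h):
        for j in range(w):
            if i > 0:
                dist[i, j] = min(dist[i, j], dist[i - 1, j] + costs[i, j])
            if j > 0:
                dist[i, j] = min(dist[i, j], dist[i, j - 1] + costs[i, j])
    # Backtrack from the goal to recover the optimal path.
    path = np.zeros_like(costs)
    i, j = h - 1, w - 1
    path[i, j] = 1.0
    while (i, j) != (0, 0):
        if i > 0 and np.isclose(dist[i - 1, j] + costs[i, j], dist[i, j]):
            i -= 1
        else:
            j -= 1
        path[i, j] = 1.0
    return path

# Illustrative cost map; in the architecture, costs come from a network:
costs = np.array([[1.0, 1.0, 1.0],
                  [9.0, 9.0, 1.0],
                  [9.0, 9.0, 1.0]])
path = shortest_path_grid(costs)
```

The policy then executes the first step of the planned path, and training differentiates through the solver with the blackbox surrogate gradient.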

Autonomous Learning Conference Paper Deep Graph Matching via Blackbox Differentiation of Combinatorial Solvers Rolínek, M., Swoboda, P., Zietlow, D., Paulus, A., Musil, V., Martius, G. In Computer Vision – ECCV 2020, 28:407-424, Lecture Notes in Computer Science, 12373, (Editors: Vedaldi, Andrea and Bischof, Horst and Brox, Thomas and Frahm, Jan-Michael), Springer, Cham, 16th European Conference on Computer Vision (ECCV 2020), August 2020 (Published)
Building on recent progress at the intersection of combinatorial optimization and deep learning, we propose an end-to-end trainable architecture for deep graph matching that contains unmodified combinatorial solvers. Using the presence of heavily optimized combinatorial solvers together with some improvements in architecture design, we advance state-of-the-art on deep graph matching benchmarks for keypoint correspondence. In addition, we highlight the conceptual advantages of incorporating solvers into deep learning architectures, such as the possibility of post-processing with a strong multi-graph matching solver or the indifference to changes in the training setting. Finally, we propose two new challenging experimental setups.
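For intuition, the matcher's forward pass can be sketched with a brute-force linear assignment in place of the heavily optimized graph-matching solvers used in the paper; the affinity matrix and problem size here are illustrative:

```python
import itertools
import numpy as np

def match_keypoints(affinity):
    # Find the one-to-one matching (permutation) that maximizes the
    # total pairwise affinity between keypoints of the two images.
    # Brute force over permutations, fine for this tiny example.
    n = affinity.shape[0]
    best_perm, best_score = None, -np.inf
    for perm in itertools.permutations(range(n)):
        score = sum(affinity[i, p] for i, p in enumerate(perm))
        if score > best_score:
            best_perm, best_score = perm, score
    # Return the matching as a permutation matrix.
    X = np.zeros_like(affinity)
    for i, p in enumerate(best_perm):
        X[i, p] = 1.0
    return X

# Illustrative affinities between two keypoints in each image:
affinity = np.array([[5.0, 1.0],
                     [1.0, 5.0]])
X = match_keypoints(affinity)
```

Because the solver is treated as a blackbox, the same surrogate gradient applies unchanged when swapping in a stronger (e.g. multi-graph) matching solver.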

Autonomous Learning Conference Paper Optimizing Rank-based Metrics with Blackbox Differentiation Rolínek, M., Musil, V., Paulus, A., Vlastelica, M., Michaelis, C., Martius, G. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020), 7617 - 7627, IEEE, Piscataway, NJ, IEEE/CVF International Conference on Computer Vision and Pattern Recognition (CVPR 2020), June 2020, Best paper nomination (Published)
Rank-based metrics are some of the most widely used criteria for performance evaluation of computer vision models. Despite years of effort, direct optimization for these metrics remains a challenge due to their non-differentiable and non-decomposable nature. We present an efficient, theoretically sound, and general method for differentiating rank-based metrics with mini-batch gradient descent. In addition, we address optimization instability and sparsity of the supervision signal that both arise from using rank-based metrics as optimization targets. Resulting losses based on recall and Average Precision are applied to image retrieval and object detection tasks. We obtain performance that is competitive with state-of-the-art on standard image retrieval datasets and consistently improve performance of near state-of-the-art object detectors.
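The rank operator itself is piecewise constant, which is the root of the non-differentiability; the blackbox-differentiation recipe applies to it directly. A sketch with an illustrative λ and loss gradient, not the paper's exact scheme:

```python
import numpy as np

def ranks(scores):
    # ranks(s)[i] = rank of item i when scores are sorted descending
    # (0 = best). Piecewise constant in the scores, so any metric built
    # on it has zero gradient almost everywhere.
    return np.argsort(np.argsort(-scores)).astype(float)

def ranks_backward(scores, grad_output, lam=1.0):
    # The blackbox-differentiation recipe applied to the rank operator:
    # perturb the scores by lam * dL/dranks, re-rank, and take the
    # scaled difference of the two rankings.
    r = ranks(scores)
    r_perturbed = ranks(scores + lam * grad_output)
    return -(r - r_perturbed) / lam

# Illustrative loss gradient: the loss grows with the rank of item 0,
# i.e. we would like item 0 ranked higher.
scores = np.array([0.5, 2.0, 1.0])
grad_output = np.array([1.0, 0.0, 0.0])
g = ranks_backward(scores, grad_output)
# Descending along g raises scores[0], improving item 0's rank.
```

Losses such as recall or Average Precision are then expressed in terms of these ranks and optimized with ordinary mini-batch gradient descent.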

Autonomous Learning Conference Paper Differentiation of Blackbox Combinatorial Solvers Vlastelica*, M., Paulus*, A., Musil, V., Martius, G., Rolínek, M. In International Conference on Learning Representations, ICLR’20, May 2020, *Equal Contribution