Publications

Empirical Inference Article The contributions of color to recognition memory for natural scenes Wichmann, F., Sharpe, L., Gegenfurtner, K. Journal of Experimental Psychology: Learning, Memory and Cognition, 28(3):509-520, May 2002
The authors used a recognition memory paradigm to assess the influence of color information on visual memory for images of natural scenes. Subjects performed 5-10% better for colored than for black-and-white images independent of exposure duration. Experiment 2 indicated little influence of contrast once the images were suprathreshold, and Experiment 3 revealed that performance worsened when images were presented in color and tested in black and white, or vice versa, leading to the conclusion that the surface property color is part of the memory representation. Experiments 4 and 5 exclude the possibility that the superior recognition memory for colored images results solely from attentional factors or saliency. Finally, the recognition memory advantage disappears for falsely colored images of natural scenes: The improvement in recognition memory depends on the color congruence of presented images with learned knowledge about the color gamut found within natural scenes. The results can be accounted for within a multiple memory systems framework.

Empirical Inference Poster Detection and discrimination in pink noise Wichmann, F., Henning, G. 5:100, 5. Tübinger Wahrnehmungskonferenz (TWK 2002), February 2002
Much of our information about early spatial vision comes from detection experiments involving low-contrast stimuli, which are not, perhaps, particularly "natural" stimuli. Contrast discrimination experiments provide one way to explore the visual system's response to stimuli of higher contrast whilst keeping the number of unknown parameters comparatively small. We explored both detection and contrast discrimination performance with sinusoidal and "pulse-train" (or line) gratings. Both types of grating had a fundamental spatial frequency of 2.09 c/deg but the pulse-train, ideally, contains, in addition to its fundamental component, all the harmonics of the fundamental. Although the 2.09-c/deg pulse-train produced on our display was measured using a high-performance digital camera (Photometrics) and shown to contain at least 8 harmonics at equal contrast, it was no more detectable than its most detectable component; no benefit from having additional information at the harmonics was measurable. The addition of broadband 1-D "pink" noise made it about a factor of four more detectable than any of its components. However, in contrast-discrimination experiments, with an in-phase pedestal or masking grating of the same form and phase as the signal and 15% contrast, the noise did not improve the discrimination performance of the pulse train relative to that of its sinusoidal components. We discuss the implications of these observations for models of early vision, in particular the implications for possible sources of internal noise.

Empirical Inference Article Training invariant support vector machines DeCoste, D., Schölkopf, B. Machine Learning, 46(1-3):161-190, January 2002
Practical experience has shown that in order to obtain the best possible performance, prior knowledge about invariances of a classification problem at hand ought to be incorporated into the training procedure. We describe and review all known methods for doing so in support vector machines, provide experimental results, and discuss their respective merits. One of the significant new results reported in this work is our recent achievement of the lowest reported test error on the well-known MNIST digit recognition benchmark task, with SVM training times that are also significantly faster than previous SVM methods.

Empirical Inference Technical Report A compression approach to support vector model selection von Luxburg, U., Bousquet, O., Schölkopf, B. (101), Max Planck Institute for Biological Cybernetics, 2002 (see the more detailed JMLR version)
In this paper we investigate connections between statistical learning theory and data compression on the basis of support vector machine (SVM) model selection. Inspired by several generalization bounds we construct "compression coefficients" for SVMs, which measure the amount by which the training labels can be compressed by some classification hypothesis. The main idea is to relate the coding precision of this hypothesis to the width of the margin of the SVM. The compression coefficients connect well-known quantities such as the radius-margin ratio R²/ρ², the eigenvalues of the kernel matrix, and the number of support vectors. To test whether they are useful in practice we ran model selection experiments on several real-world datasets. As a result we found that compression coefficients can fairly accurately predict the parameters for which the test error is minimized.
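
The quantities the compression coefficients build on can be computed directly from a trained SVM. A minimal sketch (toy data, linear kernel, scikit-learn assumed; the crude data-radius estimate and all parameter choices are illustrative, not the paper's construction):

```python
import numpy as np
from sklearn.svm import SVC

# Toy binary problem (all quantities below are illustrative choices)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 0.5, (50, 2)), rng.normal(1, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

clf = SVC(kernel="linear", C=1.0).fit(X, y)

w = clf.coef_[0]
rho = 1.0 / np.linalg.norm(w)                            # geometric margin
R = np.max(np.linalg.norm(X - X.mean(axis=0), axis=1))   # crude data radius
radius_margin_ratio = (R / rho) ** 2                     # R^2 / rho^2
n_sv = clf.n_support_.sum()                              # number of support vectors
```

Model selection would then compare such quantities (combined into a compression coefficient) across candidate hyperparameter settings.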

Empirical Inference Conference Paper A kernel approach for learning from almost orthogonal patterns Schölkopf, B., Weston, J., Eskin, E., Leslie, C., Noble, W. In Principles of Data Mining and Knowledge Discovery, Lecture Notes in Computer Science 2430/2431:511-528, (Editors: T Elomaa and H Mannila and H Toivonen), Springer, Berlin, Germany, 13th European Conference on Machine Learning (ECML 2002) and 6th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD 2002), Helsinki, 2002

Empirical Inference Poster Application of Monte Carlo Methods to Psychometric Function Fitting Wichmann, F. Proceedings of the 33rd European Conference on Mathematical Psychology, 44, 2002
The psychometric function relates an observer's performance to an independent variable, usually some physical quantity of a stimulus in a psychophysical task. Here I describe methods for (1) fitting psychometric functions, (2) assessing goodness-of-fit, and (3) providing confidence intervals for the function's parameters and other estimates derived from them. First I describe a constrained maximum-likelihood method for parameter estimation. Using Monte Carlo simulations I demonstrate that it is important to have a fitting method that takes stimulus-independent errors (or "lapses") into account. Second, a number of goodness-of-fit tests are introduced. Because psychophysical data sets are usually rather small I advocate the use of Monte Carlo resampling techniques that do not rely on asymptotic theory for goodness-of-fit assessment. Third, a parametric bootstrap is employed to estimate the variability of fitted parameters and derived quantities such as thresholds and slopes. I describe how the bootstrap bridging assumption, on which the validity of the procedure depends, can be tested without incurring too high a cost in computation time. Finally I describe how the methods can be extended to test hypotheses concerning the form and shape of several psychometric functions. Software implementing the methods is available (http://www.bootstrap-software.com/psignifit/), as well as articles describing the methods in detail (Wichmann & Hill, Perception & Psychophysics, 2001a, b).

Empirical Inference Article Contrast discrimination with pulse-trains in pink noise Henning, G., Bird, C., Wichmann, F. Journal of the Optical Society of America A, 19(7):1259-1266, 2002
Detection performance was measured with sinusoidal and pulse-train gratings. Although the 2.09-c/deg pulse-train (or line) gratings contained at least 8 harmonics all at equal contrast, they were no more detectable than their most detectable component. The addition of broadband pink noise designed to equalize the detectability of the components of the pulse train made the pulse train about a factor of four more detectable than any of its components. However, in contrast-discrimination experiments, with a pedestal or masking grating of the same form and phase as the signal and 15% contrast, the noise did not affect the discrimination performance of the pulse train relative to that obtained with its sinusoidal components. We discuss the implications of these observations for models of early vision, in particular the implications for possible sources of internal noise.

Empirical Inference Article Contrast discrimination with sinusoidal gratings of different spatial frequency Bird, C., Henning, G., Wichmann, F. Journal of the Optical Society of America A, 19(7):1267-1273, 2002
The detectability of contrast increments was measured as a function of the contrast of a masking or “pedestal” grating at a number of different spatial frequencies ranging from 2 to 16 cycles per degree of visual angle. The pedestal grating always had the same orientation, spatial frequency and phase as the signal. The shape of the contrast increment threshold versus pedestal contrast (TvC) functions depends on the performance level used to define the “threshold,” but when both axes are normalized by the contrast corresponding to 75% correct detection at each frequency, the TvC functions at a given performance level are identical. Confidence intervals on the slope of the rising part of the TvC functions are so wide that it is not possible with our data to reject Weber’s Law.

Empirical Inference Conference Paper Luminance Artifacts on CRT Displays Wichmann, F. In IEEE Visualization 2002, 571-574, (Editors: Moorhead, R.; Gross, M.; Joy, K. I.), 2002
Most visualization panels today are still built around cathode-ray tubes (CRTs), certainly on personal desktops at work and at home. Whilst capable of producing pleasing images for common applications ranging from email writing to TV and DVD presentation, it is worth noting that there are a number of nonlinear transformations between input (voltage) and output (luminance) which distort the digital and/or analogue images sent to a CRT. Some of them are input-independent and hence easy to fix, e.g. gamma correction, but others, such as pixel interactions, depend on the content of the input stimulus and are thus harder to compensate for. CRT-induced image distortions cause problems not only in basic vision research but also for applications where image fidelity is critical, most notably in medicine (digitization of X-ray images for diagnostic purposes) and in forms of online commerce, such as the online sale of images, where the image must be reproduced on some output device which will not have the same transfer function as the customer's CRT. I will present measurements from a number of CRTs and illustrate how some of their shortcomings may be problematic for the aforementioned applications.
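
The input-independent distortion mentioned above, gamma, is the easy case: if the display follows a pure power law (real CRTs only approximately do), it can be inverted exactly. A minimal sketch, with the exponent value assumed rather than measured:

```python
import numpy as np

GAMMA = 2.2  # typical CRT exponent (assumed; a real display must be measured)

def crt_luminance(v, gamma=GAMMA):
    """Forward display model: normalized voltage -> normalized luminance."""
    return np.clip(v, 0.0, 1.0) ** gamma

def gamma_correct(lum, gamma=GAMMA):
    """Inverse model: desired luminance -> voltage to request."""
    return np.clip(lum, 0.0, 1.0) ** (1.0 / gamma)

desired = np.linspace(0.0, 1.0, 5)
achieved = crt_luminance(gamma_correct(desired))  # recovers the desired values
```

Content-dependent distortions such as pixel interactions have no such closed-form inverse, which is why they are the harder problem.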

Empirical Inference Poster Optimal linear estimation of self-motion - a real-world test of a model of fly tangential neurons Franz, M. SAB 02 Workshop, Robotics as theoretical biology, 7th meeting of the International Society for Simulation of Adaptive Behaviour (SAB), (Editors: Prescott, T.; Webb, B.), 2002
The tangential neurons in the fly brain are sensitive to the typical optic flow patterns generated during self-motion (see example in Fig.1). We examine whether a simplified linear model of these neurons can be used to estimate self-motion from the optic flow. We present a theory for the construction of an optimal linear estimator incorporating prior knowledge both about the distance distribution of the environment, and about the noise and self-motion statistics of the sensor. The optimal estimator is tested on a gantry carrying an omnidirectional vision sensor that can be moved along three translational and one rotational degree of freedom. The experiments indicate that the proposed approach yields accurate results for rotation estimates, independently of the current translation and scene layout. Translation estimates, however, turned out to be sensitive to simultaneous rotation and to the particular distance distribution of the scene. The gantry experiments confirm that the receptive field organization of the tangential neurons allows them, as an ensemble, to extract self-motion from the optic flow.
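
The estimator described above is linear in the flow, so its structure can be sketched compactly. The following toy version assumes a random linear flow model and isotropic priors; all matrices and noise levels are invented for illustration, whereas the real model encodes the tangential neurons' receptive fields and the measured gantry statistics:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear flow model: optic flow y = F @ w + noise, where w
# stacks the self-motion components (e.g. 3 translational + 1 rotational).
n_flow, n_motion = 200, 4
F = rng.standard_normal((n_flow, n_motion))

C_w = np.eye(n_motion)        # prior self-motion covariance (assumption)
C_n = 0.1 * np.eye(n_flow)    # sensor noise covariance (assumption)

# Linear MMSE estimator W minimizing E||w - W y||^2 under these priors:
W = C_w @ F.T @ np.linalg.inv(F @ C_w @ F.T + C_n)

w_true = rng.standard_normal(n_motion)
y = F @ w_true + rng.multivariate_normal(np.zeros(n_flow), C_n)
w_hat = W @ y                 # estimated self-motion
```

In the paper's setting, rows of W play the role of the matched receptive fields, and the distance distribution of the environment enters through the statistics of F.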

Empirical Inference Article Regularized principal manifolds Smola, A., Mika, S., Schölkopf, B., Williamson, R. Journal of Machine Learning Research, 1:179-209, June 2001
Many settings of unsupervised learning can be viewed as quantization problems - the minimization of the expected quantization error subject to some restrictions. This allows the use of tools such as regularization from the theory of (supervised) risk minimization for unsupervised learning. This setting turns out to be closely related to principal curves, the generative topographic map, and robust coding. We explore this connection in two ways: (1) we propose an algorithm for finding principal manifolds that can be regularized in a variety of ways; and (2) we derive uniform convergence bounds and hence bounds on the learning rates of the algorithm. In particular, we give bounds on the covering numbers which allow us to obtain nearly optimal learning rates for certain types of regularization operators. Experimental results demonstrate the feasibility of the approach.

Empirical Inference Thesis Variationsverfahren zur Untersuchung von Grundzustandseigenschaften des Ein-Band Hubbard-Modells Eichhorn, J. Biologische Kybernetik, Technische Universität Dresden, Dresden/Germany, May 2001
Using different modifications of a new variational approach, static ground-state properties of the one-band Hubbard model, such as the energy and the staggered magnetisation, are calculated. By taking additional fluctuations into account, the method is gradually improved, so that a very good description of the energy in one and two dimensions is achieved. After a detailed discussion of the application in one dimension, extensions to two dimensions are introduced. Using a modified version of the variational ansatz, it should in particular be possible to describe the quantum phase transition of the magnetisation.

Empirical Inference Article Markovian domain fingerprinting: statistical segmentation of protein sequences Bejerano, G., Seldin, Y., Margalit, H., Tishby, N. Bioinformatics, 17(10):927-934, 2001

Empirical Inference Article The psychometric function: I. Fitting, sampling and goodness-of-fit Wichmann, F., Hill, N. Perception and Psychophysics, 63 (8):1293-1313, 2001
The psychometric function relates an observer's performance to an independent variable, usually some physical quantity of a stimulus in a psychophysical task. This paper, together with its companion paper (Wichmann & Hill, 2001), describes an integrated approach to (1) fitting psychometric functions, (2) assessing the goodness of fit, and (3) providing confidence intervals for the function's parameters and other estimates derived from them, for the purposes of hypothesis testing. The present paper deals with the first two topics, describing a constrained maximum-likelihood method of parameter estimation and developing several goodness-of-fit tests. Using Monte Carlo simulations, we deal with two specific difficulties that arise when fitting functions to psychophysical data. First, we note that human observers are prone to stimulus-independent errors (or "lapses"). We show that failure to account for this can lead to serious biases in estimates of the psychometric function's parameters and illustrate how the problem may be overcome. Second, we note that psychophysical data sets are usually rather small by the standards required by most of the commonly applied statistical tests. We demonstrate the potential errors of applying traditional χ² methods to psychophysical data and advocate use of Monte Carlo resampling techniques that do not rely on asymptotic theory. We have made available the software to implement our methods.
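
The constrained maximum-likelihood fit with a bounded lapse rate can be sketched in a few lines. This assumes a 2AFC Weibull parameterization and SciPy's optimizer; the 0.06 bound on the lapse rate and all simulated data are illustrative choices, not the paper's:

```python
import numpy as np
from scipy.optimize import minimize

def weibull(x, alpha, beta, lam, guess=0.5):
    """2AFC Weibull psychometric function with lapse rate lam."""
    return guess + (1.0 - guess - lam) * (1.0 - np.exp(-(x / alpha) ** beta))

def neg_log_lik(params, x, n_correct, n_trials):
    alpha, beta, lam = params
    p = np.clip(weibull(x, alpha, beta, lam), 1e-9, 1 - 1e-9)
    return -np.sum(n_correct * np.log(p) + (n_trials - n_correct) * np.log(1 - p))

# Simulated 2AFC data (invented for illustration)
x = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
n_trials = np.full(5, 50)
rng = np.random.default_rng(1)
n_correct = rng.binomial(n_trials, weibull(x, 2.0, 3.0, 0.02))

# Constrained ML: the lapse rate is bounded to a small interval, so that
# stimulus-independent errors do not bias the other parameter estimates.
res = minimize(neg_log_lik, x0=[1.5, 2.0, 0.01],
               args=(x, n_correct, n_trials),
               bounds=[(0.1, 20.0), (0.5, 10.0), (0.0, 0.06)])
alpha_hat, beta_hat, lam_hat = res.x
```

The paper's bootstrap then refits this model to many simulated data sets to obtain confidence intervals on thresholds and slopes.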

Empirical Inference Conference Paper Unsupervised Segmentation and Classification of Mixtures of Markovian Sources Seldin, Y., Bejerano, G., Tishby, N. In The 33rd Symposium on the Interface of Computing Science and Statistics (Interface 2001 - Frontiers in Data Mining and Bioinformatics), 1-15, 33rd Symposium on the Interface of Computing Science and Statistics (Interface 2001 - Frontiers in Data Mining and Bioinformatics), 2001
We describe a novel algorithm for unsupervised segmentation of sequences into alternating Variable Memory Markov sources, first presented in [SBT01]. The algorithm is based on competitive learning between Markov models, when implemented as Prediction Suffix Trees [RST96] using the MDL principle. By applying a model clustering procedure, based on rate distortion theory combined with deterministic annealing, we obtain a hierarchical segmentation of sequences between alternating Markov sources. The method is applied successfully to unsupervised segmentation of multilingual texts into languages where it is able to infer correctly both the number of languages and the language switching points. When applied to protein sequence families (results of the [BSMT01] work), we demonstrate the method's ability to identify biologically meaningful sub-sequences within the proteins, which correspond to signatures of important functional sub-units called domains. Our approach to protein classification (through the obtained signatures) is shown to have both conceptual and practical advantages over the currently used methods.

Empirical Inference Conference Paper Unsupervised Sequence Segmentation by a Mixture of Switching Variable Memory Markov Sources Seldin, Y., Bejerano, G., Tishby, N. In Proceedings of the 18th International Conference on Machine Learning (ICML 2001), 513-520, 2001
We present a novel information theoretic algorithm for unsupervised segmentation of sequences into alternating Variable Memory Markov sources. The algorithm is based on competitive learning between Markov models, when implemented as Prediction Suffix Trees (Ron et al., 1996) using the MDL principle. By applying a model clustering procedure, based on rate distortion theory combined with deterministic annealing, we obtain a hierarchical segmentation of sequences between alternating Markov sources. The algorithm appears to be self-regulated and automatically avoids over-segmentation. The method is applied successfully to unsupervised segmentation of multilingual texts into languages where it is able to infer correctly both the number of languages and the language switching points. When applied to protein sequence families, we demonstrate the method's ability to identify biologically meaningful sub-sequences within the proteins, which correspond to important functional sub-units called domains.
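
A toy version of the competitive idea: train one simple Markov model per source, then assign each window of the sequence to the model that predicts it best. This replaces the Prediction Suffix Trees and rate-distortion clustering with fixed first-order models and hard windowed assignment, so it is only a sketch of the principle; all text and parameters are invented:

```python
import numpy as np
from collections import defaultdict

def train_bigram(text):
    """First-order (bigram) character model with add-one smoothing."""
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    alphabet = sorted(set(text))
    def logprob(a, b):
        total = sum(counts[a].values()) + len(alphabet)
        return np.log((counts[a][b] + 1) / total)
    return logprob

# Two "sources": repeated English and German sentences (toy training data)
en = "the quick brown fox jumps over the lazy dog " * 40
de = "der schnelle braune fuchs springt ueber den faulen hund " * 40
models = [train_bigram(en), train_bigram(de)]

# A sequence that switches source halfway through
seq = "the quick brown fox " * 5 + "der schnelle braune fuchs " * 5

# Competitive assignment: each window goes to the best-predicting model
win = 20
labels = []
for i in range(len(seq) - win):
    scores = [sum(m(a, b) for a, b in zip(seq[i:i+win], seq[i+1:i+win]))
              for m in models]
    labels.append(int(np.argmax(scores)))
```

The full algorithm additionally learns the models themselves during the competition and infers the number of sources, which this sketch fixes in advance.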

Empirical Inference Book Advances in Large Margin Classifiers Smola, A., Bartlett, P., Schölkopf, B., Schuurmans, D. 422, Neural Information Processing, MIT Press, Cambridge, MA, USA, October 2000
The concept of large margins is a unifying principle for the analysis of many different approaches to the classification of data from examples, including boosting, mathematical programming, neural networks, and support vector machines. The fact that it is the margin, or confidence level, of a classification--that is, a scale parameter--rather than a raw training error that matters has become a key tool for dealing with classifiers. This book shows how this idea applies to both the theoretical analysis and the design of algorithms. The book provides an overview of recent developments in large margin classifiers, examines connections with other methods (e.g., Bayesian inference), and identifies strengths and weaknesses of the method, as well as directions for future research. Among the contributors are Manfred Opper, Vladimir Vapnik, and Grace Wahba.

Empirical Inference Book Chapter An Introduction to Kernel-Based Learning Algorithms Müller, K., Mika, S., Rätsch, G., Tsuda, K., Schölkopf, B. In Handbook of Neural Network Signal Processing, 4, (Editors: Yu Hen Hu and Jang-Neng Hwang), CRC Press, 2000

Empirical Inference Conference Paper Choosing nu in support vector regression with different noise models - theory and experiments Chalimourda, A., Schölkopf, B., Smola, A. In Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN 2000): Neural Computing: New Challenges and Perspectives for the New Millennium, IEEE, 2000

Empirical Inference Conference Paper Engineering Support Vector Machine Kernels That Recognize Translation Initiation Sites in DNA Zien, A., Rätsch, G., Mika, S., Schölkopf, B., Lemmen, C., Smola, A., Lengauer, T., Müller, K. In German Conference on Bioinformatics (GCB 1999), October 1999
In order to extract protein sequences from nucleotide sequences, it is an important step to recognize points from which regions encoding proteins start, the so-called translation initiation sites (TIS). This can be modeled as a classification problem. We demonstrate the power of support vector machines (SVMs) for this task, and show how to successfully incorporate biological prior knowledge by engineering an appropriate kernel function.
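
As an illustration of string kernels for sequence classification (the paper engineers a more specialized, locality-aware kernel; the simpler spectrum kernel below is a stand-in, with all sequences invented):

```python
import numpy as np
from itertools import product

def kmer_counts(seq, k=3):
    """Embed a DNA string as a vector of k-mer occurrence counts."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {km: i for i, km in enumerate(kmers)}
    v = np.zeros(len(kmers))
    for i in range(len(seq) - k + 1):
        v[index[seq[i:i + k]]] += 1
    return v

def spectrum_kernel(a, b, k=3):
    """Spectrum kernel: inner product of k-mer count vectors."""
    return float(kmer_counts(a, k) @ kmer_counts(b, k))
```

A Gram matrix of such kernel values can be passed to an SVM (e.g. with a precomputed-kernel option) to classify candidate TIS windows.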

Empirical Inference Article SVMs — a practical consequence of learning theory Schölkopf, B. IEEE Intelligent Systems and their Applications, 13(4):18-21, July 1998
My first exposure to Support Vector Machines came this spring when I heard Sue Dumais present impressive results on text categorization using this analysis technique. This issue's collection of essays should help familiarize our readers with this interesting new racehorse in the Machine Learning stable. Bernhard Schölkopf, in an introductory overview, points out that a particular advantage of SVMs over other learning algorithms is that they can be analyzed theoretically using concepts from computational learning theory, and at the same time can achieve good performance when applied to real problems. Examples of these real-world applications are provided by Sue Dumais, who describes the aforementioned text-categorization problem, yielding the best results to date on the Reuters collection, and Edgar Osuna, who presents strong results on application to face detection. Our fourth author, John Platt, gives us a practical guide and a new technique for implementing the algorithm efficiently.

Empirical Inference Conference Paper From regularization operators to support vector kernels Smola, A., Schölkopf, B. In Advances in Neural Information Processing Systems 10, 343-349, (Editors: M Jordan and M Kearns and S Solla), MIT Press, Cambridge, MA, USA, 11th Annual Conference on Neural Information Processing Systems (NIPS 1997), June 1998

Empirical Inference Conference Paper Prior knowledge in support vector kernels Schölkopf, B., Simard, P., Smola, A., Vapnik, V. In Advances in Neural Information Processing Systems 10, 640-646, (Editors: M Jordan and M Kearns and S Solla), MIT Press, Cambridge, MA, USA, 11th Annual Conference on Neural Information Processing Systems (NIPS 1997), June 1998

Empirical Inference Article Learning view graphs for robot navigation Franz, M., Schölkopf, B., Mallot, H., Bülthoff, H. A. Autonomous Robots, 5(1):111-125, March 1998
We present a purely vision-based scheme for learning a topological representation of an open environment. The system represents selected places by local views of the surrounding scene, and finds traversable paths between them. The set of recorded views and their connections are combined into a graph model of the environment. To navigate between views connected in the graph, we employ a homing strategy inspired by findings of insect ethology. In robot experiments, we demonstrate that complex visual exploration and navigation tasks can thus be performed without using metric information.

Empirical Inference Conference Paper Incorporating invariances in support vector learning machines Schölkopf, B., Burges, C., Vapnik, V. In Artificial Neural Networks: ICANN 96, Lecture Notes in Computer Science, vol. 1112, 47-52, (Editors: C von der Malsburg and W von Seelen and JC Vorbrüggen and B Sendhoff), Springer, Berlin, Germany, 6th International Conference on Artificial Neural Networks, July 1996
Developed only recently, support vector learning machines achieve high generalization ability by minimizing a bound on the expected test error; however, so far there existed no way of adding knowledge about invariances of a classification problem at hand. We present a method of incorporating prior knowledge about transformation invariances by applying transformations to support vectors, the training examples most critical for determining the classification boundary.
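
The virtual support vector idea, transform only the support vectors and retrain, can be sketched as follows. The data, the chosen invariance (horizontal shifts), and the use of scikit-learn are all illustrative assumptions:

```python
import numpy as np
from sklearn.svm import SVC

# Toy 2-D data whose class label is invariant to horizontal shifts:
# the label depends only on the second coordinate.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2))
y = (X[:, 1] > 0).astype(int)

clf = SVC(kernel="linear").fit(X, y)

# Virtual SV method: apply the invariance transformations only to the
# support vectors (the examples most critical for the boundary) ...
sv = clf.support_vectors_
sv_y = y[clf.support_]
shifts = [-0.5, 0.5]  # assumed invariance transformations
X_virtual = np.vstack([sv + np.array([s, 0.0]) for s in shifts])
y_virtual = np.concatenate([sv_y] * len(shifts))

# ... then retrain on the support vectors plus their transformed copies.
clf2 = SVC(kernel="linear").fit(np.vstack([sv, X_virtual]),
                                np.concatenate([sv_y, y_virtual]))
```

Because only support vectors are transformed, the enlarged training set stays small while the retrained machine still encodes the invariance.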