Header logo is


2007


no image
3D Reconstruction of Neural Circuits from Serial EM Images

Maack, N., Kapfer, C., Macke, J., Schölkopf, B., Denk, W., Borst, A.

31st G{\"o}ttingen Neurobiology Conference, 31, pages: 1195, March 2007 (poster)

ei

PDF [BibTex]

2007


PDF [BibTex]


no image
Transductive Classification via Local Learning Regularization

Wu, M., Schölkopf, B.

In JMLR Workshop and Conference Proceedings Volume 2: AISTATS 2007, pages: 628-635, (Editors: M Meila and X Shen), 11th International Conference on Artificial Intelligence and Statistics, March 2007 (inproceedings)

Abstract
The idea of local learning, classifying a particular point based on its neighbors, has been successfully applied to supervised learning problems. In this paper, we adapt it for Transductive Classification (TC) problems. Specifically, we formulate a Local Learning Regularizer (LL-Reg) which leads to a solution with the property that the label of each data point can be well predicted based on its neighbors and their labels. For model selection, an efficient way to compute the leave-one-out classification error is provided for the proposed and related algorithms. Experimental results using several benchmark datasets illustrate the effectiveness of the proposed approach.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Identifying temporal population codes in the retina using canonical correlation analysis

Bethge, M., Macke, J., Gerwinn, S., Zeck, G.

31st G{\"o}ttingen Neurobiology Conference, 31, pages: 359, March 2007 (poster)

ei

PDF PDF [BibTex]

PDF PDF [BibTex]


no image
Bayesian Neural System identification: error bars, receptive fields and neural couplings

Gerwinn, S., Seeger, M., Zeck, G., Bethge, M.

31st G{\"o}ttingen Neurobiology Conference, 31, pages: 360, March 2007 (poster)

ei

PDF PDF [BibTex]

PDF PDF [BibTex]


no image
Applications of Kernel Machines to Structured Data

Eichhorn, J.

Biologische Kybernetik, Technische Universität Berlin, Berlin, Germany, March 2007, passed with "sehr gut", published online (phdthesis)

ei

PDF [BibTex]

PDF [BibTex]


no image
A priori Knowledge from Non-Examples

Sinz, FH.

Biologische Kybernetik, Eberhard-Karls-Universität Tübingen, Tübingen, Germany, March 2007 (diplomathesis)

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Improving the Caenorhabditis elegans Genome Annotation Using Machine Learning

Rätsch, G., Sonnenburg, S., Srinivasan, J., Witte, H., Müller, K., Sommer, R., Schölkopf, B.

PLoS Computational Biology, 3(2, e20):0313-0322, February 2007 (article)

ei

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
The Independent Components of Natural Images are Perceptually Dependent

Bethge, M., Wiecki, T., Wichmann, F.

In Human Vision and Electronic Imaging XII, pages: 1-12, (Editors: Rogowitz, B. E.), SPIE, Bellingham, WA, USA, SPIE Human Vision and Electronic Imaging Conference, February 2007 (inproceedings)

Abstract
The independent components of natural images are a set of linear filters which are optimized for statistical independence. With such a set of filters images can be represented without loss of information. Intriguingly, the filter shapes are localized, oriented, and bandpass, resembling important properties of V1 simple cell receptive fields. Here we address the question of whether the independent components of natural images are also perceptually less dependent than other image components. We compared the pixel basis, the ICA basis and the discrete cosine basis by asking subjects to interactively predict missing pixels (for the pixel basis) or to predict the coefficients of ICA and DCT basis functions in patches of natural images. Like Kersten (1987) we find the pixel basis to be perceptually highly redundant but perhaps surprisingly, the ICA basis showed significantly higher perceptual dependencies than the DCT basis. This shows a dissociation between statistical and perceptual dependence measures.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
About the Triangle Inequality in Perceptual Spaces

Jäkel, F., Schölkopf, B., Wichmann, F.

Proceedings of the Computational and Systems Neuroscience Meeting 2007 (COSYNE), 4, pages: 308, February 2007 (poster)

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Automatic 3D Face Reconstruction from Single Images or Video

Breuer, P., Kim, K., Kienzle, W., Blanz, V., Schölkopf, B.

(160), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, February 2007 (techreport)

Abstract
This paper presents a fully automated algorithm for reconstructing a textured 3D model of a face from a single photograph or a raw video stream. The algorithm is based on a combination of Support Vector Machines (SVMs) and a Morphable Model of 3D faces. After SVM face detection, individual facial features are detected using a novel regression-and classification-based approach, and probabilistically plausible configurations of features are selected to produce a list of candidates for several facial feature positions. In the next step, the configurations of feature points are evaluated using a novel criterion that is based on a Morphable Model and a combination of linear projections. Finally, the feature points initialize a model-fitting procedure of the Morphable Model. The result is a high-resolution 3D surface model.

ei

PDF [BibTex]

PDF [BibTex]


no image
Center-surround filters emerge from optimizing predictivity in a free-viewing task

Kienzle, W., Wichmann, F., Schölkopf, B., Franz, M.

Proceedings of the Computational and Systems Neuroscience Meeting 2007 (COSYNE), 4, pages: 207, February 2007 (poster)

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Nonlinear Receptive Field Analysis: Making Kernel Methods Interpretable

Kienzle, W., Macke, J., Wichmann, F., Schölkopf, B., Franz, M.

Computational and Systems Neuroscience Meeting 2007 (COSYNE 2007), 4, pages: 16, February 2007 (poster)

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Statistical Consistency of Kernel Canonical Correlation Analysis

Fukumizu, K., Bach, F., Gretton, A.

Journal of Machine Learning Research, 8, pages: 361-383, February 2007 (article)

Abstract
While kernel canonical correlation analysis (CCA) has been applied in many contexts, the convergence of finite sample estimates of the associated functions to their population counterparts has not yet been established. This paper gives a mathematical proof of the statistical convergence of kernel CCA, providing a theoretical justification for the method. The proof uses covariance operators defined on reproducing kernel Hilbert spaces, and analyzes the convergence of their empirical estimates of finite rank to their population counterparts, which can have infinite rank. The result also gives a sufficient condition for convergence on the regularization coefficient involved in kernel CCA: this should decrease as n^{-1/3}, where n is the number of data.

ei

PDF [BibTex]

PDF [BibTex]


no image
Unsupervised learning of a steerable basis for invariant image representations

Bethge, M., Gerwinn, S., Macke, J.

In Human Vision and Electronic Imaging XII, pages: 1-12, (Editors: Rogowitz, B. E.), SPIE, Bellingham, WA, USA, SPIE Human Vision and Electronic Imaging Conference, February 2007 (inproceedings)

Abstract
There are two aspects to unsupervised learning of invariant representations of images: First, we can reduce the dimensionality of the representation by finding an optimal trade-off between temporal stability and informativeness. We show that the answer to this optimization problem is generally not unique so that there is still considerable freedom in choosing a suitable basis. Which of the many optimal representations should be selected? Here, we focus on this second aspect, and seek to find representations that are invariant under geometrical transformations occuring in sequences of natural images. We utilize ideas of steerability and Lie groups, which have been developed in the context of filter design. In particular, we show how an anti-symmetric version of canonical correlation analysis can be used to learn a full-rank image basis which is steerable with respect to rotations. We provide a geometric interpretation of this algorithm by showing that it finds the two-dimensional eigensubspaces of the avera ge bivector. For data which exhibits a variety of transformations, we develop a bivector clustering algorithm, which we use to learn a basis of generalized quadrature pairs (i.e. complex cells) from sequences of natural images.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Estimating Population Receptive Fields in Space and Time

Macke, J., Zeck, G., Bethge, M.

Computational and Systems Neuroscience Meeting 2007 (COSYNE 2007), 4, pages: 44, February 2007 (poster)

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Machine Learning for Mass Production and Industrial Engineering

Pfingsten, T.

Biologische Kybernetik, Eberhard-Karls-Universität Tübingen, Tübingen, Germany, February 2007 (phdthesis)

ei

PDF [BibTex]

PDF [BibTex]


no image
New Margin- and Evidence-Based Approaches for EEG Signal Classification

Hill, N., Farquhar, J.

Invited talk at the FaSor Jahressymposium, February 2007 (talk)

ei

PDF [BibTex]

PDF [BibTex]


no image
On the Pre-Image Problem in Kernel Methods

BakIr, G., Schölkopf, B., Weston, J.

In Kernel Methods in Bioengineering, Signal and Image Processing, pages: 284-302, (Editors: G Camps-Valls and JL Rojo-Álvarez and M Martínez-Ramón), Idea Group Publishing, Hershey, PA, USA, January 2007 (inbook)

Abstract
In this chapter we are concerned with the problem of reconstructing patterns from their representation in feature space, known as the pre-image problem. We review existing algorithms and propose a learning based approach. All algorithms are discussed regarding their usability and complexity and evaluated on an image denoising application.

ei

DOI [BibTex]

DOI [BibTex]


no image
A Subspace Kernel for Nonlinear Feature Extraction

Wu, M., Farquhar, J.

In IJCAI-07, pages: 1125-1130, (Editors: Veloso, M. M.), AAAI Press, Menlo Park, CA, USA, International Joint Conference on Artificial Intelligence, January 2007 (inproceedings)

Abstract
Kernel based nonlinear Feature Extraction (KFE) or dimensionality reduction is a widely used pre-processing step in pattern classification and data mining tasks. Given a positive definite kernel function, it is well known that the input data are implicitly mapped to a feature space with usually very high dimensionality. The goal of KFE is to find a low dimensional subspace of this feature space, which retains most of the information needed for classification or data analysis. In this paper, we propose a subspace kernel based on which the feature extraction problem is transformed to a kernel parameter learning problem. The key observation is that when projecting data into a low dimensional subspace of the feature space, the parameters that are used for describing this subspace can be regarded as the parameters of the kernel function between the projected data. Therefore current kernel parameter learning methods can be adapted to optimize this parameterized kernel function. Experimental results are provided to validate the effectiveness of the proposed approach.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Some observations on the pedestal effect

Henning, G., Wichmann, F.

Journal of Vision, 7(1:3):1-15, January 2007 (article)

Abstract
The pedestal or dipper effect is the large improvement in the detectability of a sinusoidal grating observed when it is added to a masking or pedestal grating of the same spatial frequency, orientation, and phase. We measured the pedestal effect in both broadband and notched noiseVnoise from which a 1.5-octave band centered on the signal frequency had been removed. Although the pedestal effect persists in broadband noise, it almost disappears in the notched noise. Furthermore, the pedestal effect is substantial when either high- or low-pass masking noise is used. We conclude that the pedestal effect in the absence of notched noise results principally from the use of information derived from channels with peak sensitivities at spatial frequencies different from that of the signal and the pedestal. We speculate that the spatial-frequency components of the notched noise above and below the spatial frequency of the signal and the pedestal prevent ‘‘off-frequency looking,’’ that is, prevent the use of information about changes in contrast carried in channels tuned to spatial frequencies that are very much different from that of the signal and the pedestal. Thus, the pedestal or dipper effect measured without notched noise appears not to be a characteristic of individual spatial-frequency-tuned channels.

ei

PDF Web DOI [BibTex]

PDF Web DOI [BibTex]


no image
Development of a Brain-Computer Interface Approach Based on Covert Attention to Tactile Stimuli

Raths, C.

University of Tübingen, Germany, University of Tübingen, Germany, January 2007 (diplomathesis)

ei

[BibTex]

[BibTex]


no image
Cue Combination and the Effect of Horizontal Disparity and Perspective on Stereoacuity

Zalevski, AM., Henning, GB., Hill, NJ.

Spatial Vision, 20(1):107-138, January 2007 (article)

Abstract
Relative depth judgments of vertical lines based on horizontal disparity deteriorate enormously when the lines form part of closed configurations (Westheimer, 1979). In studies showing this effect, perspective was not manipulated and thus produced inconsistency between horizontal disparity and perspective. We show that stereoacuity improves dramatically when perspective and horizontal disparity are made consistent. Observers appear to use unhelpful perspective cues in judging the relative depth of the vertical sides of rectangles in a way not incompatible with a form of cue weighting. However, 95% confidence intervals for the weights derived for cues usually exceed the a-priori [0-1] range.

ei

PDF PDF DOI [BibTex]

PDF PDF DOI [BibTex]


no image
A Machine Learning Approach for Estimating the Attenuation Map for a Combined PET/MR Scanner

Hofmann, M.

Biologische Kybernetik, Max-Planck Institute for Biological Cybernetics, Tübingen, Germany, 2007 (diplomathesis)

ei

[BibTex]

[BibTex]


no image
Mathematik der Wahrnehmung: Wendepunkte

Wichman, F., Ernst, MO.

Akademische Mitteilungen zw{\"o}lf: F{\"u}nf Sinne, pages: 32-37, 2007 (misc)

ei

[BibTex]

[BibTex]


no image
Towards Machine Learning of Motor Skills

Peters, J., Schaal, S., Schölkopf, B.

In Proceedings of Autonome Mobile Systeme (AMS), pages: 138-144, (Editors: K Berns and T Luksch), 2007, clmc (inproceedings)

Abstract
Autonomous robots that can adapt to novel situations has been a long standing vision of robotics, artificial intelligence, and cognitive sciences. Early approaches to this goal during the heydays of artificial intelligence research in the late 1980s, however, made it clear that an approach purely based on reasoning or human insights would not be able to model all the perceptuomotor tasks that a robot should fulfill. Instead, new hope was put in the growing wake of machine learning that promised fully adaptive control algorithms which learn both by observation and trial-and-error. However, to date, learning techniques have yet to fulfill this promise as only few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics, and usually scaling was only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach to motor skill learning in order to get one step closer towards human-like performance. For doing so, we study two ma jor components for such an approach, i.e., firstly, a theoretically well-founded general approach to representing the required control structures for task representation and execution and, secondly, appropriate learning algorithms which can be applied in this setting.

am ei

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Reinforcement Learning for Optimal Control of Arm Movements

Theodorou, E., Peters, J., Schaal, S.

In Abstracts of the 37st Meeting of the Society of Neuroscience., Neuroscience, 2007, clmc (inproceedings)

Abstract
Every day motor behavior consists of a plethora of challenging motor skills from discrete movements such as reaching and throwing to rhythmic movements such as walking, drumming and running. How this plethora of motor skills can be learned remains an open question. In particular, is there any unifying computa-tional framework that could model the learning process of this variety of motor behaviors and at the same time be biologically plausible? In this work we aim to give an answer to these questions by providing a computational framework that unifies the learning mechanism of both rhythmic and discrete movements under optimization criteria, i.e., in a non-supervised trial-and-error fashion. Our suggested framework is based on Reinforcement Learning, which is mostly considered as too costly to be a plausible mechanism for learning com-plex limb movement. However, recent work on reinforcement learning with pol-icy gradients combined with parameterized movement primitives allows novel and more efficient algorithms. By using the representational power of such mo-tor primitives we show how rhythmic motor behaviors such as walking, squash-ing and drumming as well as discrete behaviors like reaching and grasping can be learned with biologically plausible algorithms. Using extensive simulations and by using different reward functions we provide results that support the hy-pothesis that Reinforcement Learning could be a viable candidate for motor learning of human motor behavior when other learning methods like supervised learning are not feasible.

am ei

[BibTex]

[BibTex]


no image
Reinforcement learning by reward-weighted regression for operational space control

Peters, J., Schaal, S.

In Proceedings of the 24th Annual International Conference on Machine Learning, pages: 745-750, ICML, 2007, clmc (inproceedings)

Abstract
Many robot control problems of practical importance, including operational space control, can be reformulated as immediate reward reinforcement learning problems. However, few of the known optimization or reinforcement learning algorithms can be used in online learning control for robots, as they are either prohibitively slow, do not scale to interesting domains of complex robots, or require trying out policies generated by random search, which are infeasible for a physical system. Using a generalization of the EM-base reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence. The resulting algorithm is efficient, learns smoothly without dangerous jumps in solution space, and works well in applications of complex high degree-of-freedom robots.

am ei

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Policy gradient methods for machine learning

Peters, J., Theodorou, E., Schaal, S.

In Proceedings of the 14th INFORMS Conference of the Applied Probability Society, pages: 97-98, Eindhoven, Netherlands, July 9-11, 2007, 2007, clmc (inproceedings)

Abstract
We present an in-depth survey of policy gradient methods as they are used in the machine learning community for optimizing parameterized, stochastic control policies in Markovian systems with respect to the expected reward. Despite having been developed separately in the reinforcement learning literature, policy gradient methods employ likelihood ratio gradient estimators as also suggested in the stochastic simulation optimization community. It is well-known that this approach to policy gradient estimation traditionally suffers from three drawbacks, i.e., large variance, a strong dependence on baseline functions and a inefficient gradient descent. In this talk, we will present a series of recent results which tackles each of these problems. The variance of the gradient estimation can be reduced significantly through recently introduced techniques such as optimal baselines, compatible function approximations and all-action gradients. However, as even the analytically obtainable policy gradients perform unnaturally slow, it required the step from ÔvanillaÕ policy gradient methods towards natural policy gradients in order to overcome the inefficiency of the gradient descent. This development resulted into the Natural Actor-Critic architecture which can be shown to be very efficient in application to motor primitive learning for robotics.

am ei

[BibTex]

[BibTex]


no image
Policy Learning for Motor Skills

Peters, J., Schaal, S.

In Proceedings of 14th International Conference on Neural Information Processing (ICONIP), pages: 233-242, (Editors: Ishikawa, M. , K. Doya, H. Miyamoto, T. Yamakawa), 2007, clmc (inproceedings)

Abstract
Policy learning which allows autonomous robots to adapt to novel situations has been a long standing vision of robotics, artificial intelligence, and cognitive sciences. However, to date, learning techniques have yet to fulfill this promise as only few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics, and usually scaling was only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach policy learning with the goal of an application to motor skill refinement in order to get one step closer towards human-like performance. For doing so, we study two major components for such an approach, i.e., firstly, we study policy learning algorithms which can be applied in the general setting of motor skill learning, and, secondly, we study a theoretically well-founded general approach to representing the required control structures for task representation and execution.

am ei

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Reinforcement learning for operational space control

Peters, J., Schaal, S.

In Proceedings of the 2007 IEEE International Conference on Robotics and Automation, pages: 2111-2116, IEEE Computer Society, ICRA, 2007, clmc (inproceedings)

Abstract
While operational space control is of essential importance for robotics and well-understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in face of modeling errors, which are inevitable in complex robots, e.g., humanoid robots. In such cases, learning control methods can offer an interesting alternative to analytical control algorithms. However, the resulting supervised learning problem is ill-defined as it requires to learn an inverse mapping of a usually redundant system, which is well known to suffer from the property of non-convexity of the solution space, i.e., the learning system could generate motor commands that try to steer the robot into physically impossible configurations. The important insight that many operational space control algorithms can be reformulated as optimal control problems, however, allows addressing this inverse learning problem in the framework of reinforcement learning. However, few of the known optimization or reinforcement learning algorithms can be used in online learning control for robots, as they are either prohibitively slow, do not scale to interesting domains of complex robots, or require trying out policies generated by random search, which are infeasible for a physical system. Using a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence. The resulting algorithm is efficient, learns smoothly without dangerous jumps in solution space, and works well in applications of complex high degree-of-freedom robots.

am ei

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Relative Entropy Policy Search

Peters, J.

CLMC Technical Report: TR-CLMC-2007-2, Computational Learning and Motor Control Lab, Los Angeles, CA, 2007, clmc (techreport)

Abstract
This technical report describes a cute idea of how to create new policy search approaches. It directly relates to the Natural Actor-Critic methods but allows the derivation of one shot solutions. Future work may include the application to interesting problems.

am ei

PDF link (url) [BibTex]

PDF link (url) [BibTex]


no image
Using reward-weighted regression for reinforcement learning of task space control

Peters, J., Schaal, S.

In Proceedings of the 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, pages: 262-267, Honolulu, Hawaii, April 1-5, 2007, 2007, clmc (inproceedings)

Abstract
In this paper, we evaluate different versions from the three main kinds of model-free policy gradient methods, i.e., finite difference gradients, `vanilla' policy gradients and natural policy gradients. Each of these methods is first presented in its simple form and subsequently refined and optimized. By carrying out numerous experiments on the cart pole regulator benchmark we aim to provide a useful baseline for future research on parameterized policy search algorithms. Portable C++ code is provided for both plant and algorithms; thus, the results in this paper can be reevaluated, reused and new algorithms can be inserted with ease.

am ei

link (url) DOI [BibTex]

link (url) DOI [BibTex]


no image
Evaluation of Policy Gradient Methods and Variants on the Cart-Pole Benchmark

Riedmiller, M., Peters, J., Schaal, S.

In Proceedings of the 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, pages: 254-261, ADPRL, 2007, clmc (inproceedings)

Abstract
In this paper, we evaluate different versions from the three main kinds of model-free policy gradient methods, i.e., finite difference gradients, `vanilla' policy gradients and natural policy gradients. Each of these methods is first presented in its simple form and subsequently refined and optimized. By carrying out numerous experiments on the cart pole regulator benchmark we aim to provide a useful baseline for future research on parameterized policy search algorithms. Portable C++ code is provided for both plant and algorithms; thus, the results in this paper can be reevaluated, reused and new algorithms can be inserted with ease.

am ei

PDF [BibTex]

PDF [BibTex]

2005


no image
Kernel Methods for Measuring Independence

Gretton, A., Herbrich, R., Smola, A., Bousquet, O., Schölkopf, B.

Journal of Machine Learning Research, 6, pages: 2075-2129, December 2005 (article)

Abstract
We introduce two new functionals, the constrained covariance and the kernel mutual information, to measure the degree of independence of random variables. These quantities are both based on the covariance between functions of the random variables in reproducing kernel Hilbert spaces (RKHSs). We prove that when the RKHSs are universal, both functionals are zero if and only if the random variables are pairwise independent. We also show that the kernel mutual information is an upper bound near independence on the Parzen window estimate of the mutual information. Analogous results apply for two correlation-based dependence functionals introduced earlier: we show the kernel canonical correlation and the kernel generalised variance to be independence measures for universal kernels, and prove the latter to be an upper bound on the mutual information near independence. The performance of the kernel dependence functionals in measuring independence is verified in the context of independent component analysis.

ei

PDF PostScript PDF [BibTex]

2005


PDF PostScript PDF [BibTex]


no image
Kernel ICA for Large Scale Problems

Jegelka, S., Gretton, A., Achlioptas, D.

In pages: -, NIPS Workshop on Large Scale Kernel Machines, December 2005 (inproceedings)

ei

Web [BibTex]

Web [BibTex]


no image
Some thoughts about Gaussian Processes

Chapelle, O.

NIPS Workshop on Open Problems in Gaussian Processes for Machine Learning, December 2005 (talk)

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
A Unifying View of Sparse Approximate Gaussian Process Regression

Quinonero Candela, J., Rasmussen, C.

Journal of Machine Learning Research, 6, pages: 1935-1959, December 2005 (article)

Abstract
We provide a new unifying view, including all existing proper probabilistic sparse approximations for Gaussian process regression. Our approach relies on expressing the effective prior which the methods are using. This allows new insights to be gained, and highlights the relationship between existing methods. It also allows for a clear theoretically justified ranking of the closeness of the known approximations to the corresponding full GPs. Finally we point directly to designs of new better sparse approximations, combining the best of the existing strategies, within attractive computational constraints.

ei

PDF [BibTex]

PDF [BibTex]


no image
Popper, Falsification and the VC-dimension

Corfield, D., Schölkopf, B., Vapnik, V.

(145), Max Planck Institute for Biological Cybernetics, November 2005 (techreport)

ei

PDF [BibTex]

PDF [BibTex]


no image
Extension to Kernel Dependency Estimation with Applications to Robotics

BakIr, G.

Biologische Kybernetik, Technische Universität Berlin, Berlin, November 2005 (phdthesis)

Abstract
Kernel Dependency Estimation(KDE) is a novel technique which was designed to learn mappings between sets without making assumptions on the type of the involved input and output data. It learns the mapping in two stages. In a first step, it tries to estimate coordinates of a feature space representation of elements of the set by solving a high dimensional multivariate regression problem in feature space. Following this, it tries to reconstruct the original representation given the estimated coordinates. This thesis introduces various algorithmic extensions to both stages in KDE. One of the contributions of this thesis is to propose a novel linear regression algorithm that explores low-dimensional subspaces during learning. Furthermore various existing strategies for reconstructing patterns from feature maps involved in KDE are discussed and novel pre-image techniques are introduced. In particular, pre-image techniques for data-types that are of discrete nature such as graphs and strings are investigated. KDE is then explored in the context of robot pose imitation where the input is a an image with a human operator and the output is the robot articulated variables. Thus, using KDE, robot pose imitation is formulated as a regression problem.

ei

PDF PDF [BibTex]

PDF PDF [BibTex]


no image
Kernel methods for dependence testing in LFP-MUA

Gretton, A., Belitski, A., Murayama, Y., Schölkopf, B., Logothetis, N.

35(689.17), 35th Annual Meeting of the Society for Neuroscience (Neuroscience), November 2005 (poster)

Abstract
A fundamental problem in neuroscience is determining whether or not particular neural signals are dependent. The correlation is the most straightforward basis for such tests, but considerable work also focuses on the mutual information (MI), which is capable of revealing dependence of higher orders that the correlation cannot detect. That said, there are other measures of dependence that share with the MI an ability to detect dependence of any order, but which can be easier to compute in practice. We focus in particular on tests based on the functional covariance, which derive from work originally accomplished in 1959 by Renyi. Conceptually, our dependence tests work by computing the covariance between (infinite dimensional) vectors of nonlinear mappings of the observations being tested, and then determining whether this covariance is zero - we call this measure the constrained covariance (COCO). When these vectors are members of universal reproducing kernel Hilbert spaces, we can prove this covariance to be zero only when the variables being tested are independent. The greatest advantage of these tests, compared with the mutual information, is their simplicity – when comparing two signals, we need only take the largest eigenvalue (or the trace) of a product of two matrices of nonlinearities, where these matrices are generally much smaller than the number of observations (and are very simple to construct). We compare the mutual information, the COCO, and the correlation in the context of finding changes in dependence between the LFP and MUA signals in the primary visual cortex of the anaesthetized macaque, during the presentation of dynamic natural stimuli. We demonstrate that the MI and COCO reveal dependence which is not detected by the correlation alone (which we prove by artificially removing all correlation between the signals, and then testing their dependence with COCO and the MI); and that COCO and the MI give results consistent with each other on our data.

ei

Web [BibTex]

Web [BibTex]


no image
Training Support Vector Machines with Multiple Equality Constraints

Kienzle, W., Schölkopf, B.

In Proceedings of the 16th European Conference on Machine Learning, Lecture Notes in Computer Science, Vol. 3720, pages: 182-193, (Editors: JG Carbonell and J Siekmann), Springer, Berlin, Germany, ECML, November 2005 (inproceedings)

Abstract
In this paper we present a primal-dual decomposition algorithm for support vector machine training. As with existing methods that use very small working sets (such as Sequential Minimal Optimization (SMO), Successive Over-Relaxation (SOR) or the Kernel Adatron (KA)), our method scales well, is straightforward to implement, and does not require an external QP solver. Unlike SMO, SOR and KA, the method is applicable to a large number of SVM formulations regardless of the number of equality constraints involved. The effectiveness of our algorithm is demonstrated on a more difficult SVM variant in this respect, namely semi-parametric support vector regression.

ei

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Geometrical aspects of statistical learning theory

Hein, M.

Biologische Kybernetik, Darmstadt, Darmstadt, November 2005 (phdthesis)

ei

PDF [BibTex]

PDF [BibTex]


no image
Measuring Statistical Dependence with Hilbert-Schmidt Norms

Gretton, A., Bousquet, O., Smola, A., Schoelkopf, B.

In Algorithmic Learning Theory, Lecture Notes in Computer Science, Vol. 3734, pages: 63-78, (Editors: S Jain and H-U Simon and E Tomita), Springer, Berlin, Germany, 16th International Conference ALT, October 2005 (inproceedings)

Abstract
We propose an independence criterion based on the eigenspectrum of covariance operators in reproducing kernel Hilbert spaces (RKHSs), consisting of an empirical estimate of the Hilbert-Schmidt norm of the cross-covariance operator (we term this a Hilbert-Schmidt Independence Criterion, or HSIC). This approach has several advantages, compared with previous kernel-based independence criteria. First, the empirical estimate is simpler than any other kernel dependence test, and requires no user-defined regularisation. Second, there is a clearly defined population quantity which the empirical estimate approaches in the large sample limit, with exponential convergence guaranteed between the two: this ensures that independence tests based on {methodname} do not suffer from slow learning rates. Finally, we show in the context of independent component analysis (ICA) that the performance of HSIC is competitive with that of previously published kernel-based criteria, and of other recently published ICA methods.

ei

PDF DOI [BibTex]

PDF DOI [BibTex]


no image
Maximal Margin Classification for Metric Spaces

Hein, M., Bousquet, O., Schölkopf, B.

Journal of Computer and System Sciences, 71(3):333-359, October 2005 (article)

Abstract
In order to apply the maximum margin method in arbitrary metric spaces, we suggest to embed the metric space into a Banach or Hilbert space and to perform linear classification in this space. We propose several embeddings and recall that an isometric embedding in a Banach space is always possible while an isometric embedding in a Hilbert space is only possible for certain metric spaces. As a result, we obtain a general maximum margin classification algorithm for arbitrary metric spaces (whose solution is approximated by an algorithm of Graepel. Interestingly enough, the embedding approach, when applied to a metric which can be embedded into a Hilbert space, yields the SVM algorithm, which emphasizes the fact that its solution depends on the metric and not on the kernel. Furthermore we give upper bounds of the capacity of the function classes corresponding to both embeddings in terms of Rademacher averages. Finally we compare the capacities of these function classes directly.

ei

PDF PDF DOI [BibTex]

PDF PDF DOI [BibTex]


no image
An Analysis of the Anti-Learning Phenomenon for the Class Symmetric Polyhedron

Kowalczyk, A., Chapelle, O.

In Algorithmic Learning Theory: 16th International Conference, pages: 78-92, Algorithmic Learning Theory, October 2005 (inproceedings)

Abstract
This paper deals with an unusual phenomenon where most machine learning algorithms yield good performance on the training set but systematically worse than random performance on the test set. This has been observed so far for some natural data sets and demonstrated for some synthetic data sets when the classification rule is learned from a small set of training samples drawn from some high dimensional space. The initial analysis presented in this paper shows that anti-learning is a property of data sets and is quite distinct from overfitting of a training data. Moreover, the analysis leads to a specification of some machine learning procedures which can overcome anti-learning and generate ma- chines able to classify training and test data consistently.

ei

PDF [BibTex]

PDF [BibTex]


no image
Selective integration of multiple biological data for supervised network inference

Kato, T., Tsuda, K., Asai, K.

Bioinformatics, 21(10):2488 , October 2005 (article)

ei

PDF [BibTex]

PDF [BibTex]


no image
Assessing Approximate Inference for Binary Gaussian Process Classification

Kuss, M., Rasmussen, C.

Journal of Machine Learning Research, 6, pages: 1679 , October 2005 (article)

Abstract
Gaussian process priors can be used to define flexible, probabilistic classification models. Unfortunately exact Bayesian inference is analytically intractable and various approximation techniques have been proposed. In this work we review and compare Laplace‘s method and Expectation Propagation for approximate Bayesian inference in the binary Gaussian process classification model. We present a comprehensive comparison of the approximations, their predictive performance and marginal likelihood estimates to results obtained by MCMC sampling. We explain theoretically and corroborate empirically the advantages of Expectation Propagation compared to Laplace‘s method.

ei

PDF PDF [BibTex]

PDF PDF [BibTex]


no image
Implicit Surfaces For Modelling Human Heads

Steinke, F.

Biologische Kybernetik, Eberhard-Karls-Universität, Tübingen, September 2005 (diplomathesis)

ei

[BibTex]

[BibTex]