2010


A generative model approach for decoding in the visual event-related potential-based brain-computer interface speller

Martens, SMM., Leiva, JM.

Journal of Neural Engineering, 7(2):1-10, April 2010 (article)

Abstract
There is a strong tendency towards discriminative approaches in brain-computer interface (BCI) research. We argue that generative model-based approaches are worth pursuing and propose a simple generative model for the visual ERP-based BCI speller which incorporates prior knowledge about the brain signals. We show that the proposed generative method needs less training data to reach a given letter prediction performance than the state of the art discriminative approaches.

ei

PDF PDF DOI [BibTex]

Hilbert Space Embeddings and Metrics on Probability Measures

Sriperumbudur, B., Gretton, A., Fukumizu, K., Schölkopf, B., Lanckriet, G.

Journal of Machine Learning Research, 11, pages: 1517-1561, April 2010 (article)

ei

PDF [BibTex]

Graph Kernels

Vishwanathan, SVN., Schraudolph, NN., Kondor, R., Borgwardt, KM.

Journal of Machine Learning Research, 11, pages: 1201-1242, April 2010 (article)

Abstract
We present a unified framework to study graph kernels, special cases of which include the random walk (Gärtner et al., 2003; Borgwardt et al., 2005) and marginalized (Kashima et al., 2003, 2004; Mahé et al., 2004) graph kernels. Through reduction to a Sylvester equation we improve the time complexity of kernel computation between unlabeled graphs with n vertices from O(n^6) to O(n^3). We find a spectral decomposition approach even more efficient when computing entire kernel matrices. For labeled graphs we develop conjugate gradient and fixed-point methods that take O(dn^3) time per iteration, where d is the size of the label set. By extending the necessary linear algebra to Reproducing Kernel Hilbert Spaces (RKHS) we obtain the same result for d-dimensional edge kernels, and O(n^4) in the infinite-dimensional case; on sparse graphs these algorithms only take O(n^2) time per iteration in all cases. Experiments on graphs from bioinformatics and other application domains show that these techniques can speed up computation of the kernel by an order of magnitude or more. We also show that certain rational kernels (Cortes et al., 2002, 2003, 2004) when specialized to graphs reduce to our random walk graph kernel. Finally, we relate our framework to R-convolution kernels (Haussler, 1999) and provide a kernel that is close to the optimal assignment kernel of Fröhlich et al. (2006) yet provably positive semi-definite.

ei

PDF Web [BibTex]
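
The fixed-point idea mentioned in the abstract above can be illustrated with a minimal sketch for unlabeled graphs. It assumes a geometric random walk kernel k = p^T (I - lam * W)^(-1) p, where W is the Kronecker product of the two adjacency matrices and p is a uniform start/stop distribution. The function names, the uniform p, and the decay parameter `lam` are illustrative assumptions, not taken from the paper. The iteration never forms W explicitly: it applies W to a vector through the identity (A1 ⊗ A2) vec(X) = vec(A1 X A2^T), which costs O(n^3) per step for dense n-vertex graphs.

```python
def matmul(A, B):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    """Return the transpose of a dense matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def random_walk_kernel(A1, A2, lam, tol=1e-12, max_iter=10_000):
    """Geometric random walk kernel between two unlabeled graphs.

    Solves x = p + lam * (A1 kron A2) x by fixed-point iteration on the
    n1 x n2 matrix X (the row-major reshaping of x), then returns p^T x.
    Converges when lam is smaller than 1 / (spectral radius of A1 kron A2).
    """
    n1, n2 = len(A1), len(A2)
    p = 1.0 / (n1 * n2)  # uniform start/stop weights (an assumption)
    P = [[p] * n2 for _ in range(n1)]
    X = [row[:] for row in P]
    A2T = transpose(A2)
    for _ in range(max_iter):
        AXA = matmul(matmul(A1, X), A2T)  # equals (A1 kron A2) vec(X), reshaped
        X_new = [[P[i][j] + lam * AXA[i][j] for j in range(n2)] for i in range(n1)]
        delta = max(abs(X_new[i][j] - X[i][j]) for i in range(n1) for j in range(n2))
        X = X_new
        if delta < tol:
            break
    return sum(P[i][j] * X[i][j] for i in range(n1) for j in range(n2))


triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
edge = [[0, 1], [1, 0]]
print(random_walk_kernel(triangle, edge, lam=0.25))
```

For two regular graphs with degrees d1 and d2 the uniform vector is an eigenvector of W, so this particular kernel has the closed form (1 / (n1 * n2)) / (1 - lam * d1 * d2), which gives a quick sanity check for the iteration.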

Gene function prediction from synthetic lethality networks via ranking on demand

Lippert, C., Ghahramani, Z., Borgwardt, KM.

Bioinformatics, 26(7):912-918, April 2010 (article)

Abstract
Motivation: Synthetic lethal interactions represent pairs of genes whose individual mutations are not lethal, while the double mutation of both genes does incur lethality. Several studies have shown a correlation between functional similarity of genes and their distances in networks based on synthetic lethal interactions. However, there is a lack of algorithms for predicting gene function from synthetic lethality interaction networks. Results: In this article, we present a novel technique called kernelROD for gene function prediction from synthetic lethal interaction networks based on kernel machines. We apply our novel algorithm to Gene Ontology functional annotation prediction in yeast. Our experiments show that our method leads to improved gene function prediction compared with state-of-the-art competitors and that combining genetic and congruence networks leads to a further improvement in prediction accuracy.

ei

Web DOI [BibTex]

Cooperative Cuts: Graph Cuts with Submodular Edge Weights

Jegelka, S., Bilmes, J.

(189), Max Planck Institute for Biological Cybernetics, Tuebingen, Germany, March 2010 (techreport)

Abstract
We introduce a problem we call Cooperative cut, where the goal is to find a minimum-cost graph cut but where a submodular function is used to define the cost of a subset of edges. That is, the cost of an edge that is added to the current cut set C depends on the edges in C. This generalization of the cost in the standard min-cut problem to a submodular cost function immediately makes the problem harder. Not only do we prove NP hardness even for nonnegative submodular costs, but we also show a lower bound of Omega(|V|^(1/3)) on the approximation factor for the problem. On the positive side, we propose and compare four approximation algorithms with an overall approximation factor of min { |V|/2, |C*|, O( sqrt(|E|) log |V|), |P_max|}, where C* is the optimal solution, and P_max is the longest s, t path across the cut between given s, t. We also introduce additional heuristics for the problem which have attractive properties from the perspective of practical applications and implementations in that existing fast min-cut libraries may be used as subroutines. Both our approximation algorithms and our heuristics appear to do well in practice.

ei

PDF [BibTex]
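
To make the notion of a submodular edge cost in the abstract above concrete, here is a small sketch. The cost function (square root of the summed edge weights) and the toy edge weights are assumptions chosen for illustration, not taken from the report. A concave, nondecreasing function of a nonnegative modular function is submodular, so the marginal cost of adding an edge shrinks as the current cut set C grows.

```python
import math


def cooperative_cost(edges, weight):
    """Illustrative submodular cost: sqrt of the total weight of the edge set."""
    return math.sqrt(sum(weight[e] for e in edges))


def marginal_cost(edge, current_cut, weight):
    """Extra cost of adding `edge` to the cut set `current_cut`."""
    return cooperative_cost(current_cut | {edge}, weight) - cooperative_cost(current_cut, weight)


# Hypothetical edge weights on a tiny s-t graph.
weight = {("s", "a"): 4.0, ("a", "t"): 5.0, ("s", "b"): 7.0}
new_edge = ("s", "a")

for cut in [frozenset(), frozenset({("a", "t")}), frozenset({("a", "t"), ("s", "b")})]:
    print(sorted(cut), marginal_cost(new_edge, cut, weight))
```

With a standard (modular) min-cut cost, the marginal cost of `new_edge` would be the same constant for every cut set; here it decreases as edges accumulate, which is the dependence on C that the abstract describes.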

A toolbox for predicting G-quadruplex formation and stability

Wong, HM., Stegle, O., Rodgers, S., Huppert, J.

Journal of Nucleic Acids, 2010(564946):1-6, March 2010 (article)

Abstract
G-quadruplexes are four stranded nucleic acid structures formed around a core of guanines, arranged in squares with mutual hydrogen bonding. Many of these structures are highly thermally stable, especially in the presence of monovalent cations, such as those found under physiological conditions. Understanding of their physiological roles is expanding rapidly, and they have been implicated in regulating gene transcription and translation among other functions. We have built a community-focused website to act as a repository for the information that is now being developed. At its core, this site has a detailed database (QuadDB) of predicted G-quadruplexes in the human and other genomes, together with the predictive algorithm used to identify them. We also provide a QuadPredict server, which predicts thermal stability and acts as a repository for experimental data from all researchers. There are also a number of other data sources with computational predictions. We anticipate that the wide availability of this information will be of use both to researchers already active in this exciting field and to those who wish to investigate a particular gene hypothesis.

ei

PDF DOI [BibTex]

A Novel Protocol for Accuracy Assessment in Classification of Very High Resolution Images

Persello, C., Bruzzone, L.

IEEE Transactions on Geoscience and Remote Sensing, 48(3):1232-1244, March 2010 (article)

Abstract
This paper presents a novel protocol for the accuracy assessment of the thematic maps obtained by the classification of very high resolution images. As the thematic accuracy alone is not sufficient to adequately characterize the geometrical properties of high-resolution classification maps, we propose a protocol that is based on the analysis of two families of indices: 1) the traditional thematic accuracy indices and 2) a set of novel geometric indices that model different geometric properties of the objects recognized in the map. In this context, we present a set of indices that characterize five different types of geometric errors in the classification map: 1) oversegmentation; 2) undersegmentation; 3) edge location; 4) shape distortion; and 5) fragmentation. Moreover, we propose a new approach for tuning the free parameters of supervised classifiers on the basis of a multiobjective criterion function that aims at selecting the parameter values that result in the classification map that jointly optimize thematic and geometric error indices. Experimental results obtained on QuickBird images show the effectiveness of the proposed protocol in selecting classification maps characterized by a better tradeoff between thematic and geometric accuracies than standard procedures based only on thematic accuracy measures. In addition, results obtained with support vector machine classifiers confirm the effectiveness of the proposed multiobjective technique for the selection of free-parameter values for the classification algorithm.

ei

Web DOI [BibTex]

On the Entropy Production of Time Series with Unidirectional Linearity

Janzing, D.

Journal of Statistical Physics, 138(4-5):767-779, March 2010 (article)

Abstract
There are non-Gaussian time series that admit a causal linear autoregressive moving average (ARMA) model when regressing the future on the past, but not when regressing the past on the future. The reason is that, in the latter case, the regression residuals are not statistically independent of the regressor. In previous work, we have experimentally verified that many empirical time series indeed show such a time inversion asymmetry. For various physical systems, it is known that time-inversion asymmetries are linked to the thermodynamic entropy production in non-equilibrium states. Here we argue that unidirectional linearity is also accompanied by entropy generation. To this end, we study the dynamical evolution of a physical toy system with linear coupling to an infinite environment and show that the linearity of the dynamics is inherited by the forward-time conditional probabilities, but not by the backward-time conditionals. The reason is that the environment permanently provides particles that are in a product state before they interact with the system, but show statistical dependence afterwards. From a coarse-grained perspective, the interaction thus generates entropy. We quantitatively relate the strength of the non-linearity of the backward process to the minimal amount of entropy generation. The paper thus shows that unidirectional linearity is an indirect implication of the thermodynamic arrow of time, given that the joint dynamics of the system and its environment is linear.

ei

PDF DOI [BibTex]

Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning

Morimura, T., Uchibe, E., Yoshimoto, J., Peters, J., Doya, K.

Neural Computation, 22(2):342-376, February 2010 (article)

Abstract
Most conventional policy gradient reinforcement learning (PGRL) algorithms neglect (or do not explicitly make use of) a term in the average reward gradient with respect to the policy parameter. That term involves the derivative of the stationary state distribution that corresponds to the sensitivity of its distribution to changes in the policy parameter. Although the bias introduced by this omission can be reduced by setting the forgetting rate γ for the value functions close to 1, these algorithms do not permit γ to be set exactly at γ = 1. In this article, we propose a method for estimating the log stationary state distribution derivative (LSD) as a useful form of the derivative of the stationary state distribution through backward Markov chain formulation and a temporal difference learning framework. A new policy gradient (PG) framework with an LSD is also proposed, in which the average reward gradient can be estimated by setting γ = 0, so it becomes unnecessary to learn the value functions. We also test the performance of the proposed algorithms using simple benchmark tasks and show that these can improve the performances of existing PG methods.

ei

PDF Web DOI [BibTex]

Bayesian Online Multitask Learning of Gaussian Processes

Pillonetto, G., Dinuzzo, F., De Nicolao, G.

IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(2):193-205, February 2010 (article)

Abstract
Standard single-task kernel methods have recently been extended to the case of multitask learning in the context of regularization theory. There are experimental results, especially in biomedicine, showing the benefit of the multitask approach compared to the single-task one. However, a possible drawback is computational complexity. For instance, when regularization networks are used, complexity scales as the cube of the overall number of training data, which may be large when several tasks are involved. The aim of this paper is to derive an efficient computational scheme for an important class of multitask kernels. More precisely, a quadratic loss is assumed and each task consists of the sum of a common term and a task-specific one. Within a Bayesian setting, a recursive online algorithm is obtained, which updates both estimates and confidence intervals as new data become available. The algorithm is tested on two simulated problems and a real data set relative to xenobiotics administration in human patients.

ei

DOI [BibTex]

The semigroup approach to transport processes in networks

Dorn, B., Fijavz, M., Nagel, R., Radl, A.

Physica D: Nonlinear Phenomena, 239(15):1416-1421, January 2010 (article)

Abstract
We explain how operator semigroups can be used to study transport processes in networks. This method is applied to a linear Boltzmann equation on a finite as well as on an infinite network and yields well-posedness and information on the long term behavior of the solutions to the presented problems.

ei

Web DOI [BibTex]

Optimization of k-Space Trajectories for Compressed Sensing by Bayesian Experimental Design

Seeger, M., Nickisch, H., Pohmann, R., Schölkopf, B.

Magnetic Resonance in Medicine, 63(1):116-126, January 2010 (article)

Abstract
The optimization of k-space sampling for nonlinear sparse MRI reconstruction is phrased as a Bayesian experimental design problem. Bayesian inference is approximated by a novel relaxation to standard signal processing primitives, resulting in an efficient optimization algorithm for Cartesian and spiral trajectories. On clinical resolution brain image data from a Siemens 3T scanner, automatically optimized trajectories lead to significantly improved images, compared to standard low-pass, equispaced, or variable density randomized designs. Insights into the nonlinear design optimization problem for MRI are given.

ei

Web DOI [BibTex]

Consistent Nonparametric Tests of Independence

Gretton, A., Györfi, L.

Journal of Machine Learning Research, 11, pages: 1391-1423, 2010 (article)

ei

PDF [BibTex]

Inferring latent task structure for Multitask Learning by Multiple Kernel Learning

Widmer, C., Toussaint, N., Altun, Y., Rätsch, G.

BMC Bioinformatics, 11 Suppl 8, pages: S5, 2010 (article)

Abstract
The lack of sufficient training data is the limiting factor for many Machine Learning applications in Computational Biology. If data is available for several different but related problem domains, Multitask Learning algorithms can be used to learn a model based on all available information. In Bioinformatics, many problems can be cast into the Multitask Learning scenario by incorporating data from several organisms. However, combining information from several tasks requires careful consideration of the degree of similarity between tasks. Our proposed method simultaneously learns or refines the similarity between tasks along with the Multitask Learning classifier. This is done by formulating the Multitask Learning problem as Multiple Kernel Learning, using the recently published q-Norm MKL algorithm.

ei

Web DOI [BibTex]

Information-theoretic inference of common ancestors

Steudel, B., Ay, N.

Computing Research Repository (CoRR), abs/1010.5720, pages: 18, 2010 (techreport)

ei

Web [BibTex]

Molecular QED of coherent and incoherent sum-frequency and second-harmonic generation in chiral liquids in the presence of a static electric field

Fischer, P., Salam, A.

Molecular Physics, 108(14):1857-1868, 2010 (article)

Abstract
Coherent second-order nonlinear optical processes are symmetry forbidden in centrosymmetric environments in the electric-dipole approximation. In liquids that contain chiral molecules, however, and which therefore lack mirror image symmetry, coherent sum-frequency generation is possible, whereas second-harmonic generation remains forbidden. Here we apply the theory of molecular quantum electrodynamics to the calculation of the matrix element, transition rate, and integrated signal intensity for sum-frequency and second-harmonic generation taking place in a chiral liquid in the presence and absence of a static electric field, to examine which coherent and incoherent processes exist in the electric-dipole approximation in liquids. Third- and fourth-order time-dependent perturbation theory is employed in combination with single-sided Feynman diagrams to evaluate two contributions arising from static field-free and field-induced processes. It is found that, in addition to the coherent term, an incoherent process exists for sum-frequency generation in liquids. Surprisingly, in the case of dc-field-induced second-harmonic generation, the incoherent contribution is found to always vanish for isotropic chiral liquids even though hyper-Rayleigh second-harmonic generation and electric-field-induced second-harmonic generation are both independently symmetry allowed in any liquid.

pf

DOI [BibTex]

2004


On the representation, learning and transfer of spatio-temporal movement characteristics

Ilg, W., Bakir, GH., Mezger, J., Giese, M.

International Journal of Humanoid Robotics, 1(4):613-636, December 2004 (article)

ei

[BibTex]

Insect-inspired estimation of egomotion

Franz, MO., Chahl, JS., Krapp, HG.

Neural Computation, 16(11):2245-2260, November 2004 (article)

Abstract
Tangential neurons in the fly brain are sensitive to the typical optic flow patterns generated during egomotion. In this study, we examine whether a simplified linear model based on the organization principles in tangential neurons can be used to estimate egomotion from the optic flow. We present a theory for the construction of an estimator consisting of a linear combination of optic flow vectors that incorporates prior knowledge both about the distance distribution of the environment, and about the noise and egomotion statistics of the sensor. The estimator is tested on a gantry carrying an omnidirectional vision sensor. The experiments show that the proposed approach leads to accurate and robust estimates of rotation rates, whereas translation estimates are of reasonable quality, albeit less reliable.

ei

PDF PostScript Web DOI [BibTex]

Efficient face detection by a cascaded support-vector machine expansion

Romdhani, S., Torr, P., Schölkopf, B., Blake, A.

Proceedings of The Royal Society of London A, 460(2501):3283-3297, November 2004 (article)

Abstract
We describe a fast system for the detection and localization of human faces in images using a nonlinear ‘support-vector machine’. We approximate the decision surface in terms of a reduced set of expansion vectors and propose a cascaded evaluation which has the property that the full support-vector expansion is only evaluated on the face-like parts of the image, while the largest part of typical images is classified using a single expansion vector (a simpler and more efficient classifier). As a result, only three reduced-set vectors are used, on average, to classify an image patch. Hence, the cascaded evaluation presented in this paper offers a thirtyfold speed-up over an evaluation using the full set of reduced-set vectors, which is itself already thirty times faster than classification using all the support vectors.

ei

PDF DOI [BibTex]

Joint Kernel Maps

Weston, J., Schölkopf, B., Bousquet, O., Mann, .., Noble, W.

(131), Max-Planck-Institute for Biological Cybernetics, Tübingen, November 2004 (techreport)

ei

PDF [BibTex]

Semi-Supervised Induction

Yu, K., Tresp, V., Zhou, D.

(141), Max Planck Institute for Biological Cybernetics, Tuebingen, Germany, August 2004 (techreport)

Abstract
Considerable progress was recently achieved on semi-supervised learning, which differs from traditional supervised learning by additionally exploring the information of the unlabelled examples. However, a disadvantage of many existing methods is that they do not generalize to unseen inputs. This paper investigates learning methods that effectively make use of both labelled and unlabelled data to build predictive functions, which are defined on not just the seen inputs but the whole space. As a nice property, the proposed method allows efficient training and can easily handle new test points. We validate the method based on both toy data and real world data sets.

ei

PDF PDF [BibTex]

Learning kernels from biological networks by maximizing entropy

Tsuda, K., Noble, W.

Bioinformatics, 20(Suppl. 1):i326-i333, August 2004 (article)

Abstract
Motivation: The diffusion kernel is a general method for computing pairwise distances among all nodes in a graph, based on the sum of weighted paths between each pair of nodes. This technique has been used successfully, in conjunction with kernel-based learning methods, to draw inferences from several types of biological networks. Results: We show that computing the diffusion kernel is equivalent to maximizing the von Neumann entropy, subject to a global constraint on the sum of the Euclidean distances between nodes. This global constraint allows for high variance in the pairwise distances. Accordingly, we propose an alternative, locally constrained diffusion kernel, and we demonstrate that the resulting kernel allows for more accurate support vector machine prediction of protein functional classifications from metabolic and protein–protein interaction networks.

ei

PDF Web [BibTex]
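
As a rough illustration of the diffusion kernel discussed in the abstract above, the sketch below computes K = exp(beta * H) with H = A - D (the negative graph Laplacian), which is the standard definition of the diffusion kernel on a graph. The truncated Taylor series makes the "sum of weighted paths" reading explicit, since the k-th term weights length-k walks by beta^k / k!. The function names, the `terms` cutoff, and the choice of a plain Taylor series are assumptions made for readability; this is not the locally constrained kernel proposed in the paper, and for large beta or large graphs an eigendecomposition would be the numerically safer route.

```python
def matmul(A, B):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def diffusion_kernel(adj, beta, terms=40):
    """Diffusion kernel exp(beta * H) of an undirected graph, H = A - D.

    `adj` is a symmetric adjacency matrix with zero diagonal. The matrix
    exponential is approximated by the first `terms` Taylor terms.
    """
    n = len(adj)
    # Negative Laplacian: off-diagonal entries are edge weights,
    # diagonal entries are minus the node degree.
    H = [[adj[i][j] - (sum(adj[i]) if i == j else 0.0) for j in range(n)] for i in range(n)]
    K = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # k = 0 term
    term = [row[:] for row in K]
    for k in range(1, terms):
        # term_k = term_{k-1} * (beta * H) / k = (beta * H)^k / k!
        term = [[beta * v / k for v in row] for row in matmul(term, H)]
        K = [[K[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return K


path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
for row in diffusion_kernel(path, beta=0.5):
    print([round(v, 4) for v in row])
```

Because the rows of H sum to zero, each row of the resulting kernel sums to one, and neighbouring nodes receive larger kernel values than distant ones.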

Masking effect produced by Mach bands on the detection of narrow bars of random polarity

Henning, GB., Hoddinott, KT., Wilson-Smith, ZJ., Hill, NJ.

Journal of the Optical Society of America A, 21(8):1379-1387, August 2004 (article)

ei

[BibTex]

Object categorization with SVM: kernels for local features

Eichhorn, J., Chapelle, O.

(137), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, July 2004 (techreport)

Abstract
In this paper, we propose to combine an efficient image representation based on local descriptors with a Support Vector Machine classifier in order to perform object categorization. For this purpose, we apply kernels defined on sets of vectors. After testing different combinations of kernel / local descriptors, we have been able to identify a very performant one.

ei

PDF [BibTex]

Hilbertian Metrics and Positive Definite Kernels on Probability Measures

Hein, M., Bousquet, O.

(126), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, July 2004 (techreport)

Abstract
We investigate the problem of defining Hilbertian metrics and, correspondingly, positive definite kernels on probability measures, continuing previous work. This type of kernel has shown very good results in text classification and has a wide range of possible applications. In this paper we extend the two-parameter family of Hilbertian metrics of Topsoe such that it now includes all commonly used Hilbertian metrics on probability measures. This allows us to do model selection among these metrics in an elegant and unified way. Second, we further investigate our approach to incorporating similarity information of the probability space into the kernel. The analysis provides a better understanding of these kernels and gives in some cases a more efficient way to compute them. Finally, we compare all proposed kernels in two text and one image classification problem.

ei

PDF [BibTex]

Kernels, Associated Structures and Generalizations

Hein, M., Bousquet, O.

(127), Max Planck Institute for Biological Cybernetics, Tübingen, Germany, July 2004 (techreport)

Abstract
This paper gives a survey of results in the mathematical literature on positive definite kernels and their associated structures. We concentrate on properties which seem potentially relevant for Machine Learning and try to clarify some results that have been misused in the literature. Moreover we consider different lines of generalizations of positive definite kernels. Namely we deal with operator-valued kernels and present the general framework of Hilbertian subspaces of Schwartz which we use to introduce kernels which are distributions. Finally indefinite kernels and their associated reproducing kernel spaces are considered.

ei

PDF [BibTex]

Support Vector Channel Selection in BCI

Lal, T., Schröder, M., Hinterberger, T., Weston, J., Bogdan, M., Birbaumer, N., Schölkopf, B.

IEEE Transactions on Biomedical Engineering, 51(6):1003-1010, June 2004 (article)

Abstract
When designing a Brain Computer Interface (BCI) system, one can choose from a variety of features that may be useful for classifying brain activity during a mental task. For the special case of classifying EEG signals, we propose using the state-of-the-art feature selection algorithms Recursive Feature Elimination and Zero-Norm Optimization, which are based on the training of Support Vector Machines (SVM). These algorithms can provide more accurate solutions than standard filter methods for feature selection. We adapt the methods for the purpose of selecting EEG channels. For a motor imagery paradigm we show that the number of used channels can be reduced significantly without increasing the classification error. The resulting best channels agree well with the expected underlying cortical activity patterns during the mental tasks. Furthermore, we show how time-dependent, task-specific information can be visualized.

ei

DOI [BibTex]

Distance-Based Classification with Lipschitz Functions

von Luxburg, U., Bousquet, O.

Journal of Machine Learning Research, 5, pages: 669-695, June 2004 (article)

Abstract
The goal of this article is to develop a framework for large margin classification in metric spaces. We want to find a generalization of linear decision functions for metric spaces and define a corresponding notion of margin such that the decision function separates the training points with a large margin. It will turn out that using Lipschitz functions as decision functions, the inverse of the Lipschitz constant can be interpreted as the size of a margin. In order to construct a clean mathematical setup we isometrically embed the given metric space into a Banach space and the space of Lipschitz functions into its dual space. To analyze the resulting algorithm, we prove several representer theorems. They state that there always exist solutions of the Lipschitz classifier which can be expressed in terms of distance functions to training points. We provide generalization bounds for Lipschitz classifiers in terms of the Rademacher complexities of some Lipschitz function classes. The generality of our approach can be seen from the fact that several well-known algorithms are special cases of the Lipschitz classifier, among them the support vector machine, the linear programming machine, and the 1-nearest neighbor classifier.

ei

PDF PostScript PDF [BibTex]
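
The abstract above notes that solutions can be expressed through distance functions to training points and that the 1-nearest neighbor classifier arises as a special case. The sketch below illustrates that connection only; it is not claimed to be the paper's exact construction. The decision function is the difference of the distances to the nearest negative and nearest positive training point. Each distance-to-set function is 1-Lipschitz, so the difference is Lipschitz with constant at most 2, and its sign is the 1-nearest-neighbor label. All names and the toy data are hypothetical.

```python
def dist_to_set(x, points, d):
    """Distance from x to the closest element of `points` under metric d."""
    return min(d(x, p) for p in points)


def decision(x, positives, negatives, d):
    """Positive when x is closer to the positive set than to the negative set."""
    return dist_to_set(x, negatives, d) - dist_to_set(x, positives, d)


def classify(x, positives, negatives, d):
    """1-nearest-neighbor label in an arbitrary metric space (ties go to +1)."""
    return 1 if decision(x, positives, negatives, d) >= 0 else -1


# Toy example on the real line with the absolute-difference metric.
def metric(a, b):
    return abs(a - b)


positives = [0.0, 1.0]
negatives = [4.0, 5.0]
for x in [1.5, 2.4, 3.5]:
    print(x, decision(x, positives, negatives, metric), classify(x, positives, negatives, metric))
```

Only a metric `d` is required, which is the point of working in metric spaces: no vector space structure or kernel is needed to evaluate the classifier.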

cDNA-Microarray Technology in Cartilage Research - Functional Genomics of Osteoarthritis [in German]

Aigner, T., Finger, F., Zien, A., Bartnik, E.

Zeitschrift für Orthopädie und ihre Grenzgebiete, 142(2):241-247, April 2004 (article)

Abstract
Functional genomics represents a new challenging approach in order to analyze complex diseases such as osteoarthritis on a molecular level. The characterization of the molecular changes of the cartilage cells, the chondrocytes, enables a better understanding of the pathomechanisms of the disease. In particular, the identification and characterization of new target molecules for therapeutic intervention is of interest. Also, potential molecular markers for diagnosis and monitoring of osteoarthritis contribute to a more appropriate patient management. The DNA-microarray technology complements (but does not replace) biochemical and biological research in new disease-relevant genes. Large-scale functional genomics will identify molecular networks such as yet identified players in the anabolic-catabolic balance of articular cartilage as well as disease-relevant intracellular signaling cascades so far rather unknown in articular chondrocytes. However, at the moment it is also important to recognize the limitations of the microarray technology in order to avoid over-interpretation of the results. This might lead to misleading results and prevent to a significant extent a proper use of the potential of this technology in the field of osteoarthritis.

ei

[BibTex]

A Compression Approach to Support Vector Model Selection

von Luxburg, U., Bousquet, O., Schölkopf, B.

Journal of Machine Learning Research, 5, pages: 293-323, April 2004 (article)

Abstract
In this paper we investigate connections between statistical learning theory and data compression on the basis of support vector machine (SVM) model selection. Inspired by several generalization bounds we construct "compression coefficients" for SVMs which measure the amount by which the training labels can be compressed by a code built from the separating hyperplane. The main idea is to relate the coding precision to geometrical concepts such as the width of the margin or the shape of the data in the feature space. The so derived compression coefficients combine well known quantities such as the radius-margin term R^2/rho^2, the eigenvalues of the kernel matrix, and the number of support vectors. To test whether they are useful in practice we ran model selection experiments on benchmark data sets. As a result we found that compression coefficients can fairly accurately predict the parameters for which the test error is minimized.

ei

PDF [BibTex]

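Kamerakalibrierung und Tiefenschätzung: Ein Vergleich von klassischer Bündelblockausgleichung und statistischen Lernalgorithmen (Camera Calibration and Depth Estimation: A Comparison of Classical Bundle Block Adjustment and Statistical Learning Algorithms) [in German]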

Sinz, FH.

Wilhelm-Schickard-Institut für Informatik, Universität Tübingen, Tübingen, Germany, March 2004 (techreport)

Abstract
This work compares two approaches to the problem of estimating the spatial position of a point from its image coordinates in two different cameras. The classical method of bundle block adjustment models two individual cameras and estimates their exterior and interior orientation with an iterative calibration method whose convergence depends very strongly on good initial values. The depth of a point is estimated by inverting three of the four projection equations of the single-camera models. The second method uses Kernel Ridge Regression and Support Vector Regression to directly learn a mapping from image coordinates to spatial coordinates. The results show that the machine learning approach, in addition to considerably simplifying the calibration process, can lead to higher position accuracy.

ei

PDF [BibTex]

Experimentally optimal ν in support vector regression for different noise models and parameter settings

Chalimourda, A., Schölkopf, B., Smola, A.

Neural Networks, 17(1):127-141, January 2004 (article)

Abstract
In Support Vector (SV) regression, a parameter ν controls the number of Support Vectors and the number of points that come to lie outside of the so-called ε-insensitive tube. For various noise models and SV parameter settings, we experimentally determine the values of ν that lead to the lowest generalization error. We find good agreement with the values that had previously been predicted by a theoretical argument based on the asymptotic efficiency of a simplified model of SV regression. As a side effect of the experiments, valuable information about the generalization behavior of the remaining SVM parameters and their dependencies is gained. The experimental findings are valid even for complex ‘real-world’ data sets. Based on our results on the role of the ν-SVM parameters, we discuss various model selection methods.

ei

PDF DOI [BibTex]

Protein ranking: from local to global structure in the protein similarity network

Weston, J., Elisseeff, A., Zhou, D., Leslie, C., Noble, W.

Proceedings of the National Academy of Sciences, 101(17):6559-6563, 2004 (article)

Abstract
Biologists regularly search databases of DNA or protein sequences for evolutionary or functional relationships to a given query sequence. We describe a ranking algorithm that exploits the entire network structure of similarity relationships among proteins in a sequence database by performing a diffusion operation on a pre-computed, weighted network. The resulting ranking algorithm, evaluated using a human-curated database of protein structures, is efficient and provides significantly better rankings than a local network search algorithm such as PSI-BLAST.
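The diffusion idea in this abstract can be illustrated with a minimal sketch that ranks the nodes of a small weighted similarity network by iterating f ← α·S·f + (1 − α)·y, where S is the symmetrically normalized weight matrix and y marks the query node. This is a generic diffusion-ranking scheme and not necessarily the exact algorithm of the paper. The function name `diffusion_rank`, the parameter `alpha`, and the toy network are assumptions made for illustration.

```python
import math

def diffusion_rank(weights, query, alpha=0.5, iterations=100):
    """Rank the nodes of a weighted, undirected network by diffusion from `query`.

    weights: symmetric list of lists of non-negative similarities, zero diagonal.
    Returns the node indices sorted from most to least similar to the query.
    """
    n = len(weights)
    # Weighted degree of each node; fall back to 1.0 for isolated nodes.
    degree = [sum(row) or 1.0 for row in weights]
    # Symmetric normalization: S[i][j] = W[i][j] / sqrt(d_i * d_j).
    s = [[weights[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
         for i in range(n)]
    y = [1.0 if i == query else 0.0 for i in range(n)]
    f = y[:]
    for _ in range(iterations):
        # Spread the current scores along the edges, then re-inject the query.
        f = [alpha * sum(s[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return sorted(range(n), key=lambda i: -f[i])

# Toy network (hypothetical): a chain 0 - 1 - 2 - 3 plus a weak direct link 0 - 3.
W = [
    [0.0, 1.0, 0.0, 0.1],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.1, 0.0, 1.0, 0.0],
]
ranking = diffusion_rank(W, query=0)
```

With these toy weights, node 2 ends up ranked above node 3 although only node 3 has a direct (weak) link to the query. A purely local search over direct neighbours would order them the other way round.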

ei

Web [BibTex]

Multivariate Regression with Stiefel Constraints

Bakir, G., Gretton, A., Franz, M., Schölkopf, B.

(128), MPI for Biological Cybernetics, Spemannstr 38, 72076, Tuebingen, 2004 (techreport)

Abstract
We introduce a new framework for regression between multi-dimensional spaces. Standard methods for solving this problem typically reduce the problem to one-dimensional regression by choosing features in the input and/or output spaces. These methods, which include PLS (partial least squares), KDE (kernel dependency estimation), and PCR (principal component regression), select features based on different a priori judgments as to their relevance. Moreover, loss function and constraints are chosen not primarily on statistical grounds, but to simplify the resulting optimization. By contrast, in our approach the feature construction and the regression estimation are performed jointly, directly minimizing a loss function that we specify, subject to a rank constraint. A major advantage of this approach is that the loss is no longer chosen according to the algorithmic requirements, but can be tailored to the characteristics of the task at hand; the features will then be optimal with respect to this objective. Our approach also allows for the possibility of using a regularizer in the optimization. Finally, by processing the observations sequentially, our algorithm is able to work on large scale problems.

ei

PDF [BibTex]

Statistical Performance of Support Vector Machines

Blanchard, G., Bousquet, O., Massart, P.

2004 (article)

ei

PostScript [BibTex]


Asymptotic Properties of the Fisher Kernel

Tsuda, K., Akaho, S., Kawanabe, M., Müller, K.

Neural Computation, 16(1):115-137, 2004 (article)

ei

PDF [BibTex]

Some observations on the effects of slant and texture type on slant-from-texture

Rosas, P., Wichmann, F., Wagemans, J.

Vision Research, 44(13):1511-1535, 2004 (article)

Abstract
We measure the performance of five subjects in a slant-discrimination task for differently textured planes. As textures we used uniform lattices, randomly displaced lattices, circles (polka dots), Voronoi tessellations, plaids, 1/f noise, “coherent” noise and a leopard skin-like texture. Our results show: (1) Improving performance with larger slants for all textures. (2) Thus, following from (1), cases of “non-symmetrical” performance around a particular orientation. (3) For orientations sufficiently slanted, the different textures do not elicit major differences in performance, (4) while for orientations closer to the vertical plane there are marked differences between them. (5) These differences allow a rank-order of textures to be formed according to their “helpfulness”, that is, how easy the discrimination task is when a particular texture is mapped on the plane. Polka dots tend to allow the best slant discrimination performance, noise patterns the worst. Two additional experiments were conducted to test the generality of the obtained rank-order. First, the tilt of the planes was rotated to break the axis of gravity present in the original discrimination experiment. Second, the task was changed to a slant report task via probe adjustment. The results of both control experiments confirmed the texture-based rank-order previously obtained. We comment on the importance of these results for depth perception research in general, and in particular the implications our results have for studies of cue combination (sensor fusion) using texture as one of the cues involved.

ei

PDF [BibTex]

Learning from Labeled and Unlabeled Data Using Random Walks

Zhou, D., Schölkopf, B.

Max Planck Institute for Biological Cybernetics, 2004 (techreport)

Abstract
We consider the general problem of learning from labeled and unlabeled data. Given a set of points, some of which are labeled and the rest unlabeled, the goal is to predict the labels of the unlabeled points. Any supervised learning algorithm can be applied to this problem, for instance, Support Vector Machines (SVMs). The question of interest is whether we can implement a classifier that uses the unlabeled data in some way and achieves higher accuracy than classifiers that use the labeled data only. Recently we proposed a simple algorithm, which can substantially benefit from large amounts of unlabeled data and demonstrates clear superiority to supervised learning methods. In this paper we further investigate the algorithm using random walks and spectral graph theory, which shed light on the key steps in this algorithm.

ei

PDF PostScript [BibTex]

Minimizing the Cross Validation Error to Mix Kernel Matrices of Heterogeneous Biological Data

Tsuda, K., Uda, S., Kin, T., Asai, K.

Neural Processing Letters, 19, pages: 63-72, 2004 (article)

ei

PDF [BibTex]

A Tutorial on Support Vector Regression

Smola, A., Schölkopf, B.

Statistics and Computing, 14(3):199-222, 2004 (article)

ei

Web [BibTex]

Behaviour and Convergence of the Constrained Covariance

Gretton, A., Smola, A., Bousquet, O., Herbrich, R., Schölkopf, B., Logothetis, N.

(130), MPI for Biological Cybernetics, 2004 (techreport)

Abstract
We discuss reproducing kernel Hilbert space (RKHS)-based measures of statistical dependence, with emphasis on constrained covariance (COCO), a novel criterion to test dependence of random variables. We show that COCO is a test for independence if and only if the associated RKHSs are universal. That said, no independence test exists that can distinguish dependent and independent random variables in all circumstances. Dependent random variables can result in a COCO which is arbitrarily close to zero when the source densities are highly non-smooth, which can make dependence hard to detect empirically. All current kernel-based independence tests share this behaviour. Finally, we demonstrate exponential convergence between the population and empirical COCO, which implies that COCO does not suffer from slow learning rates when used as a dependence test.

ei

PDF [BibTex]

Bayesian analysis of the Scatterometer Wind Retrieval Inverse Problem: Some New Approaches

Cornford, D., Csato, L., Evans, D., Opper, M.

Journal of the Royal Statistical Society B, 66, pages: 1-17, 3, 2004 (article)

Abstract
The retrieval of wind vectors from satellite scatterometer observations is a non-linear inverse problem. A common approach to solving inverse problems is to adopt a Bayesian framework and to infer the posterior distribution of the parameters of interest given the observations by using a likelihood model relating the observations to the parameters, and a prior distribution over the parameters. We show how Gaussian process priors can be used efficiently with a variety of likelihood models, using local forward (observation) models and direct inverse models for the scatterometer. We present an enhanced Markov chain Monte Carlo method to sample from the resulting multimodal posterior distribution. We go on to show how the computational complexity of the inference can be controlled by using a sparse, sequential Bayes algorithm for estimation with Gaussian processes. This helps to overcome the most serious barrier to the use of probabilistic, Gaussian process methods in remote sensing inverse problems, which is the prohibitively large size of the data sets. We contrast the sampling results with the approximations that are found by using the sparse, sequential Bayes algorithm.

ei

PDF [BibTex]

Feature Selection for Support Vector Machines Using Genetic Algorithms

Fröhlich, H., Chapelle, O., Schölkopf, B.

International Journal on Artificial Intelligence Tools (Special Issue on Selected Papers from the 15th IEEE International Conference on Tools with Artificial Intelligence 2003), 13(4):791-800, 2004 (article)

ei

Web [BibTex]

Confidence Sets for Ratios: A Purely Geometric Approach To Fieller’s Theorem

von Luxburg, U., Franz, V.

(133), Max Planck Institute for Biological Cybernetics, 2004 (techreport)

Abstract
We present a simple, geometric method to construct Fieller's exact confidence sets for ratios of jointly normally distributed random variables. Contrary to previous geometric approaches in the literature, our method is valid in the general case where both sample mean and covariance are unknown. Moreover, not only the construction but also its proof are purely geometric and elementary, thus giving intuition into the nature of the confidence sets.

ei

PDF [BibTex]

Transductive Inference with Graphs

Zhou, D., Schölkopf, B.

Max Planck Institute for Biological Cybernetics, 2004, See the improved version Regularization on Discrete Spaces. (techreport)

Abstract
We propose a general regularization framework for transductive inference. The given data are thought of as a graph, where the edges encode the pairwise relationships among data. We develop discrete analysis and geometry on graphs, and then naturally adapt the classical regularization in the continuous case to the graph situation. A new and effective algorithm is derived from this general framework, as well as an approach we developed before.

ei

[BibTex]

Phenotypic Characterization of Human Chondrocyte Cell Line C-20/A4: A Comparison between Monolayer and Alginate Suspension Culture

Finger, F., Schorle, C., Söder, S., Zien, A., Goldring, M., Aigner, T.

Cells Tissues Organs, 178(2):65-77, 2004 (article)

Abstract
DNA microarray analysis was used to investigate the molecular phenotype of one of the first human chondrocyte cell lines, C-20/A4, derived from juvenile costal chondrocytes by immortalization with origin-defective simian virus 40 large T antigen. Clontech Human Cancer Arrays 1.2 and quantitative PCR were used to examine gene expression profiles of C-20/A4 cells cultured in the presence of serum in monolayer and alginate beads. In monolayer cultures, genes involved in cell proliferation were strongly upregulated compared to those expressed by human adult articular chondrocytes in primary culture. Of the cell cycle-regulated genes, only two, the CDK regulatory subunit and histone H4, were downregulated after culture in alginate beads, consistent with the ability of these cells to proliferate in suspension culture. In contrast, the expression of several genes that are involved in pericellular matrix formation, including MMP-14, COL6A1, fibronectin, biglycan and decorin, was upregulated when the C-20/A4 cells were transferred to suspension culture in alginate. Also, nexin-1, vimentin, and IGFBP-3, which are known to be expressed by primary chondrocytes, were differentially expressed in our study. Consistent with the proliferative phenotype of this cell line, few genes involved in matrix synthesis and turnover were highly expressed in the presence of serum. These results indicate that immortalized chondrocyte cell lines, rather than substituting for primary chondrocytes, may serve as models for extending findings on chondrocyte function not achievable by the use of primary chondrocytes.

ei

[BibTex]

Kernel Methods and their Potential Use in Signal Processing

Perez-Cruz, F., Bousquet, O.

IEEE Signal Processing Magazine, (Special issue on Signal Processing for Mining), 2004 (article) Accepted

ei

PostScript [BibTex]


2000


Phenomenological damping in optical response tensors

Buckingham, A., Fischer, P.

Physical Review A, 61(3), 2000 (article)

Abstract
Although perturbation theory applied to the optical response of a molecule or material system is only strictly valid far from resonances, it is often applied to “near-resonance” conditions by means of complex energies incorporating damping. Inconsistent signs of the damping in optical response tensors have appeared in the recent literature, as have errors in the treatment of the perturbation by a static field. The “equal-sign” convention used in a recent publication yields an unphysical material response, and Koroteev's intimation that linear electro-optical circular dichroism may exist in an optically active liquid under resonance conditions is also flawed. We show that the isotropic part of the Pockels tensor vanishes.

pf

DOI [BibTex]



Ab initio investigation of the sum-frequency hyperpolarizability of small chiral molecules

Champagne, B., Fischer, P., Buckingham, A.

Chemical Physics Letters, 331(1):83-88, 2000 (article)

Abstract
Using a sum-over-states procedure based on configuration interaction singles/6-311++G**, we have computed the sum-frequency hyperpolarizability β_ijk(-3ω; 2ω, ω) of two small chiral molecules, R-monofluoro-oxirane and R-(+)-propylene oxide. Excitation energies were scaled to fit experimental UV-absorption data and checked with ab initio values from time-dependent density functional theory. The isotropic part of the computed hyperpolarizabilities, β(-3ω; 2ω, ω), is much smaller than that reported previously from sum-frequency generation experiments on aqueous solutions of arabinose. Comparison is made with a single-centre chiral model.

pf

DOI [BibTex]


Three-wave mixing in chiral liquids

Fischer, P., Wiersma, D., Righini, R., Champagne, B., Buckingham, A.

Physical Review Letters, 85(20):4253-4256, 2000 (article)

Abstract
Second-order nonlinear optical frequency conversion in isotropic systems is only dipole-allowed for sum- and difference-frequency generation in chiral media. We develop a single-center chiral model of the three-wave mixing (sum-frequency generation) nonlinearity and estimate its magnitude. We also report results from ab initio calculations and from three- and four-wave mixing experiments in support of the theoretical estimates. We show that the second-order susceptibility in chiral liquids is much smaller than previously thought.

pf

DOI [BibTex]
