Header logo is


2011


no image
Efficient Similarity Search for Covariance Matrices via the Jensen-Bregman LogDet Divergence

Cherian, A., Sra, S., Banerjee, A., Papanikolopoulos, N.

In IEEE International Conference on Computer Vision, ICCV 2011, pages: 2399-2406, (Editors: DN Metaxas and L Quan and A Sanfeliu and LJ Van Gool), IEEE, 13th International Conference on Computer Vision (ICCV), 2011 (inproceedings)

ei

DOI [BibTex]

2011


DOI [BibTex]


no image
Introducing the detection of auditory error responses based on BCI technology for passive interaction

Zander, TO., Klippel, DM., Scherer, R.

In Proceedings of the 5th International Brain–Computer Interface Conference, pages: 252-255, (Editors: GR Müller-Putz and R Scherer and M Billinger and A Kreilinger and V Kaiser and C Neuper), Graz: Verlag der Technischen Universität, 2011 (inproceedings)

ei

[BibTex]

[BibTex]


no image
Generalized Dictionary Learning for Symmetric Positive Definite Matrices with Application to Nearest Neighbor Retrieval

Sra, S., Cherian, A.

In Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2011, LNCS vol 6913, Part III, pages: 318-332, (Editors: D Gunopulos and T Hofmann and D Malerba and M Vazirgiannis), Springer, 22th European Conference on Machine Learning (ECML), 2011 (inproceedings)

ei

DOI [BibTex]

DOI [BibTex]


no image
Restricted boltzmann machines as useful tool for detecting oscillatory eeg components

Balderas, D., Zander, TO., Bachl, F., Neuper, C., Scherer, R.

In Proceedings of the 5th International Brain–Computer Interface Conference, pages: 68-71, (Editors: GR Müller-Putz and R Scherer and M Billinger and A Kkreilinger and V Kaiser and C Neuper), Graz: Verlag der Technischen Universität, 2011 (inproceedings)

ei

[BibTex]

[BibTex]


no image
Hierarchical Multitask Structured Output Learning for Large-scale Sequence Segmentation

Görnitz, N., Widmer, C., Zeller, G., Kahles, A., Sonnenburg, S., Rätsch, G.

In Advances in Neural Information Processing Systems 24, pages: 2690-2698, (Editors: J Shawe-Taylor and RS Zemel and P Bartlett and FCN Pereira and KQ Weinberger), Curran Associates, Inc., Red Hook, NY, USA, Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS), 2011 (inproceedings)

ei

PDF [BibTex]

PDF [BibTex]


no image
Phase transition in the family of p-resistances

Alamgir, M., von Luxburg, U.

In Advances in Neural Information Processing Systems 24, pages: 379-387, (Editors: J Shawe-Taylor and RS Zemel and P Bartlett and F Pereira and KQ Weinberger), Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS), 2011 (inproceedings)

Abstract
We study the family of p-resistances on graphs for p ≥ 1. This family generalizes the standard resistance distance. We prove that for any fixed graph, for p=1, the p-resistance coincides with the shortest path distance, for p=2 it coincides with the standard resistance distance, and for p → ∞ it converges to the inverse of the minimal s-t-cut in the graph. Secondly, we consider the special case of random geometric graphs (such as k-nearest neighbor graphs) when the number n of vertices in the graph tends to infinity. We prove that an interesting phase-transition takes place. There exist two critical thresholds p^* and p^** such that if p < p^*, then the p-resistance depends on meaningful global properties of the graph, whereas if p > p^**, it only depends on trivial local quantities and does not convey any useful information. We can explicitly compute the critical values: p^* = 1 + 1/(d-1) and p^** = 1 + 1/(d-2) where d is the dimension of the underlying space (we believe that the fact that there is a small gap between p^* and p^** is an artifact of our proofs. We also relate our findings to Laplacian regularization and suggest to use q-Laplacians as regularizers, where q satisfies 1/p^* + 1/q = 1.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
On Fast Approximate Submodular Minimization

Jegelka, S., Lin, H., Bilmes, J.

In Advances in Neural Information Processing Systems 24, pages: 460-468, (Editors: J Shawe-Taylor and RS Zemel and P Bartlett and F Pereira and KQ Weinberger), Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS), 2011 (inproceedings)

Abstract
We are motivated by an application to extract a representative subset of machine learning training data and by the poor empirical performance we observe of the popular minimum norm algorithm. In fact, for our application, minimum norm can have a running time of about O(n7) (O(n5) oracle calls). We therefore propose a fast approximate method to minimize arbitrary submodular functions. For a large sub-class of submodular functions, the algorithm is exact. Other submodular functions are iteratively approximated by tight submodular upper bounds, and then repeatedly optimized. We show theoretical properties, and empirical results suggest significant speedups over minimum norm while retaining higher accuracies.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
PAC-Bayesian Analysis of Contextual Bandits

Seldin, Y., Auer, P., Laviolette, F., Shawe-Taylor, J., Ortner, R.

In Advances in Neural Information Processing Systems 24, pages: 1683-1691, (Editors: J Shawe-Taylor and RS Zemel and P Bartlett and F Pereira and KQ Weinberger), Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS), 2011 (inproceedings)

Abstract
We derive an instantaneous (per-round) data-dependent regret bound for stochastic multiarmed bandits with side information (also known as contextual bandits). The scaling of our regret bound with the number of states (contexts) $N$ goes as $\sqrt{N I_{\rho_t}(S;A)}$, where $I_{\rho_t}(S;A)$ is the mutual information between states and actions (the side information) used by the algorithm at round $t$. If the algorithm uses all the side information, the regret bound scales as $\sqrt{N \ln K}$, where $K$ is the number of actions (arms). However, if the side information $I_{\rho_t}(S;A)$ is not fully used, the regret bound is significantly tighter. In the extreme case, when $I_{\rho_t}(S;A) = 0$, the dependence on the number of states reduces from linear to logarithmic. Our analysis allows to provide the algorithm large amount of side information, let the algorithm to decide which side information is relevant for the task, and penalize the algorithm only for the side information that it is using de facto. We also present an algorithm for multiarmed bandits with side information with computational complexity that is a linear in the number of actions.

ei

PDF PDF Web [BibTex]

PDF PDF Web [BibTex]


no image
Fast projections onto L1,q-norm balls for grouped feature selection

Sra, S.

In Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2011, LNCS vol 6913, Part III, pages: 305-317, (Editors: D Gunopulos and T Hofmann and D Malerba and M Vazirgiannis), Springer, 22th European Conference on Machine Learning (ECML), 2011 (inproceedings)

ei

DOI [BibTex]

DOI [BibTex]


no image
Kernel Belief Propagation

Song, L., Gretton, A., Bickson, D., Low, Y., Guestrin, C.

In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics, Vol. 15, pages: 707-715, (Editors: G Gordon and D Dunson and M Dudík), JMLR, AISTATS, 2011 (inproceedings)

ei

PDF [BibTex]

PDF [BibTex]


no image
On Causal Discovery with Cyclic Additive Noise Models

Mooij, J., Janzing, D., Schölkopf, B., Heskes, T.

In Advances in Neural Information Processing Systems 24, pages: 639-647, (Editors: J Shawe-Taylor and RS Zemel and PL Bartlett and FCN Pereira and KQ Weinberger), Curran Associates, Inc., Red Hook, NY, USA, Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS), 2011 (inproceedings)

Abstract
We study a particular class of cyclic causal models, where each variable is a (possibly nonlinear) function of its parents and additive noise. We prove that the causal graph of such models is generically identifiable in the bivariate, Gaussian-noise case. We also propose a method to learn such models from observational data. In the acyclic case, the method reduces to ordinary regression, but in the more challenging cyclic case, an additional term arises in the loss function, which makes it a special case of nonlinear independent component analysis. We illustrate the proposed method on synthetic data.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Additive Gaussian Processes

Duvenaud, D., Nickisch, H., Rasmussen, C.

In Advances in Neural Information Processing Systems 24, pages: 226-234, (Editors: J Shawe-Taylor and RS Zemel and P Bartlett and F Pereira and KQ Weinberger), Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS), 2011 (inproceedings)

Abstract
We introduce a Gaussian process model of functions which are additive. An additive function is one which decomposes into a sum of low-dimensional functions, each depending on only a subset of the input variables. Additive GPs generalize both Generalized Additive Models, and the standard GP models which use squared-exponential kernels. Hyperparameter learning in this model can be seen as Bayesian Hierarchical Kernel Learning (HKL). We introduce an expressive but tractable parameterization of the kernel function, which allows efficient evaluation of all input interaction terms, whose number is exponential in the input dimension. The additional structure discoverable by this model results in increased interpretability, as well as state-of-the-art predictive power in regression tasks.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
k-NN Regression Adapts to Local Intrinsic Dimension

Kpotufe, S.

In Advances in Neural Information Processing Systems 24, pages: 729-737, (Editors: J Shawe-Taylor and RS Zemel and P Bartlett and F Pereira and KQ Weinberger), Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS), 2011 (inproceedings)

Abstract
Many nonparametric regressors were recently shown to converge at rates that depend only on the intrinsic dimension of data. These regressors thus escape the curse of dimension when high-dimensional data has low intrinsic dimension (e.g. a manifold). We show that k-NN regression is also adaptive to intrinsic dimension. In particular our rates are local to a query x and depend only on the way masses of balls centered at x vary with radius. Furthermore, we show a simple way to choose k = k(x) locally at any x so as to nearly achieve the minimax rate at x in terms of the unknown intrinsic dimension in the vicinity of x. We also establish that the minimax rate does not depend on a particular choice of metric space or distribution, but rather that this minimax rate holds for any metric space and doubling measure.

ei

PDF Web [BibTex]

PDF Web [BibTex]


no image
Fast Newton-type Methods for Total-Variation with Applications

Barbero, A., Sra, S.

In Proceedings of the 28th International Conference on Machine Learning, ICML 2011, pages: 313-320, (Editors: L Getoor and T Scheffer), Omnipress, 28th International Conference on Machine Learning (ICML), 2011 (inproceedings)

ei

[BibTex]

[BibTex]


no image
Parallel Gibbs Sampling: From Colored Fields to Thin Junction Trees

Gonzalez, J., Low, Y., Gretton, A., Guestrin, C.

In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics, Vol. 15, pages: 324-332, (Editors: G Gordon and D Dunson and M Dudík), JMLR, AISTATS, 2011 (inproceedings)

ei

PDF [BibTex]

PDF [BibTex]


Thumb xl problem
Recovering Intrinsic Images with a Global Sparsity Prior on Reflectance

Gehler, P., Rother, C., Kiefel, M., Zhang, L., Schölkopf, B.

In Advances in Neural Information Processing Systems 24, pages: 765-773, (Editors: Shawe-Taylor, John and Zemel, Richard S. and Bartlett, Peter L. and Pereira, Fernando C. N. and Weinberger, Kilian Q.), Curran Associates, Inc., Red Hook, NY, USA, Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS), 2011 (inproceedings)

Abstract
We address the challenging task of decoupling material properties from lighting properties given a single image. In the last two decades virtually all works have concentrated on exploiting edge information to address this problem. We take a different route by introducing a new prior on reflectance, that models reflectance values as being drawn from a sparse set of basis colors. This results in a Random Field model with global, latent variables (basis colors) and pixel-accurate output reflectance values. We show that without edge information high-quality results can be achieved, that are on par with methods exploiting this source of information. Finally, we are able to improve on state-of-the-art results by integrating edge information into our model. We believe that our new approach is an excellent starting point for future developments in this field.

ei ps

website + code pdf poster Project Page Project Page [BibTex]

website + code pdf poster Project Page Project Page [BibTex]

1996


no image
Incorporating invariances in support vector learning machines

Schölkopf, B., Burges, C., Vapnik, V.

In Artificial Neural Networks: ICANN 96, LNCS vol. 1112, pages: 47-52, (Editors: C von der Malsburg and W von Seelen and JC Vorbrüggen and B Sendhoff), Springer, Berlin, Germany, 6th International Conference on Artificial Neural Networks, July 1996, volume 1112 of Lecture Notes in Computer Science (inproceedings)

Abstract
Developed only recently, support vector learning machines achieve high generalization ability by minimizing a bound on the expected test error; however, so far there existed no way of adding knowledge about invariances of a classification problem at hand. We present a method of incorporating prior knowledge about transformation invariances by applying transformations to support vectors, the training examples most critical for determining the classification boundary.

ei

PDF DOI [BibTex]

1996


PDF DOI [BibTex]