# Statistical Learning and Kernel Methods in Bioinformatics

@inproceedings{Schlkopf2003StatisticalLA, title={Statistical Learning and Kernel Methods in Bioinformatics}, author={Bernhard Sch{\"o}lkopf and Isabelle Guyon and Jason Weston and Paolo Frasconi and Ron Shamir}, year={2003} }

We briefly describe the main ideas of statistical learning theory, support vector machines, and kernel feature spaces. In addition, we present an overview of applications of kernel methods in bioinformatics.

1 An Introductory Example

In this section, we formalize the problem of pattern recognition as that of classifying objects, called “patterns”, into one of two classes. We introduce a simple pattern recognition algorithm that illustrates the mechanism of kernel methods. Suppose we are given…
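The excerpt is cut off before the algorithm itself is stated, so the following is only an illustrative sketch and not necessarily the authors' method. It assumes a nearest-mean classifier in kernel feature space, a common introductory example of how kernel methods work: a new pattern is assigned to the class whose mean in feature space is closer, and everything is computed through kernel evaluations alone. The names `rbf_kernel`, `mean_kernel`, and `make_nearest_mean_classifier`, the Gaussian kernel, and the toy data are all hypothetical choices made for this example.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel; an assumed choice, any positive definite kernel works."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs, zs, kernel):
    """Average kernel value over all pairs (x, z) with x in xs and z in zs."""
    total = sum(kernel(x, z) for x in xs for z in zs)
    return total / (len(xs) * len(zs))


def make_nearest_mean_classifier(pos, neg, kernel=rbf_kernel):
    """Build a classifier that assigns x to the class with the closer feature-space mean.

    Expanding the squared distances to the two class means leaves only
    kernel evaluations, so the feature map is never computed explicitly.
    """
    # Offset: half the difference of the squared norms of the two class means.
    b = 0.5 * (mean_kernel(neg, neg, kernel) - mean_kernel(pos, pos, kernel))

    def classify(x):
        score = mean_kernel([x], pos, kernel) - mean_kernel([x], neg, kernel) + b
        return 1 if score >= 0 else -1

    return classify


# Hypothetical toy data: two small clusters in the plane.
pos = [(1.0, 1.0), (1.2, 0.8)]
neg = [(-1.0, -1.0), (-0.8, -1.2)]
classify = make_nearest_mean_classifier(pos, neg)

print(classify((1.0, 0.9)))
print(classify((-1.0, -0.9)))
```

Swapping `rbf_kernel` for another kernel changes the feature space in which the means are compared while leaving the rest of the code untouched, which is the mechanism kernel methods rely on.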

#### 55 Citations

LivingKnowledge: Kernel Methods for Relational Learning and Semantic Modeling

- Computer Science
- ISoLA
- 2010

Kernel methods (KM), one of the most interesting results of statistical learning theory, are described as a way to abstract and simplify system design, together with an example of the effective use of KM in designing a natural language application required by the European project LivingKnowledge.

Machine learning, statistical learning and the future of biological research in psychiatry

- Medicine, Computer Science
- Psychological Medicine
- 2016

This review first introduces the concept of Big Data in different environments, then describes how modern statistical learning models can be used in practice on Big Datasets to extract relevant information and discusses the strengths of using statistical learning in psychiatric studies.

Online Kernel Selection: Algorithms and Evaluations

- Computer Science
- AAAI
- 2012

This paper proposes an efficient online kernel selection algorithm that incrementally learns a weight for each kernel classifier and has a theoretically guaranteed performance compared to the best kernel predictor.

Gene feature selection

- 2004

This chapter presents an overview of the classes of methods available for feature selection, paying special attention to the problems typical of microarray data processing, where the number of…

Empirical analysis of Bayesian kernel methods for modeling count data

- Mathematics
- 2014 Systems and Information Engineering Design Symposium (SIEDS)
- 2014

Bayesian models are used for estimation and forecasting in a wide range of application areas. One extension of such methods is the Bayesian kernel model, which integrates the Bayesian conjugate prior…

Regularization as a Toolkit for Parsimonious Modeling in Bioinformatics

- 2008

With the tremendous progress in bioinformatics, we are facing new challenges where complex data sets need to be analyzed to glean meaningful scientific knowledge. Though statistical tools were slow to…

SVM-based negative data mining to binary classification

- Computer Science
- 2006

The two approaches proposed in this paper for binary classification enhance the useful information in the data by mining negative data: they create one or two additional hypotheses, audit and booster, to mine the negative examples output by the learner.

Systematic benchmarking of microarray data classification: assessing the role of non-linearity and dimensionality reduction

- Mathematics, Computer Science
- Bioinform.
- 2004

A systematic benchmarking study comparing linear versions of standard classification and dimensionality reduction techniques with their non-linear versions, based on non-linear kernel functions with a radial basis function (RBF) kernel, finds that kernel PCA with a linear kernel gives better results.

Spot Detection and Image Segmentation in DNA Microarray Data

- Biology, Medicine
- Applied bioinformatics
- 2005

An overview of state-of-the-art methods for microarray image segmentation is presented, discussing the foundations of the circle-shaped approach, adaptive shape segmentation, histogram-based methods and the recently introduced clustering-based techniques.

Development of Two-Stage SVM-RFE Gene Selection Strategy for Microarray Expression Data Analysis

- Medicine, Computer Science
- IEEE/ACM Transactions on Computational Biology and Bioinformatics
- 2007

It is demonstrated that the two-stage SVM-RFE is significantly more accurate and more reliable than standard SVM-RFE (SVM Recursive Feature Elimination) and three correlation-based methods, based on the analysis of three publicly available microarray expression datasets.

#### References

SHOWING 1-10 OF 56 REFERENCES

Dynamic Alignment Kernels

- 1999

There is much current interest in kernel methods for classification, regression, PCA, and other linear methods of data analysis. Kernel methods may be particularly valuable for problems in which the…

Support vector learning

- Computer Science
- 1997

This book provides a comprehensive analysis of what can be done using Support Vector Machines, achieving record results in real-life pattern recognition problems, and proposes a new form of nonlinear Principal Component Analysis using Support Vector kernel techniques, which it considers the most natural and elegant generalization of classical Principal Component Analysis.

An introduction to Support Vector Machines

- Computer Science
- 2000

This book is the first comprehensive introduction to Support Vector Machines (SVMs), a new generation learning system based on recent advances in statistical learning theory. The book also introduces…

Generalization performance of regularization networks and support vector machines via entropy numbers of compact operators

- Mathematics, Computer Science
- IEEE Trans. Inf. Theory
- 2001

New bounds for the generalization error of kernel machines, such as support vector machines and related regularization networks, are derived by obtaining new bounds on their covering numbers, using the eigenvalues of an integral operator induced by the kernel function used by the machine.

Dimensionality Reduction via Sparse Support Vector Machines

- Mathematics, Computer Science
- J. Mach. Learn. Res.
- 2003

The method constructs a series of sparse linear SVMs to generate linear models that can generalize well, and uses a subset of nonzero weighted variables found by the linear models to produce a final nonlinear model.

Support vector machine classification and validation of cancer tissue samples using microarray expression data

- Computer Science, Biology
- Bioinform.
- 2000

A new method is presented for analysing tissue samples with support vector machines, including mis-labeled or questionable tissue results; the study also shows that other machine learning methods perform comparably to the SVM on many of those datasets.

Learning Gene Functional Classifications from Multiple Data Types

- Biology, Computer Science
- J. Comput. Biol.
- 2002

This work considers the problem of inferring gene functional classifications from a heterogeneous data set consisting of DNA microarray expression measurements and phylogenetic profiles from whole-genome sequence comparisons and proposes an SVM kernel function that is explicitly heterogeneous.

An improved training algorithm for support vector machines

- Computer Science
- Neural Networks for Signal Processing VII. Proceedings of the 1997 IEEE Signal Processing Society Workshop
- 1997

This paper presents a decomposition algorithm that is guaranteed to solve the QP problem and that does not make assumptions on the expected number of support vectors.

Convolution kernels on discrete structures

- Computer Science
- 1999

We introduce a new method of constructing kernels on sets whose elements are discrete structures like strings, trees and graphs. The method can be applied iteratively to build a kernel on an infinite…

The Spectrum Kernel: A String Kernel for SVM Protein Classification

- Mathematics, Computer Science
- Pacific Symposium on Biocomputing
- 2002

A new sequence-similarity kernel, the spectrum kernel, is introduced for use with support vector machines (SVMs) in a discriminative approach to the protein classification problem and performs well in comparison with state-of-the-art methods for homology detection.