RSVM: Reduced Support Vector Machines
Abstract
An algorithm is proposed which generates a nonlinear kernel-based
separating surface that requires as little as 1% of a large dataset for its explicit
evaluation. To generate this nonlinear surface, the entire dataset is used as a constraint
in an optimization problem with very few variables corresponding to the 1%
of the data kept. The remainder of the data can be thrown away after solving the
optimization problem. This is achieved by making use of a rectangular $m \times \bar{m}$ kernel
$K(A, \bar{A}')$ that greatly reduces the size of the quadratic program to be solved and
simplifies the characterization of the nonlinear separating surface. Here, the $m$ rows
of $A$ represent the original $m$ data points while the $\bar{m}$ rows of $\bar{A}$ represent a greatly
reduced set of $\bar{m}$ data points. Computational results indicate that test set correctness for
the reduced support vector machine (RSVM), with a nonlinear separating surface
that depends on a small randomly selected portion of the dataset, is better than
that of a conventional support vector machine (SVM) with a nonlinear surface that
explicitly depends on the entire dataset, and much better than a conventional SVM
using a small random sample of the data. Computational times, as well as memory
usage, are much smaller for RSVM than for a conventional SVM using the entire
dataset.
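As a rough illustration of the rectangular kernel idea described above, the following minimal sketch builds a kernel between a full dataset and a randomly selected 1% subset of its rows. The Gaussian kernel, the parameter gamma, the synthetic data, and all function and variable names are assumptions made for illustration; the abstract does not specify them. The optimization step that would produce the weights u_bar and the offset b is not shown.

```python
import math
import random


def gaussian_kernel(A, B, gamma=1.0):
    """Return the len(A) x len(B) matrix with entries exp(-gamma * ||a - b||^2).

    The Gaussian form is an assumed kernel choice, not taken from the abstract.
    """
    return [
        [math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b))) for b in B]
        for a in A
    ]


random.seed(0)
m, n = 1000, 5                       # synthetic data: m points in n dimensions
A = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]

m_bar = max(1, m // 100)             # keep about 1% of the rows
idx = random.sample(range(m), m_bar) # random row indices, without replacement
A_bar = [A[i] for i in idx]          # reduced set of m_bar data points

# Rectangular m x m_bar kernel: every data point is compared only
# against the reduced set, instead of forming the full m x m kernel.
K = gaussian_kernel(A, A_bar)


def surface_value(x, A_bar, u_bar, b, gamma=1.0):
    """Evaluate a kernel surface at x using only the reduced set A_bar.

    u_bar (one weight per reduced point) and b are hypothetical placeholders
    for the solution of the optimization problem, which is not shown here.
    """
    k_row = gaussian_kernel([x], A_bar, gamma)[0]
    return sum(k * u for k, u in zip(k_row, u_bar)) - b
```

Once weights are available, evaluating the surface at a new point needs only the m_bar rows in A_bar, which is why the rest of the data can be discarded after training.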
Subject
support vector machines
Permanent Link
http://digital.library.wisc.edu/1793/64290
Citation
00-07