Large Scale Kernel Regression via Linear Programming
Abstract
The problem of tolerant data tting by a nonlinear surface, in-
duced by a kernel-based support vector machine [24], is formulated as
a linear program with fewer number of variables than that of other
linear programming formulations [21]. A generalization of the lin-
ear programming chunking algorithm [2] for arbitrary kernels [13] is
implemented for solving problems with very large datasets wherein
chunking is performed on both data points and problem variables. The
proposed approach tolerates a small error, which is adjusted paramet-
rically, while tting the given data. This leads to improved tting of
noisy data (over ordinary least error solutions) as demonstrated com-
putationally. Comparative numerical results indicate an average time
reduction as high as 26.0% over other formulations, with a maximal
time reduction of 79.7%. Additionally, linear programs with as many
as 16,000 data points and more than a billion nonzero matrix elements
are solved.
Subject
linear programming
support vector machines
kernel regression
Permanent Link
http://digital.library.wisc.edu/1793/64272Citation
99-02