Gradients Weights Improve Regression and Classification

Samory Kpotufe, Abdeslam Boularias, Thomas Schultz, and Kyoungok Kim
In: Journal of Machine Learning Research (2016), 17:22(1-34)
 

Abstract

In regression problems over $\mathbb{R}^d$, the unknown function $f$ often varies more in some coordinates than in others. We show that weighting each coordinate $i$ according to an estimate of the variation of $f$ along coordinate $i$ – e.g. the $L_1$ norm of the $i$th-directional derivative of $f$ – is an efficient way to significantly improve the performance of distance-based regressors such as kernel and $k$-NN regressors. The approach, termed Gradient Weighting (GW), consists of a first-pass regression estimate $f_n$, which serves to evaluate the directional derivatives of $f$, and a second-pass regression estimate on the re-weighted data. The GW approach can be instantiated for both regression and classification, and is grounded in strong theoretical principles having to do with the way regression bias and variance are affected by a generic feature-weighting scheme. These theoretical principles provide further technical foundation for some existing feature-weighting heuristics that have proved successful in practice. We propose a simple estimator of these derivative norms and prove its consistency. The proposed estimator computes efficiently and easily extends to run online. We then derive a classification version of the GW approach which evaluates on real-world datasets with as much success as its regression counterpart.
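The two-pass idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain k-NN regressor for both passes and a central finite difference with a hand-picked step `t` as the derivative estimate, neither of which the abstract specifies. The function names (`knn_predict`, `gradient_weights`, `gw_knn_predict`) and the defaults for `k` and `t` are hypothetical.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k=5):
    """Plain k-NN regression: average the targets of the k nearest training points."""
    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)

def gradient_weights(X, y, k=5, t=0.1):
    """Estimate the average |df/dx_i| for each coordinate i from a first-pass k-NN fit.

    Assumption: a central finite difference with step t stands in for the
    derivative estimator; the abstract does not give the exact form.
    """
    n, d = X.shape
    weights = np.empty(d)
    for i in range(d):
        step = np.zeros(d)
        step[i] = t
        # First-pass estimate f_n evaluated at X + t*e_i and X - t*e_i.
        diff = knn_predict(X, y, X + step, k) - knn_predict(X, y, X - step, k)
        weights[i] = np.abs(diff).mean() / (2 * t)
    return weights

def gw_knn_predict(X_train, y_train, X_query, k=5, t=0.1):
    """Second pass: k-NN regression after rescaling each coordinate by its weight."""
    w = gradient_weights(X_train, y_train, k, t)
    return knn_predict(X_train * w, y_train, X_query * w, k)
```

Rescaling coordinate $i$ by its weight stretches the directions in which $f$ varies most, so the distance used in the second pass is dominated by those coordinates, while coordinates with near-zero weight contribute little to neighbor selection.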

Bibtex

@ARTICLE{KpotufeJMLR2016,
    author = {Kpotufe, Samory and Boularias, Abdeslam and Schultz, Thomas and Kim, Kyoungok},
     pages = {1--34},
     title = {Gradients Weights Improve Regression and Classification},
   journal = {Journal of Machine Learning Research},
    volume = {17},
    number = {22},
      year = {2016},
  abstract = {In regression problems over $\mathbb{R}^d$, the unknown function $f$ often varies more in some
              coordinates than in others. We show that weighting each coordinate $i$ according to an estimate of
              the variation of $f$ along coordinate $i$ -- e.g. the $L_1$ norm of the $i$th-directional derivative
              of $f$ -- is an efficient way to significantly improve the performance of distance-based regressors
              such as kernel and $k$-NN regressors. The approach, termed Gradient Weighting (GW), consists of a
              first-pass regression estimate $f_n$, which serves to evaluate the directional derivatives of $f$,
              and a second-pass regression estimate on the re-weighted data. The GW approach can be instantiated
              for both regression and classification, and is grounded in strong theoretical principles having to
              do with the way regression bias and variance are affected by a generic feature-weighting scheme.
              These theoretical principles provide further technical foundation for some existing
              feature-weighting heuristics that have proved successful in practice. We propose a simple estimator
              of these derivative norms and prove its consistency. The proposed estimator computes efficiently and
              easily extends to run online. We then derive a classification version of the GW approach which
              evaluates on real-world datasets with as much success as its regression counterpart.}
}