343 | IGPM343.pdf Juli 2012 |
TITLE | Classification Algorithms using Adaptive Partitioning |
AUTHORS | Peter Binev, Albert Cohen, Wolfgang Dahmen and Ronald DeVore |
ABSTRACT | Algorithms for binary classification based on adaptive partitioning are formulated and analyzed for both their risk performance and their friendliness to numerical implementation. The algorithms can be viewed as generating a set approximation to the Bayes set and thus fall into the general category of set estimators. A general theory is developed to analyze the risk performance of set estimators with the goal of guaranteeing performance with high probability rather than in expectation. This analysis decouples the approximation and variance effects on the risk. Bounds are given for the variance in terms of VC dimension and margin conditions by introducing a new modulus and studying its relation to margin conditions. Bounds are given for the approximation term based on the smoothness of the regression function and margin conditions. When these approximation results are used with the variance bounds, an estimate of risk performance is obtained. A simple model selection is used to optimally balance the approximation and variance bounds. This general theory is then applied to the adaptive algorithms and results are formulated for the risk performance of these algorithms in terms of Besov smoothness of the regression function and margin conditions. The results of this paper are related to the work of Scott and Nowak [14] on tree based adaptive methods for classification, however with several important distinctions. In particular, our model selection utilizes a validation sample to avoid identifying suitable penalty terms. This allows us to employ wedge decorated trees that yield higher order performance. |
KEYWORDS | binary classification, adaptive methods, set estimators, tree based algorithms, analysis of risk |
DOI | 10.1214/14-AOS1234 |
PUBLICATION | The Annals of Statistics Volume 42, Number 6 (2014), pp 2141-2163 |