Unexpected properties of bandwidth choice when smoothing discrete data for constructing a functional data classifier
KAUST Grant Number: KUS-CI-016-04
Abstract: The data functions that are studied in the course of functional data analysis are assembled from discrete data, and the level of smoothing that is used is generally that which is appropriate for accurate approximation of the conceptually smooth functions that were not actually observed. Existing literature shows that this approach is effective, and even optimal, when using functional data methods for prediction or hypothesis testing. However, in the present paper we show that this approach is not effective in classification problems. There a useful rule of thumb is that undersmoothing is often desirable, but there are several surprising qualifications to that approach. First, the effect of smoothing the training data can be more significant than that of smoothing the new data set to be classified; second, undersmoothing is not always the right approach, and in fact in some cases using a relatively large bandwidth can be more effective; and third, these perverse results are the consequence of very unusual properties of error rates, expressed as functions of smoothing parameters. For example, the orders of magnitude of optimal smoothing parameter choices depend on the signs and sizes of terms in an expansion of error rate, and those signs and sizes can vary dramatically from one setting to another, even for the same classifier.
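As a rough illustration of the question the abstract raises, the sketch below smooths noisy discrete observations with one bandwidth for the training curves and a separate bandwidth for the new curves, then measures the error rate of a simple classifier on simulated data. It is not the authors' procedure: the simulated two-class model, the Gaussian-kernel Nadaraya-Watson smoother, the nearest-centroid rule, and every function and parameter name (`nw_smooth`, `error_rate`, `h_train`, `h_new`, and so on) are assumptions made for illustration only.

```python
import math
import random


def nw_smooth(t_obs, y_obs, t_grid, h):
    """Nadaraya-Watson estimate on t_grid, Gaussian kernel, bandwidth h."""
    fitted = []
    for t in t_grid:
        weights = [math.exp(-0.5 * ((t - s) / h) ** 2) for s in t_obs]
        total = sum(weights)
        fitted.append(sum(w * y for w, y in zip(weights, y_obs)) / total)
    return fitted


def simulate_curve(label, t_obs, rng, noise_sd=0.5):
    """Hypothetical model: random-amplitude sine plus a class-specific
    linear drift, observed with additive Gaussian noise."""
    shift = 0.4 if label == 1 else 0.0
    amp = rng.gauss(1.0, 0.2)
    return [amp * math.sin(2 * math.pi * t) + shift * t + rng.gauss(0.0, noise_sd)
            for t in t_obs]


def centroid(curves):
    """Pointwise mean of a list of equal-length curves."""
    n = len(curves)
    return [sum(c[j] for c in curves) / n for j in range(len(curves[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two discretised curves."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def error_rate(h_train, h_new, n_train=30, n_new=200, n_points=25, seed=0):
    """Misclassification rate of a nearest-centroid rule when training
    curves are smoothed with h_train and new curves with h_new."""
    rng = random.Random(seed)
    t_obs = [j / (n_points - 1) for j in range(n_points)]

    # Smooth the training curves and average them within each class.
    centroids = {}
    for label in (0, 1):
        smoothed = [nw_smooth(t_obs, simulate_curve(label, t_obs, rng), t_obs, h_train)
                    for _ in range(n_train)]
        centroids[label] = centroid(smoothed)

    # Smooth each new curve with its own bandwidth, then classify it.
    errors = 0
    for i in range(n_new):
        label = i % 2
        x = nw_smooth(t_obs, simulate_curve(label, t_obs, rng), t_obs, h_new)
        guess = min((0, 1), key=lambda k: sq_dist(x, centroids[k]))
        errors += guess != label
    return errors / n_new


if __name__ == "__main__":
    # Print the simulated error rate over a small grid of bandwidth pairs.
    for h_train in (0.02, 0.1, 0.3):
        for h_new in (0.02, 0.1, 0.3):
            print(h_train, h_new, error_rate(h_train, h_new))
```

Keeping `h_train` and `h_new` as separate arguments makes it possible to vary the smoothing of the training data and of the new data independently. Any pattern seen in this toy simulation depends on the assumed model and should not be read as a reproduction of the paper's results.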
Citation: Carroll RJ, Delaigle A, Hall P (2013) Unexpected properties of bandwidth choice when smoothing discrete data for constructing a functional data classifier. The Annals of Statistics 41: 2739–2767. Available: http://dx.doi.org/10.1214/13-AOS1158.
Sponsors: Supported by a Grant from the National Cancer Institute (R37-CA057030). This publication is based in part on work supported by Award Number KUS-CI-016-04, made by King Abdullah University of Science and Technology (KAUST). Supported by grants and fellowships from the Australian Research Council.
Publisher: Institute of Mathematical Statistics
Journal: The Annals of Statistics
PubMed Central ID: PMC4191932
Collections: Publications Acknowledging KAUST Support
- Double-smoothing in kernel hazard rate estimation.
- Authors: Weissbach R, Pfahlberg A, Gefeller O
- Issue date: 2008
- Evaluation of threshold selection methods for adaptive kernel density estimation in disease mapping.
- Authors: Ruckthongsook W, Tiwari C, Oppong JR, Natesan P
- Issue date: 2018 May 8
- Estimation of the maximum flow-mediated brachial artery response using local regression methods.
- Authors: Andrew ME, Li S, Fekedulegn D, Dorn J, Joseph PN, Violanti J, Burchfiel CM
- Issue date: 2007 Oct
- Comparison of bandwidth selection methods for kernel smoothing of ROC curves.
- Authors: Zhou XH, Harezlak J
- Issue date: 2002 Jul 30
- Comparing Smoothing Techniques for Fitting the Nonlinear Effect of Covariate in Cox Models.
- Authors: Roshani D, Ghaderi E
- Issue date: 2016 Feb