Improved Design of Quadratic Discriminant Analysis Classifier in Unbalanced Settings
Permanent link to this record: http://hdl.handle.net/10754/662639
Abstract: The use of quadratic discriminant analysis (QDA) or its regularized version (R-QDA) for classification is often not recommended, due to its well-acknowledged high sensitivity to the estimation noise of the covariance matrix. This is all the more true in unbalanced data settings, for which it has been found that R-QDA becomes equivalent to the classifier that assigns all observations to the same class. In this paper, we propose an improved R-QDA that is based on the use of two regularization parameters and a modified bias, properly chosen to avoid inappropriate behaviors of R-QDA in unbalanced settings and to ensure the best possible classification performance. The design of the proposed classifier builds on a refined asymptotic analysis of its performance when the number of samples and that of features grow large simultaneously, which makes it possible to cope efficiently with the high dimensionality frequently met within the big data paradigm. The performance of the proposed classifier is assessed on both real and synthetic data sets and is shown to be much higher than what one would expect from a traditional R-QDA.
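To make the idea of "two regularization parameters and a modified bias" concrete, the following is a minimal sketch of a two-class regularized quadratic discriminant rule. It is not the authors' classifier. The ridge-style regularization `S + gamma * I`, the function names `fit_class` and `rqda_score`, and the free `bias` offset are all assumptions chosen for illustration; the abstract does not specify the exact form of the regularization or how the parameters are selected.

```python
import numpy as np

def fit_class(X, gamma):
    """Estimate one class's mean and regularized covariance.

    ASSUMPTION: the regularized covariance is S + gamma * I (ridge style);
    the paper may use a different parameterization.
    """
    mu = X.mean(axis=0)
    S = np.cov(X, rowvar=False)                 # p x p sample covariance
    S_reg = S + gamma * np.eye(X.shape[1])      # per-class regularization
    _, logdet = np.linalg.slogdet(S_reg)        # log-determinant of S_reg
    return mu, np.linalg.inv(S_reg), logdet

def rqda_score(X, params0, params1, bias=0.0):
    """Quadratic discriminant score; a positive score means class 0.

    ASSUMPTION: `bias` is a hypothetical free offset standing in for the
    'modified bias' of the abstract (it would absorb the log prior ratio).
    """
    mu0, P0, ld0 = params0
    mu1, P1, ld1 = params1
    d0, d1 = X - mu0, X - mu1
    q0 = np.einsum("ij,jk,ik->i", d0, P0, d0)   # squared Mahalanobis distance to class 0
    q1 = np.einsum("ij,jk,ik->i", d1, P1, d1)   # squared Mahalanobis distance to class 1
    return 0.5 * (ld1 - ld0) - 0.5 * q0 + 0.5 * q1 - bias

def predict(X, params0, params1, bias=0.0):
    """Return 0 where the score is positive, 1 otherwise."""
    return np.where(rqda_score(X, params0, params1, bias) > 0, 0, 1)
```

In this sketch each class gets its own regularization parameter (passed as `gamma` to `fit_class`), and `bias` shifts the decision threshold. Pushing `bias` to an extreme value reproduces the degenerate behavior the abstract warns about, where every observation lands in the same class.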