Please use this identifier to cite or link to this item:
http://hdl.handle.net/10077/4002
| Title: | Training and assessing classification rules with unbalanced data |
| Authors: | Menardi, Giovanna; Torelli, Nicola |
| Keywords: | accuracy; binary classification; bootstrap; kernel density estimation; unbalanced learning |
| Issue Date: | 2010 |
| Publisher: | EUT Edizioni Università di Trieste |
| Citation: | Giovanna Menardi, Nicola Torelli, "Training and assessing classification rules with unbalanced data", Working Paper Series, N. 2, 2010. |
| Series/Report no.: | Working paper series - Dipartimento di scienze economiche, aziendali, matematiche e statistiche "Bruno de Finetti" 2 (2010) |
| Abstract: | The problem of modeling binary responses with cross-sectional data has been addressed
with a number of satisfactory solutions that draw on both parametric and nonparametric
methods. However, in many real situations one of the two responses (usually
the one of greater interest for the analysis) is rare. It has been widely reported that this class
imbalance heavily compromises the learning process, because the model tends to focus on
the prevalent class and to ignore the rare events. A skewed class distribution affects not only
the estimation of the classification model but also the evaluation
of its accuracy, because the scarcity of data leads to poor estimates of the
model’s accuracy.
In this work, the effects of class imbalance on model training and model assessment are
discussed. Moreover, a unified and systematic framework for dealing with both problems is proposed, based on a smoothed bootstrap re-sampling technique. |
| URI: | http://hdl.handle.net/10077/4002 |
| ISBN: | 978-88-8303-321-6 |
| Appears in Collections: | Working Papers Series 2010, 2 |
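
The abstract refers to a smoothed bootstrap re-sampling technique for unbalanced binary data. As a rough illustration of the general idea only, and not the authors' actual procedure, the Python sketch below draws class labels with equal probability, picks an observed point from the chosen class, and perturbs it with kernel noise. The function name `smoothed_bootstrap_balance`, the Gaussian kernel, and the rule-of-thumb diagonal bandwidth are all assumptions made for this example; none of them is taken from this record.

```python
import numpy as np


def smoothed_bootstrap_balance(X, y, n_samples, rng=None):
    """Draw a roughly class-balanced synthetic sample (illustrative sketch).

    Assumptions (not taken from the record): Gaussian kernel, diagonal
    rule-of-thumb bandwidth computed separately within each class.
    """
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    n_features = X.shape[1]

    # Step 1: choose a class label for each new point with equal probability.
    new_y = rng.choice(classes, size=n_samples)
    new_X = np.empty((n_samples, n_features))

    for c in classes:
        Xc = X[y == c]
        mask = new_y == c
        m = int(mask.sum())
        if m == 0:
            continue
        n_c = Xc.shape[0]
        # Step 2: rule-of-thumb bandwidth per feature (an assumed choice).
        scale = (4.0 / ((n_features + 2) * n_c)) ** (1.0 / (n_features + 4))
        h = scale * Xc.std(axis=0)
        # Step 3: resample observed points, then add Gaussian kernel noise.
        centers = Xc[rng.integers(0, n_c, size=m)]
        new_X[mask] = centers + rng.normal(size=(m, n_features)) * h

    return new_X, new_y
```

Under these assumptions, a data set with 5% rare cases would be mapped to a synthetic sample in which each class makes up about half of the points, and a classifier could then be trained on the synthetic sample instead of the original one.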