<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel rdf:about="http://www.openstarts.units.it:80/dspace/handle/10077/4000">
    <title>DSpace Collection: Training and assessing classification rules with unbalanced data</title>
    <link>http://www.openstarts.units.it:80/dspace/handle/10077/4000</link>
    <description>Training and assessing classification rules with unbalanced data</description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://www.openstarts.units.it:80/dspace/handle/10077/4002" />
      </rdf:Seq>
    </items>
    <dc:date>2013-05-25T23:46:24Z</dc:date>
  </channel>
  <item rdf:about="http://www.openstarts.units.it:80/dspace/handle/10077/4002">
    <title>Training and assessing classification rules with unbalanced data</title>
    <link>http://www.openstarts.units.it:80/dspace/handle/10077/4002</link>
    <description>Title: Training and assessing classification rules with unbalanced data
Authors: Menardi, Giovanna; Torelli, Nicola
Abstract: The problem of modeling binary responses from cross-sectional data has been addressed
with a number of satisfactory solutions that draw on both parametric and nonparametric
methods. However, in many real situations one of the two responses (usually
the one of greater interest for the analysis) is rare. It has been widely reported that this class
imbalance heavily compromises the learning process, because the model tends to focus on
the prevalent class and to ignore the rare events. A skewed class distribution affects not only
the estimation of the classification model but also the evaluation
of its accuracy, because the scarcity of rare-class data leads to poor estimates of the
model’s accuracy.
In this work, the effects of class imbalance on model training and model assessment are
discussed. Moreover, a unified and systematic framework for dealing with both problems is proposed, based on a smoothed bootstrap re-sampling technique.
Type: Book / chapter</description>
    <dc:date>2010-01-01T00:00:00Z</dc:date>
  </item>
</rdf:RDF>

