TY - GEN
T1 - Analysis of the effect of unexpected outliers in the classification of spectroscopy data
AU - Glavin, Frank G.
AU - Madden, Michael G.
PY - 2010
Y1 - 2010
N2 - Multi-class classification algorithms are very widely used, but we argue that they are not always ideal from a theoretical perspective, because they assume all classes are characterized by the data, whereas in many applications, training data for some classes may be entirely absent, rare, or statistically unrepresentative. We evaluate one-sided classifiers as an alternative, since they assume that only one class (the target) is well characterized. We consider a task of identifying whether a substance contains a chlorinated solvent, based on its chemical spectrum. For this application, it is not really feasible to collect a statistically representative set of outliers, since that group may contain anything apart from the target chlorinated solvents. Using a new one-sided classification toolkit, we compare a One-Sided k-NN algorithm with two well-known binary classification algorithms, and conclude that the one-sided classifier is more robust to unexpected outliers.
KW - Classification
KW - k-Nearest Neighbour
KW - One-Class
KW - One-Sided
KW - Spectroscopy Analysis
KW - Support Vector Machine
UR - https://www.scopus.com/pages/publications/78650122740
U2 - 10.1007/978-3-642-17080-5_15
DO - 10.1007/978-3-642-17080-5_15
M3 - Conference Publication
AN - SCOPUS:78650122740
SN - 364217079X
SN - 9783642170799
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 124
EP - 133
BT - Artificial Intelligence and Cognitive Science - 20th Irish Conference, AICS 2009, Revised Selected Papers
T2 - 20th Annual Irish Conference on Artificial Intelligence and Cognitive Science, AICS 2009
Y2 - 19 August 2009 through 21 August 2009
ER -