TY - GEN
T1 - Optimisation of Cancer Status Prediction Pipelines using Bio-Inspired Computing
AU - Barbachan e Silva, Mariel
AU - Narloch, Pedro Henrique
AU - Dorn, Marcio
AU - Broin, Pilib
N1 - Publisher Copyright: © 2021 IEEE
PY - 2021
Y1 - 2021
N2 - Cancer is one of the leading causes of death globally, and early detection is a fundamental factor in improving patient outcomes. The advent of high-throughput genetic profiling techniques in the last few decades has led to an explosion of genetic data related to cancer. Machine learning methods, and classification algorithms in particular, have been used to find underlying patterns in cancer data and make diagnostic predictions. The addition of feature selection to classification pipelines can lead to improvements in predictive capabilities, since the removal of non-important features benefits the construction of classification models. We developed a classification pipeline for cancer status prediction composed of a feature selection step with SelectKBest and an ensemble classifier system with five popular supervised learning algorithms. We used three bio-inspired optimization techniques to select the optimal sets of hyperparameters for the classification pipeline and compared these approaches on three cancer microarray datasets. The results indicate that the optimized pipelines have better predictive performance in all but one of the experiments compared to the ensemble alone.
KW - Cancer
KW - Evolutionary algorithm
KW - Hyperparameter optimization
KW - Machine learning
KW - Swarm intelligence
UR - http://www.scopus.com/inward/record.url?scp=85124607867&partnerID=8YFLogxK
U2 - 10.1109/CEC45853.2021.9504911
DO - 10.1109/CEC45853.2021.9504911
M3 - Conference Publication
AN - SCOPUS:85124607867
T3 - 2021 IEEE Congress on Evolutionary Computation, CEC 2021 - Proceedings
SP - 442
EP - 449
BT - 2021 IEEE Congress on Evolutionary Computation, CEC 2021 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE Congress on Evolutionary Computation, CEC 2021
Y2 - 28 June 2021 through 1 July 2021
ER -