Using machine learning to distinguish infected from non-infected subjects at an early stage based on viral inoculation

Research output: Chapter in Book or Conference Publication/Proceeding › Conference Publication › peer-review

3 Citations (Scopus)

Abstract

Gene expression profiles help to capture the functional state of the body and to identify dysfunctional conditions in individuals. In principle, respiratory and other viral infections can be detected from blood samples; however, it has not yet been determined which gene expression levels are predictive, in particular for the early transition states at disease onset. We therefore analyse the expression levels of infected and non-infected individuals to determine genes (potential biomarkers) that are active during the progression of the disease. We use machine learning (ML) classification algorithms to determine the state of respiratory viral infections in humans from time-dependent gene expression measurements; the study comprises four respiratory viruses (H1N1, H3N2, RSV, and HRV), seven distinct clinical studies, and 104 healthy test candidates overall. From the overall set of 12,023 genes, we identified the 10 top-ranked genes that proved most discriminatory for predicting the infection state. Our two models focus on the time stamp nearest to t = 48 hours and nearest to t = “Onset Time”, which denotes the symptom onset (at different time points) according to each candidate’s specific immune system response to the viral infection. We evaluated algorithms including k-Nearest Neighbour (k-NN), Random Forest, linear Support Vector Machine (SVM), and SVM with radial basis function (RBF) kernel to classify whether the gene expression sample collected at early time point t is infected or not infected. The “Onset Time” appears to play a vital role in prediction and in the identification of the ten most discriminatory genes.
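The classifier comparison described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions: the data are synthetic (generated with scikit-learn's `make_classification`, not the study's expression measurements), the hyperparameters are arbitrary, and ranking genes by Random Forest importance is an assumption of this sketch, since the abstract does not say how the top-ranked genes were selected.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for an expression matrix: rows are samples taken at one
# time point, columns are genes. Sizes are illustrative, not the study's data.
X, y = make_classification(
    n_samples=104, n_features=500, n_informative=10, n_redundant=0,
    random_state=0,
)

# The four classifier families named in the abstract; the hyperparameters are
# arbitrary choices made for this sketch.
models = {
    "k-NN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "Linear SVM": make_pipeline(StandardScaler(), SVC(kernel="linear")),
    "RBF SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}

# Stratified 5-fold cross-validation keeps the class balance in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
    for name, model in models.items()
}
for name, acc in scores.items():
    print(f"{name}: mean CV accuracy = {acc:.3f}")

# Rank genes by Random Forest impurity importance and keep the top 10.
# This ranking criterion is an assumption of the sketch, not the paper's method.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
top10 = np.argsort(forest.feature_importances_)[::-1][:10]
print("Top 10 gene indices:", top10.tolist())
```

With real data, `X` would hold the expression values of the sample nearest to the chosen time point (48 hours or the symptom onset) for each subject, and `y` the infected / not infected label.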

Original language: English
Title of host publication: Data Integration in the Life Sciences - 13th International Conference, DILS 2018, Proceedings
Editors: Maria-Esther Vidal, Sören Auer
Publisher: Springer-Verlag
Pages: 105-121
Number of pages: 17
ISBN (Print): 9783030060152
DOIs
Publication status: Published - 2019
Event: 13th International Conference on Data Integration in the Life Sciences, DILS 2018 - Hannover, Germany
Duration: 20 Nov 2018 to 21 Nov 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11371 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 13th International Conference on Data Integration in the Life Sciences, DILS 2018
Country/Territory: Germany
City: Hannover
Period: 20/11/18 to 21/11/18

Keywords

  • Differentially expressed genes
  • Machine learning
  • Prediction
  • Respiratory viral infection
