A framework for the study of evolved term-weighting schemes in information retrieval

Ronan Cummins, Colm O'Riordan

Research output: Contribution to a Journal (Peer & Non Peer) › Conference article › peer-review

Abstract

Evolutionary algorithms and, in particular, Genetic Programming (GP) are increasingly being applied to the problem of evolving term-weighting schemes in Information Retrieval (IR). One fundamental problem with the solutions generated by these stochastic processes is that they are often difficult to analyse. A number of questions regarding these evolved term-weighting schemes remain unanswered. One interesting question is: do different runs of the GP process bring us to similar points in the solution space? This paper determines a number of measures of the distance between the ranked lists (phenotype) returned by different term-weighting schemes. Using these distance measures, we develop trees that show the phenotypic distance between these term-weighting schemes. This framework gives us a representation of where these evolved solutions lie in the solution space. Finally, we evolve several global term-weighting schemes and show that this framework is indeed useful for determining the relative closeness of these schemes and for determining the expected performance on general test data.
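To make the idea of a phenotypic distance concrete, the following is a minimal sketch of one possible measure between ranked lists. The abstract does not say which measures the paper uses, so the normalised Kendall tau distance here is an assumption chosen for illustration, and the scheme names (`run_1`, `run_2`, `run_3`) and document identifiers are hypothetical.

```python
from itertools import combinations


def kendall_tau_distance(ranking_a, ranking_b):
    """Normalised Kendall tau distance between two rankings of the same documents.

    Returns 0.0 for identical orderings and 1.0 for exactly reversed ones.
    Assumption: both lists rank the same set of documents, with no ties.
    """
    if len(ranking_a) != len(set(ranking_a)) or set(ranking_a) != set(ranking_b):
        raise ValueError("rankings must contain the same distinct documents")
    n = len(ranking_a)
    if n < 2:
        return 0.0
    # Position of each document in the second ranking.
    pos_b = {doc: i for i, doc in enumerate(ranking_b)}
    # A pair is discordant if the two rankings order it differently.
    discordant = sum(
        1 for x, y in combinations(ranking_a, 2) if pos_b[x] > pos_b[y]
    )
    return discordant / (n * (n - 1) / 2)


def pairwise_distances(rankings):
    """Distance for every unordered pair of schemes, keyed by (name, name)."""
    names = sorted(rankings)
    return {
        (a, b): kendall_tau_distance(rankings[a], rankings[b])
        for a, b in combinations(names, 2)
    }


# Hypothetical ranked lists returned by three evolved schemes for one query.
rankings = {
    "run_1": ["d1", "d2", "d3", "d4"],
    "run_2": ["d1", "d3", "d2", "d4"],
    "run_3": ["d4", "d3", "d2", "d1"],
}
distances = pairwise_distances(rankings)
```

A table of pairwise distances like `distances` is the kind of input a tree-building step could take, so that schemes returning similar rankings end up close together in the tree. In this toy data, `run_1` and `run_2` differ in one document pair, while `run_3` reverses `run_1` entirely.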

Original language: English
Pages (from-to): 6-11
Number of pages: 6
Journal: CEUR Workshop Proceedings
Publication status: Published - 2006
Event: ECAI 2006 3rd International Workshop on Text-Based Information Retrieval, TIR 2006 - Riva del Garda, Italy
Duration: 29 Aug 2006 → 29 Aug 2006
