An axiomatic comparison of learned term-weighting schemes in information retrieval: Clarifications and extensions

Ronan Cummins, Colm O'Riordan

Research output: Contribution to a Journal (Peer & Non Peer) › Article › peer-review

23 Citations (Scopus)

Abstract

Machine learning approaches to information retrieval are becoming increasingly widespread. In this paper, we present term-weighting functions reported in the literature that were developed by four separate approaches using genetic programming. Recently, a number of axioms (constraints), from which all good term-weighting schemes should be deduced, have been developed and shown to be theoretically and empirically sound. We introduce a new axiom and empirically validate it by modifying the standard BM25 scheme. Furthermore, we analyse the BM25 scheme and the four learned schemes presented to determine if the schemes are consistent with the axioms. We find that one learned term-weighting approach is consistent with more axioms than any of the other schemes. An empirical evaluation of the schemes on various test collections and query lengths shows that the scheme that is consistent with more of the axioms outperforms the other schemes.

Original language: English
Pages (from-to): 51-68
Number of pages: 18
Journal: Artificial Intelligence Review
Volume: 28
Issue number: 1 SPEC. ISS.
Publication status: Published - Jun 2007

Keywords

  • Axiomatic constraints
  • Genetic programming
  • Information retrieval
