Abstract
We describe a case study that shows how important negative results are in uncovering biased evaluation methodologies. Our research question is how to compare a recommender algorithm that uses an RDF graph to a recommendation algorithm that uses rating data. Our case study uses DBpedia 3.8 and the MovieLens 100k data set. We show that the most popular evaluation protocol in the recommender systems literature is biased towards evaluating collaborative filtering (CF) algorithms, as it uses the rating prediction task. Based on the negative results of this first experiment, we find an alternative evaluation task, the top-k recommendation task. While this task is harder to perform, our positive results show that it is a much better fit, as it is not biased towards either CF or our graph-based algorithm. The second set of results is statistically significant (Wilcoxon rank sum test, p < 0.01).
Original language | English (Ireland) |
---|---|
Title of host publication | ESWC 2015, NoISE 2015, Workshop on Negative or Inconclusive Results in Semantic Web |
Place of Publication | Portoroz, Slovenia |
Publication status | Published - 1 Jun 2015 |
Authors
- Heitmann, Benjamin and Hayes, Conor