TY - GEN
T1 - Towards expertise modelling for routing data cleaning tasks within a community of knowledge workers
AU - ul Hassan, Umair
AU - O’Riain, Sean
AU - Curry, Edward
N1 - Publisher Copyright: © 2012 MIT. All rights reserved.
PY - 2012
Y1 - 2012
N2 - Applications consuming data have to deal with a variety of data quality issues such as missing values, duplication, and incorrect values. Although automatic approaches can be utilized for data cleaning, the results can remain uncertain. Therefore, updates suggested by automatic data cleaning algorithms require further human verification. This paper presents an approach for generating tasks for uncertain updates and routing these tasks to appropriate workers based on their expertise. Specifically, the paper tackles the problem of modelling the expertise of knowledge workers for the purpose of routing tasks within collaborative data quality management. The proposed expertise model represents the profile of a worker against a set of concepts describing the data. A simple routing algorithm is employed for leveraging the expertise profiles for matching data cleaning tasks with workers. The proposed approach is evaluated on a real-world dataset using human workers. The results demonstrate the effectiveness of using concepts describing the data for modelling expertise, in terms of the likelihood of receiving responses to tasks routed to workers.
AB - Applications consuming data have to deal with a variety of data quality issues such as missing values, duplication, and incorrect values. Although automatic approaches can be utilized for data cleaning, the results can remain uncertain. Therefore, updates suggested by automatic data cleaning algorithms require further human verification. This paper presents an approach for generating tasks for uncertain updates and routing these tasks to appropriate workers based on their expertise. Specifically, the paper tackles the problem of modelling the expertise of knowledge workers for the purpose of routing tasks within collaborative data quality management. The proposed expertise model represents the profile of a worker against a set of concepts describing the data. A simple routing algorithm is employed for leveraging the expertise profiles for matching data cleaning tasks with workers. The proposed approach is evaluated on a real-world dataset using human workers. The results demonstrate the effectiveness of using concepts describing the data for modelling expertise, in terms of the likelihood of receiving responses to tasks routed to workers.
KW - Crowd sourcing
KW - Data cleaning
KW - Linked data
KW - Web 2.0
UR - https://www.scopus.com/pages/publications/84910039863
M3 - Conference Publication
T3 - Proceedings of ICIQ 2012: 17th International Conference on Information Quality
SP - 58
EP - 69
BT - Proceedings of ICIQ 2012
A2 - Berti-Equille, Laure
A2 - Comyn-Wattiau, Isabelle
A2 - Scannapieco, Monica
PB - MIT
T2 - 17th International Conference on Information Quality, ICIQ 2012
Y2 - 16 November 2012 through 17 November 2012
ER -