Capturing interactive data transformation operations using provenance workflows

Tope Omitola, André Freitas, Edward Curry, Séan O’Riain, Nicholas Gibbins, Nigel Shadbolt

    Research output: Chapter in Book or Conference Publication/Proceeding › Conference Publication › peer-review

    Abstract

    The ready availability of data is increasing the opportunities for their re-use in new applications and analyses. Most of these data are not in the format users want and are usually heterogeneous and highly dynamic, which necessitates data transformation efforts to re-purpose them. Interactive data transformation (IDT) tools are becoming readily available and lower the barriers to such efforts. This paper describes a principled way to capture the data lineage of interactive data transformation processes. We provide a formal model of IDT, its mapping to a provenance representation, and its implementation and validation on Google Refine. Providing the sequence of data transformation processes allows assessment of data quality and ensures portability between IDT and other data transformation platforms. The proposed model showed a high level of coverage against a set of requirements used for evaluating systems that provide provenance management solutions.

    Original language: English
    Title of host publication: The Semantic Web
    Subtitle of host publication: ESWC 2012 Satellite Events - Revised Selected Papers
    Editors: Alexandre Passant, Barry Norton, Emanuele Della Valle, Raphael Troncy, Irini Fundulaki, Elena Simperl, Dunja Mladenic
    Publisher: Springer-Verlag
    Pages: 29-42
    Number of pages: 14
    ISBN (Print): 9783662466407
    Publication status: Published - 2015
    Event: Extended Semantic Web Conference, ESWC 2012 - Heraklion, Greece
    Duration: 27 May 2012 to 31 May 2012

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 7540
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: Extended Semantic Web Conference, ESWC 2012
    Country/Territory: Greece
    City: Heraklion
    Period: 27/05/12 to 31/05/12

    Keywords

    • Data consumption
    • Data publication
    • Extract-Transform-Load
    • Interactive data transformation
    • Linked Data
    • Provenance
    • Public open data
    • Semantic Web
    • Workflow
