Abstract
Analytics projects come in many forms, from large-scale multi-year projects to projects with small teams lasting just a few weeks. There is a particular type of analytics project identified by some unique challenges. A team is assembled for the purposes of the project and so team members have not worked together before. The project is short term so there is little opportunity to build capability. Work is often done on client systems requiring the use of limited and perhaps unfamiliar tools. Deadlines are daily or weekly and the requirements can shift repeatedly. Outputs produced in these circumstances will be subject to audit and an expectation of full reproducibility. These are guerrilla analytics projects. They necessitate a versatile and fast moving analytics team that can achieve quick analytics wins against a large data challenge using lightweight processes and tools. The unique challenges of guerrilla analytics necessitate a particular type of data analytics development process. This paper presents research in progress towards identifying a set of development principles for fast paced guerrilla analytics project environments. The paper's principles cover 4 areas. Data Manipulation principles describe the environment and common services needed by a guerrilla analytics team. Data Provenance principles describe how data should be logged, separated and version controlled. Coding and Testing principles describe how code should be structured and outputs tested. All these principles focus on lightweight processes for overcoming the challenges of a guerrilla analytics project environment while meeting the guerrilla analytics requirement of auditability and reproducibility.
Original language | English (Ireland) |
---|---|
Title of host publication | Joint SIGDSS TUN Business Intelligence Congress 3: Driving Innovation through Big Data Analytics |
Publication status | Published - 1 Jan 2012 |
Authors | Ridge, Enda; Curry, Edward |