“Lies, damned lies, and statistics.” This well-known phrase captures a dilemma, possibly unacknowledged, faced by organisations that increasingly rely on data analytics to make important decisions. ‘Big data analytics’ has demonstrated its ability to deliver positive outcomes for organisations, from testing anecdotal wisdom (and exposing myths) to delivering targeted, data-driven recommendations for improving business operations. However, much like the practices of ‘creative accounting’, the insights extracted from data analytics can be crafted and moulded (e.g. through the routine and well-accepted practices of ‘data cleaning’ and ‘data filtering’) to deliver ‘insights’ that may be misleading.
Extracting the ultimate ‘truth’ from data may be wishful thinking; however, checks and balances can and should be applied to data analytics practices so that the validity of results can be assured to a reasonable degree. Current practices do apply some verification procedures (such as cross-validation), but these are often carried out in an ad hoc manner and without external validation.
Inspired by the recent uptake of blockchain technology, this project seeks to explore, design, experiment with, and evaluate the effectiveness of a robust multi-party ‘consensus’ mechanism for preventing potentially deceptive data analytics practices.
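As a rough illustration of the idea only, the sketch below shows one way a multi-party check on an analytics result might look. Everything in it is an assumption made for illustration: the toy `analyse` function, the use of a SHA-256 digest over canonical JSON, and the two-thirds quorum are hypothetical choices, not a mechanism proposed by this project or used by any particular blockchain platform.

```python
import hashlib
import json
from statistics import mean


def result_digest(result: dict) -> str:
    """Hash a result dict in a canonical (sorted-key) JSON form."""
    canonical = json.dumps(result, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def analyse(data: list) -> dict:
    """Toy analysis (hypothetical): a rounded mean and the record count."""
    return {"mean": round(mean(data), 6), "n": len(data)}


def reaches_consensus(claimed: dict, party_results: list, quorum: float = 2 / 3) -> bool:
    """Accept a claimed result only if enough independent parties reproduce it.

    The quorum value is an illustrative assumption.
    """
    if not party_results:
        return False
    claimed_digest = result_digest(claimed)
    agreeing = sum(result_digest(r) == claimed_digest for r in party_results)
    return agreeing >= quorum * len(party_results)


raw = [10.0, 12.0, 11.0, 95.0, 9.0]

# Three verifiers independently recompute the analysis on the raw data.
verifiers = [analyse(raw) for _ in range(3)]

honest_claim = analyse(raw)
# A claim computed after silently filtering out an inconvenient record.
filtered_claim = analyse([x for x in raw if x < 50])

print(reaches_consensus(honest_claim, verifiers))
print(reaches_consensus(filtered_claim, verifiers))
```

In this toy setting the claim computed on silently filtered data is rejected because no verifier reproduces its digest. A realistic mechanism would also need to handle legitimate, agreed-upon cleaning steps.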
Research activities in this project vary depending on the student’s skills and the duration of the project. In general, research activities for this project include:
- A literature review of the state-of-the-art techniques to guarantee the correctness of data analytics and/or various consensus mechanisms within blockchain (VRES)
- Implementation and evaluation of the suitability of various blockchain platforms to support data analytics (VRES, Honours)
- The development of new metrics to characterise data transformations throughout various stages of data analysis (PhD)
- The development of new consensus mechanisms customised for data analytics purposes (PhD)
The expected project outcomes depend on the scope of the project that students undertake. Key outcomes include:
- A gap analysis of the techniques for asserting the correctness of data analytics results
- Insights into the (un-)suitability of various blockchain platforms for data analytics purposes
- New metrics that can comprehensively characterise the ‘truth distance’ between the original data set and the ‘cleaned’ data set used for analysis
- New blockchain consensus mechanisms that are specific to verifying data analytics results
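To make the notion of a ‘truth distance’ more concrete, the following is a minimal sketch of what such a metric could look like. It is purely hypothetical: the function name `truth_distance` and its two components (the fraction of records removed, and the shift in the mean scaled by the original spread) are illustrative assumptions, not metrics defined by this project.

```python
from statistics import mean, pstdev


def truth_distance(original: list, cleaned: list) -> dict:
    """Hypothetical two-part summary of how far cleaning moved the data.

    removed_fraction: share of original records missing from the cleaned set.
    mean_shift: absolute change in the mean, in units of the original
                population standard deviation (0.0 if the spread is zero).
    """
    removed_fraction = 1 - len(cleaned) / len(original)
    spread = pstdev(original)
    mean_shift = abs(mean(cleaned) - mean(original)) / spread if spread else 0.0
    return {"removed_fraction": removed_fraction, "mean_shift": mean_shift}


original = [10.0, 12.0, 11.0, 95.0, 9.0]
cleaned = [x for x in original if x < 50]  # one 'outlier' filtered out

print(truth_distance(original, cleaned))
print(truth_distance(original, original))  # no cleaning: both components are 0
```

Here, removing a single record changes the mean by roughly half of the original standard deviation, which is the kind of shift such a metric would be meant to surface. A comprehensive metric would need to cover far more than these two summary statistics.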
Skills and experience
- Familiarity with the fields of data mining, data science, process mining, and/or information security
- Reasonable writing skills
- Problem-solving and logical thinking capabilities
- Computer programming skills
You may be able to apply for a research scholarship in our annual scholarship round.
- data quality
- data analytics
- big data
- data science
- information security
- distributed ledger
Contact the supervisor for more information.