Process data analytics: quality-informed and responsible process mining

Study level

PhD

Master of Philosophy

Honours

Vacation research experience scheme

Faculty/Lead unit

Science and Engineering Faculty

School of Information Systems

Topic status

We're looking for students to study this topic.

Supervisors

Associate Professor Moe Wynn
Position
Associate Professor
Division / Faculty
Science and Engineering Faculty

Overview

Modern organisations consider data to be their lifeblood. Technological advances in the fields of Business Intelligence (BI) and Data Science empower organisations to become ‘data-driven’ by applying new techniques to analyse and visualise large amounts of data. The potential benefits include a better understanding of business performance and better-informed decision making for business growth.

A key roadblock to this vision is the lack of transparency surrounding the quality of data. Significant data quality issues persist and hinder analysis capabilities, as evidenced by observations from leading analysts including Bill Hostmann, a distinguished analyst at Gartner Inc. ('The biggest challenge in BI is the data quality') and Boris Evelson, principal analyst at Forrester Research Inc. ('Data quality today is just as bad as 30 years ago') (http://goo.gl/7qlKqS). To harness the power of data for real-time insights and predictive analytics, 'self-service data prep', where users are empowered to judge and prepare the data for analysis, is considered one of the top ten trends in BI for 2017 (Tableau, 2017).

Process Mining is a specialised form of data-driven process analytics where process data, collated from the different IT systems typically available in organisations, is analysed to uncover the real behaviour and performance of business operations. Process mining has been applied in over 100 organisations worldwide (van der Aalst, 2011). In Australia, process mining techniques have been used to analyse the behaviour of processes in the healthcare, insurance, retail and telecommunication domains. Such data-driven process analysis techniques are becoming more commonly supported in both open-source and commercial process mining and business process intelligence tools.

Without question, the extent to which the outcomes from process mining can be relied upon for insights is directly related to the quality of the input. A process mining study that utilises low-quality, unrepresentative data as input has little or no value for the organisation and becomes a catalyst for erroneous conclusions. This puts an enormous burden on a process analyst to get the data “just right” before conducting a process mining analysis. The analyst must balance the need to address significant data quality issues against errors introduced by “tampering” with the data. In the era of “big data” where the amount of data that might need to be analysed is growing exponentially (Cai and Zhu, 2015), automated methods of cleaning and managing the quality of process data are required.

Most currently available process mining techniques are quality-agnostic, i.e., they analyse the input data without taking into account how the recorded data has been manipulated or pre-processed beforehand. Because the original data is also manipulated in an ad hoc and manual manner, it is impossible to keep track of how the original data quality and the subsequent data manipulations affect the analysis outcomes. Thus, new process mining techniques that can discover reliable process insights with an explicit degree of confidence are required.

Research activities

Prospective students will work closely with a team of researchers from the BPM discipline to address key research challenges identified in this research topic. Depending on the nature of the research study being undertaken, the scope of the research project will be adjusted.

For instance, a VRES student will assist the research team by analysing real-life data sets, evaluating research software prototypes, and conducting user studies with industry partners.

A research student (Honours, Masters, PhD) may develop algorithms that can compute data quality metrics for various quality dimensions and generate quality-annotated event logs from industry data sets. The student would then implement these algorithms as plug-ins within the open-source process mining framework, ProM.
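
As a rough illustration of what a data quality metric and a quality-annotated event log might look like, the sketch below computes per-attribute completeness (the share of events with a non-missing value) and attaches a list of detected issues to each event. It is written in Python for brevity and is not a ProM plug-in, which would be written in Java. The field names (`case_id`, `activity`, `timestamp`), the `quality_issues` annotation and the sample log are all assumptions made for this example, not part of the research topic.

```python
# Hypothetical event log attributes: these names are assumptions for illustration.
REQUIRED = ("case_id", "activity", "timestamp")


def is_missing(value):
    """Treat None and empty or whitespace-only strings as missing."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def completeness(events, attributes=REQUIRED):
    """Return, per attribute, the fraction of events with a non-missing value."""
    total = len(events)
    if total == 0:
        return {}
    return {
        attr: sum(not is_missing(e.get(attr)) for e in events) / total
        for attr in attributes
    }


def annotate(events, attributes=REQUIRED):
    """Return copies of the events with a 'quality_issues' list attached."""
    annotated = []
    for event in events:
        issues = ["missing:" + attr for attr in attributes
                  if is_missing(event.get(attr))]
        annotated.append({**event, "quality_issues": issues})
    return annotated


# Made-up sample data: the third event has no timestamp.
log = [
    {"case_id": "c1", "activity": "Lodge claim", "timestamp": "2017-03-01T09:00"},
    {"case_id": "c1", "activity": "Assess claim", "timestamp": "2017-03-02T10:30"},
    {"case_id": "c2", "activity": "Lodge claim", "timestamp": ""},
    {"case_id": "c2", "activity": "Pay claim", "timestamp": "2017-03-05T14:00"},
]

scores = completeness(log)
annotated_log = annotate(log)
```

Completeness is only one possible quality dimension. The annotation step copies each event rather than modifying it, so the original log remains available for comparison.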

Alternatively, after a literature review of process mining techniques and a requirements analysis, a research student may design and develop algorithms related to process discovery, process conformance and process performance analysis to utilise knowledge about the quality of an event log. Empirical evaluation will then take place via user studies with industry partners.

Research project on process data provenance

After a literature review on data mining and process mining techniques, the PhD student will carry out research on patterns-based detection and remediation of data imperfections in an event log. The student will:

  • design data quality detection and remediation algorithms,
  • develop algorithms that can compute data quality metrics for various quality dimensions and generate quality-annotated event logs from industry data sets,
  • validate the proposed data quality approaches via user studies with Australian industry partners in the insurance and banking domains,
  • implement these techniques in open-source software, e.g. the ProM platform,
  • conduct the evaluation of the resulting framework with synthetic and real-world logs.
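
To make the idea of detecting and remediating a data imperfection concrete, the sketch below handles one hypothetical pattern: activity labels that differ only in letter case or whitespace. Detection groups labels by a normalised form, and remediation rewrites each variant to the most frequent spelling while keeping the original value so the change stays traceable. The pattern choice, the `original_activity` field and the sample log are assumptions for illustration only, and Python is used for brevity rather than a ProM plug-in in Java.

```python
from collections import Counter, defaultdict


def normalise(label):
    """Lower-case the label and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def detect_label_variants(events):
    """Return {normalised label: Counter of raw spellings} for labels
    that occur with more than one raw spelling."""
    groups = defaultdict(Counter)
    for event in events:
        groups[normalise(event["activity"])][event["activity"]] += 1
    return {key: raw for key, raw in groups.items() if len(raw) > 1}


def remediate(events, variants):
    """Rewrite variant labels to the most frequent spelling.

    The original label is kept in 'original_activity' whenever it changes,
    so each remediation action can be traced back.
    """
    repaired = []
    for event in events:
        fixed = dict(event)
        key = normalise(event["activity"])
        if key in variants:
            canonical = variants[key].most_common(1)[0][0]
            if canonical != event["activity"]:
                fixed["original_activity"] = event["activity"]
                fixed["activity"] = canonical
        repaired.append(fixed)
    return repaired


# Made-up sample data with two spellings of the same activity.
log = [
    {"case_id": "c1", "activity": "Approve Claim"},
    {"case_id": "c2", "activity": "approve  claim"},
    {"case_id": "c3", "activity": "Approve Claim"},
    {"case_id": "c3", "activity": "Pay Claim"},
]

variants = detect_label_variants(log)
repaired_log = remediate(log, variants)
```

Choosing the most frequent spelling as the canonical label is a simple heuristic. In practice an analyst would likely want to review the detected variants before applying the repair.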

Research project on quality-informed process mining

After a literature review on data mining and process mining techniques, the PhD student will carry out research on the design, implementation and evaluation of new algorithms for quality-informed process mining. The student will:

  • design and develop algorithms related to process discovery, process conformance and process performance analysis which utilise knowledge about the quality of an event log,
  • develop new techniques that make transparent the impact of remediation actions on the analysis results,
  • implement these techniques in open-source software, e.g. the ProM platform,
  • conduct the evaluation of the resulting framework with synthetic and real-world logs in conjunction with industry partners.
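
One way to picture an analysis that uses knowledge about event log quality is a directly-follows count that reports, for each pair of activities, how much of its supporting evidence is free of known quality issues. The sketch below is a minimal illustration under stated assumptions: each event carries a hypothetical boolean flag saying whether it is clean, and the "confidence" of a relation is simply the share of its observations in which both events are clean. This is not an established algorithm from the literature or from ProM, and it is written in Python for brevity.

```python
from collections import defaultdict


def directly_follows_with_confidence(traces):
    """Count directly-follows pairs and the share backed by clean events.

    traces maps a case id to a time-ordered list of (activity, is_clean)
    tuples. Returns {(a, b): (count, confidence)}, where confidence is the
    fraction of observations of a -> b in which both events were clean.
    """
    counts = defaultdict(int)
    clean_counts = defaultdict(int)
    for events in traces.values():
        for (a, a_clean), (b, b_clean) in zip(events, events[1:]):
            counts[(a, b)] += 1
            if a_clean and b_clean:
                clean_counts[(a, b)] += 1
    return {pair: (n, clean_counts[pair] / n) for pair, n in counts.items()}


# Made-up traces: event B in case c2 is flagged as having a quality issue.
traces = {
    "c1": [("A", True), ("B", True), ("C", True)],
    "c2": [("A", True), ("B", False), ("C", True)],
    "c3": [("A", True), ("C", True)],
}

relations = directly_follows_with_confidence(traces)
```

With the sample traces, the relation A to B is observed twice but only one observation involves two clean events, so it would be reported with lower confidence than A to C, which is observed once with clean events on both sides.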

Skills and experience

The candidate for the PhD position should hold an Honours degree in computer science, information technology or equivalent, have programming experience in Java, have an interest in business process management, and be familiar with IT development practices.

Scholarships

You may be able to apply for a research scholarship in our annual scholarship round.

Annual scholarship round

Contact

Contact the supervisor for more information.