Diagnosis of data quality issues in BPI challenge data sets

Study level

Vacation research experience scheme

Topic status

We're looking for students to study this topic.


Dr Fahame Emamjome
Associate Lecturer
Division / Faculty
Science and Engineering Faculty


Process mining aims to derive information from the historical behaviour of processes in organisations, as recorded in event logs. The quality of the event data and of the resulting event log is therefore a critical success factor for process mining projects. Logged event data generally requires significant manipulation, or data cleaning, to convert the raw data into an event log suitable for process mining analysis.

While data cleaning is one of the main stages in various process mining methodologies, data quality is generally poorly defined in these methodologies. Process mining analysts usually deal with the symptoms of quality problems in their data, rather than understanding how the quality issues were created and how they may affect the results of their analysis.

We believe that an approach that truly identifies the root causes of event log quality issues serves process mining research better than approaches that deal with quality issues/symptoms in event logs after the fact.
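To make the distinction between symptoms and root causes concrete, the following is a minimal sketch of symptom detection on a toy set of raw events. The event tuples, the `find_symptoms` helper and the three checks are illustrative assumptions. They are not part of the research team's framework or of any BPI challenge data set.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw events: (case_id, activity, ISO timestamp or None).
RAW_EVENTS = [
    ("c1", "Submit", "2020-01-01T09:00:00"),
    ("c1", "Approve", "2020-01-01T09:00:00"),
    ("c1", "Approve", "2020-01-01T09:00:00"),  # exact duplicate of the previous event
    ("c2", "Submit", None),                    # missing timestamp
    ("c2", "Approve", "2020-01-02T10:30:00"),
]


def find_symptoms(events):
    """Count three common quality symptoms in a list of raw events."""
    # Symptom 1: events recorded more than once with identical attributes.
    counts = Counter(events)
    duplicates = sum(n - 1 for n in counts.values() if n > 1)

    # Symptom 2: events with no timestamp at all.
    missing = sum(1 for _, _, ts in events if ts is None)

    # Symptom 3: different activities in the same case sharing one timestamp,
    # which leaves their order ambiguous.
    activities_at = {}
    for case, activity, ts in set(events):
        if ts is not None:
            datetime.fromisoformat(ts)  # raises ValueError if the timestamp is malformed
            activities_at.setdefault((case, ts), set()).add(activity)
    ambiguous = sum(1 for acts in activities_at.values() if len(acts) > 1)

    return {
        "duplicate_events": duplicates,
        "missing_timestamps": missing,
        "ambiguous_order_groups": ambiguous,
    }


print(find_symptoms(RAW_EVENTS))
```

A check like this only reports that a symptom is present. It cannot say why: identical timestamps, for example, might result from coarse timestamp granularity or from batch recording, and separating such causes is the kind of diagnostic question this project addresses.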

Research activities

The research team has developed a theoretical framework that guides process mining analysts in understanding data quality issues and in taking a diagnostic approach to discovering the root causes of these problems.

In this project you will have the opportunity to apply this theoretical framework to real-world data, such as Business Process Intelligence (BPI) challenge data sets, and to discover the root causes of the quality issues in this data.


As a deliverable, we expect a report including:

  • an analysis of the data quality issues in the data set
  • a step-by-step explanation of how the theoretical framework was applied to the context
  • sets of propositions linked to the data quality issues, supported by convincing reasoning.

Skills and experience

For this project, we expect you to have a basic understanding of process models, event logs and process mining (IFN515 or equivalent).

Please keep in mind that your first step in this project will be to familiarise yourself with data quality issues in process mining.



Contact the supervisor for more information.