Study level

  • PhD
  • Master of Philosophy
  • Honours
  • Vacation research experience scheme

Faculty/School

Faculty of Science

School of Information Systems

Topic status

We're looking for students to study this topic.

Supervisors

Dr Sareh Sadeghianasl
Position: Lecturer in Information Systems
Division / Faculty: Faculty of Science

Professor Arthur ter Hofstede
Position: Principal Research Fellow
Division / Faculty: Faculty of Science

Overview

Praeclarus is an open-source software framework that aims to facilitate data pre-processing for process mining. Process mining is a specialised form of data mining that focuses on process data. It is of high interest to industry, with the market doubling every two years (e.g., increasing from $550M in 2020 to $1B in 2022). This growth has led big companies such as Microsoft, SAP, and IBM to acquire process mining vendors such as Minit, Signavio, and myInvenio.

Recent process mining surveys show that more than 60% of the time and effort in process mining projects is spent on data transformation and pre-processing. These steps include uploading data sets in different formats, detecting data quality issues, and repairing them. This project aims to address different aspects of the development of the Praeclarus framework.

Research activities

This project will involve one or more of the following activities:

  • reviewing the state of the art in software architecture, engaging visualisations, and traceable data cleaning
  • designing the software architecture for the Praeclarus framework
  • investigating the best way to present the results of the detection and the repair of data quality issues
  • developing plugins for the Praeclarus software framework in the form of algorithms to detect and repair data quality issues in process data
  • developing strategies for maintaining the provenance of data during the repair process
  • communicating the findings in publications.
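
To give a flavour of the detection, repair, and provenance activities listed above, here is a minimal Python sketch. It is an illustration only and does not reflect the actual Praeclarus plugin interface: the class names LabelNormaliser and Repair, the "activity" column, and the sample log are all hypothetical. The sketch assumes one simple data quality issue (activity labels that differ only in case or spacing) and records every change so that the original values remain traceable.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical event record: one row of an event log (not a Praeclarus type).
Event = Dict[str, str]


@dataclass
class Repair:
    """Provenance entry: which value changed, in which row, and why."""
    row: int
    column: str
    old: str
    new: str
    reason: str


@dataclass
class LabelNormaliser:
    """Illustrative plugin: detects and repairs inconsistent activity labels."""
    column: str = "activity"  # assumed column name
    provenance: List[Repair] = field(default_factory=list)

    @staticmethod
    def _canonical(label: str) -> str:
        # Collapse whitespace and case so that variants compare equal.
        return " ".join(label.split()).lower()

    def detect(self, log: List[Event]) -> Dict[str, List[str]]:
        """Group labels by canonical form; keep groups with more than one variant."""
        groups: Dict[str, List[str]] = {}
        for event in log:
            label = event[self.column]
            variants = groups.setdefault(self._canonical(label), [])
            if label not in variants:
                variants.append(label)
        return {key: v for key, v in groups.items() if len(v) > 1}

    def repair(self, log: List[Event]) -> List[Event]:
        """Return a repaired copy of the log and record each change."""
        issues = self.detect(log)
        repaired = []
        for row, event in enumerate(log):
            new_event = dict(event)  # leave the input log untouched
            label = event[self.column]
            key = self._canonical(label)
            if key in issues:
                target = issues[key][0]  # the first-seen variant is kept
                if label != target:
                    new_event[self.column] = target
                    self.provenance.append(
                        Repair(row, self.column, label, target, "label variant")
                    )
            repaired.append(new_event)
        return repaired


# Made-up sample log for illustration.
log = [
    {"case": "1", "activity": "Approve Order"},
    {"case": "1", "activity": "approve  order"},
    {"case": "2", "activity": "Ship Order"},
]
plugin = LabelNormaliser()
clean = plugin.repair(log)
```

Keeping the first-seen variant is an arbitrary choice made for brevity; deciding which label to keep, and how to present that decision to the user, is part of what the project would investigate.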

Outcomes

The prospective outcomes depend on the scope of the project and may include the following:

  • literature review
  • the software architecture for the Praeclarus framework
  • design and development of prototypes for the detection and the repair of data quality issues in the form of plugins for the Praeclarus software framework
  • best-practice data provenance strategies
  • best-practice visualisations of the results of the detection and repair of data quality issues.

Skills and experience

This project needs one or more of the following skills:

  • preliminary knowledge of process mining and software development
  • programming standalone or web-based applications using languages and technologies such as Java, JavaScript, HTML, CSS, PHP, Angular, or Python
  • time management skills to deliver outcomes within a specific time frame
  • excellent written and verbal communication skills.

Scholarships

You may be eligible to apply for a research scholarship.

Explore our research scholarships

Contact

Contact the supervisor for more information.