Big data computing in distributed or cloud environments

Study level


Topic status

We're looking for students to study this topic.


Professor Glen Tian
Division / Faculty
Science and Engineering Faculty


Big data is characterised by large volume, fast and dynamic generation, and a diversity of data formats. These features make its management, storage, retrieval and processing a challenge.

In distributed computing environments, the MapReduce pattern and its Hadoop and Spark implementations are widely used for big data computing. However, they are not directly suitable for many real-world applications, such as some bioinformatics problems and other all-to-all comparison problems.
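
As a rough illustration of the contrast, the sketch below imitates the map, shuffle and reduce steps in a single Python process and sets them beside an all-to-all comparison. It is not the Hadoop or Spark API: the names map_reduce, all_pairs and shared_chars, and the toy input data, are assumptions made only for this example.

```python
from collections import defaultdict
from itertools import combinations


def map_reduce(records, mapper, reducer):
    """Single-process imitation of the map -> shuffle -> reduce steps."""
    groups = defaultdict(list)
    for record in records:
        # Map: each record is processed independently of all others.
        for key, value in mapper(record):
            # Shuffle: values are grouped by the key the mapper emitted.
            groups[key].append(value)
    # Reduce: each key's values are combined independently.
    return {key: reducer(key, values) for key, values in groups.items()}


def count_mapper(line):
    """Emit (word, 1) for every word in one line of text."""
    for word in line.split():
        yield word, 1


def count_reducer(word, ones):
    """Sum the counts collected for one word."""
    return sum(ones)


def all_pairs(items, compare):
    """Compare every item with every other item (hypothetical helper).

    Each result needs two input items at once, so the work cannot be
    split by handing each worker a single independent record.
    """
    names = sorted(items)
    return {(a, b): compare(items[a], items[b])
            for a, b in combinations(names, 2)}


def shared_chars(x, y):
    """Toy similarity score: number of distinct characters in common."""
    return len(set(x) & set(y))


# Word count fits the pattern: one record in, independent keys out.
word_counts = map_reduce(["big data", "big compute"],
                         count_mapper, count_reducer)

# All-to-all comparison: n items produce n * (n - 1) / 2 paired tasks.
pair_scores = all_pairs({"s1": "ACGT", "s2": "ACCA", "s3": "TTGA"},
                        shared_chars)
```

In the word count, every record can be mapped without knowing about any other record. In the all-to-all case, every item is needed by n - 1 comparisons, so where the data is placed and how the paired tasks are distributed become part of the problem.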

In addition, efficiently utilising and scheduling the resources of each distributed machine remains a challenge for big data computing.

Research activities

This research focuses on the development of innovative theory, new front-end programming models and back-end technology support for big data computing in distributed or cloud environments.

It also investigates applications of big data computing in real-world systems, such as:

  • networks and communications
  • complex networks
  • power grids
  • transport systems
  • spatial information
  • social media applications.


We expect this research to result in the development of:

  • innovative theory
  • new front-end programming models
  • new back-end resource management/scheduling technologies.

We will also include implementations of big data computing for real-world systems.

Skills and experience

We expect you to have the following skills and experience:

  • general computing and software development experience
  • programming skills in Java and C/C++
  • knowledge of optimisation and its heuristic solutions.


You may be able to apply for a research scholarship in our annual scholarship round.




Contact the supervisor for more information.