Overview

Topic status: We're looking for students to study this topic.

Project Summary

This project aims to use anchor text from tweets to improve web information retrieval results. The first part of the project is to build a crawler that collects, from Twitter statuses, the text associated with every link they contain. A modified ranking function that integrates this information (based on BM25 in Lemur or Lucene) will then be coded, tested and refined on the current TREC ClueWeb collection.
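As a rough illustration of the idea, the sketch below treats the words of a tweet as anchor text for each URL the tweet contains, then adds a weighted BM25 score over that anchor text to the BM25 score of the page body. It is a minimal sketch under stated assumptions, not the project's method: the function names, the linear combination and the `anchor_weight` value are hypothetical, it uses plain Python rather than Lemur or Lucene, and it does not resolve shortened links or call the Twitter API.

```python
import math
import re
from collections import Counter, defaultdict

URL_RE = re.compile(r"https?://\S+")   # naive: keeps trailing punctuation
TOKEN_RE = re.compile(r"[a-z0-9]+")

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return TOKEN_RE.findall(text.lower())

def anchor_text_by_url(tweets):
    """Map each URL to the tokens of every tweet that links to it."""
    anchors = defaultdict(list)
    for tweet in tweets:
        urls = URL_RE.findall(tweet)
        words = tokenize(URL_RE.sub(" ", tweet))  # tweet text minus the links
        for url in urls:
            anchors[url].extend(words)
    return anchors

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query` with BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs if n_docs else 0.0
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue  # also skips empty documents, so avg_len is never 0 here
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def rank(query, pages, tweets, anchor_weight=0.5):
    """Rank page URLs by body BM25 plus weighted tweet-anchor BM25.

    `pages` maps URL -> body text. The linear combination and the
    default `anchor_weight` are assumptions made for illustration.
    """
    urls = list(pages)
    anchors = anchor_text_by_url(tweets)
    q = tokenize(query)
    body = bm25_scores(q, [tokenize(pages[u]) for u in urls])
    anchor = bm25_scores(q, [anchors.get(u, []) for u in urls])
    combined = {u: sb + anchor_weight * sa for u, sb, sa in zip(urls, body, anchor)}
    return sorted(urls, key=combined.get, reverse=True)
```

With two pages whose bodies are identical, a single tweet linking to one of them is enough to rank that page first, since only it gains an anchor-text score. In a real system the weight would need tuning on the test collection.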

The final working system will then be used for a submission to the TREC 2013 Web Track on the new ClueWeb collection.

Expected outcomes, applications and/or benefits

This project will lead to improved search models for web information retrieval.

Required student skills/experience

Good programming skills, plus knowledge of information retrieval and the basics of natural language processing (INX344).

Study level

Vacation research experience scholarship

Supervisors

QUT

Organisational unit

Science and Engineering Faculty

Research area

Computer Science

Keywords

twitter, search, engine, web, information, retrieval

Contact

Contact the supervisor for more information.