Overview
Topic status: We're looking for students to study this topic.
Project Summary
This project aims to use anchor text from tweets to improve web information retrieval results. The first part of the project will be to build a crawler that collects information about all links from Twitter statuses. A modified ranking function integrating this information (based on BM25 in Lemur or Lucene) will then be coded, tested, and refined on the current TREC ClueWeb collection.
The final working system will then be used for a submission to the TREC 2013 Web Track on the new ClueWeb collection.
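As a rough illustration of the idea in the summary above, the sketch below gathers the words of tweets that link to a page and adds them, with a weight, to that page's term frequencies before BM25 scoring. It is only a minimal sketch under stated assumptions: the project does not specify how tweet text should be integrated, so the weighted term-frequency combination, the function names (collect_tweet_anchors, bm25_with_anchors), the anchor_weight parameter, and the example URLs are all hypothetical, and the code is plain Python rather than Lemur or Lucene.

```python
import math
import re
from collections import Counter, defaultdict

URL_RE = re.compile(r"https?://\S+")
TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return TOKEN_RE.findall(text.lower())


def collect_tweet_anchors(tweets):
    """Map each linked URL to the tokens of the tweets that mention it."""
    anchors = defaultdict(list)
    for tweet in tweets:
        urls = [u.rstrip(".,;:!?)") for u in URL_RE.findall(tweet)]
        words = tokenize(URL_RE.sub(" ", tweet))  # tweet text without the URLs
        for url in urls:
            anchors[url].extend(words)
    return anchors


def bm25_with_anchors(query, docs, anchors, anchor_weight=2.0, k1=1.2, b=0.75):
    """Score docs (dict: url -> body text) against a query with BM25.

    Assumption: tweet anchor terms are folded in as weighted extra
    term frequencies; this is one possible integration, not the only one.
    """
    if not docs:
        return {}
    tfs = {}
    for url, body in docs.items():
        tf = Counter(tokenize(body))
        for term, count in Counter(anchors.get(url, [])).items():
            tf[term] += anchor_weight * count
        tfs[url] = tf
    n_docs = len(docs)
    avg_len = sum(sum(tf.values()) for tf in tfs.values()) / n_docs or 1.0
    query_terms = set(tokenize(query))
    scores = {}
    for url, tf in tfs.items():
        doc_len = sum(tf.values())
        score = 0.0
        for term in query_terms:
            df = sum(1 for other in tfs.values() if other[term] > 0)
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            f = tf[term]
            score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * doc_len / avg_len))
        scores[url] = score
    return scores


# Hypothetical example data: two pages with identical bodies, told apart
# only by the tweets that link to them.
tweets = [
    "Great BM25 tutorial http://a.example/x",
    "nice cat pictures http://b.example/y",
]
docs = {
    "http://a.example/x": "ranking functions explained",
    "http://b.example/y": "ranking functions explained",
}
anchors = collect_tweet_anchors(tweets)
scores = bm25_with_anchors("bm25 tutorial", docs, anchors)
```

In this toy example the two page bodies are identical, so any difference in score for the query comes purely from the tweet text. A real system would more likely treat tweet text as a separate field inside the chosen toolkit and tune its weight on the ClueWeb test queries.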
Expected outcomes, applications and/or benefits
This project will lead to improved search models for web information retrieval.
Required student skills/experience
Good programming skills, plus knowledge of information retrieval and the basics of natural language processing (INX344).
Study level
Vacation research experience scholarship
Supervisors
QUT
Organisational unit
Science and Engineering Faculty
Research area
Keywords
twitter, search, engine, web, information, retrieval
Contact
Contact the supervisor for more information
Dr Laurianne Sitbon