QUT Course Structure
Search Engine Technology

Unit code: INN344
Contact hours: 3 per week
Credit points: 12

Search engines are becoming ubiquitous, not only for finding web pages but also as a key part of companies' infrastructure. Database systems only give access to structured data, which is just the tip of the iceberg compared with the vast amount of information held in unstructured files such as Word documents, reports and email messages. Industry is now realising the high value of this free-text information and deploying the means to use it. Processing this information requires natural language processing to extract meaningful relations and semantics, together with efficient indexing; combined, these make up search engine technology.

Today, search technology is a hot area of research and development with applications in data warehousing, e-commerce, digital libraries, bioinformatics, and web information systems in general.


Availability

Semester           Available
2013 Semester 2    Yes

Sample subject outline - Semester 2 2012

Note: Subject outlines often change before the semester begins. Below is a sample outline.

Rationale

In this unit you will learn advanced techniques and the foundations of search engine technology. The focus is on the topics and techniques that are most promising for implementing smart information access strategies. This unit builds on top of the core software development skills that you have learnt in earlier units to allow you to create sophisticated search tools.

Aims

This unit aims to equip you with a solid foundation in the broad range of topics required to develop search engine systems effectively. You will understand both traditional and modern concepts and algorithms underlying search engines and related techniques.

You will enhance your general problem-solving skills for complex systems, develop critical thinking, and build the ability to read and understand the latest developments in information retrieval research.

Objectives

On successful completion of this unit, you should be able to:

  1. Understand, write, and explain fundamental and advanced search engine algorithms;
  2. Understand and build upon modern natural language processing techniques to improve access to existing information repositories;
  3. Design search engine solutions for information infrastructures utilising or developing appropriate software libraries for smart information access interfaces;
  4. Demonstrate knowledge of the principles and techniques of evaluating systems' performance.

Content

  • Retrieval strategies and utilities: vector space models, probabilistic strategies, relevance feedback, passage retrieval, parsing.
  • Indexing and fundamental data structures in support of efficient and effective information retrieval in large collections.
  • Fundamental text processing techniques, string processing, natural language processing.
  • Interfaces for information access and retrieval: ranking, clustering, extracting, contextualisation.
  • Evaluation of information retrieval systems. The Cranfield methodology. The TREC, INEX, NTCIR, and CLEF evaluation forums.
  • Multilingual information retrieval, distributed information retrieval, parallel information retrieval.
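
To give a flavour of the first two content areas, the following is a minimal sketch (not unit material) of an inverted index combined with vector space ranking. The function names, the whitespace tokeniser, and the plain TF-IDF weighting are all assumptions chosen for illustration; production systems use more refined tokenisation and weighting schemes.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokeniser)."""
    return text.lower().split()


def build_index(docs):
    """Build an inverted index mapping term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index


def rank(query, docs, index):
    """Rank documents by TF-IDF cosine similarity to the query.

    The query vector length is the same for every document, so it is
    left out; this does not change the ordering of the results.
    """
    n = len(docs)
    idf = {term: math.log(n / len(postings)) for term, postings in index.items()}

    # Squared length of each document vector, accumulated term by term.
    norms = defaultdict(float)
    for term, postings in index.items():
        for doc_id, tf in postings.items():
            norms[doc_id] += (tf * idf[term]) ** 2

    # Dot product of query and document vectors, using the postings lists.
    scores = defaultdict(float)
    for term, qtf in Counter(tokenize(query)).items():
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += (qtf * idf[term]) * (tf * idf[term])

    results = [
        (doc_id, score / math.sqrt(norms[doc_id]))
        for doc_id, score in scores.items()
        if score > 0
    ]
    return sorted(results, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical three-document collection for demonstration only.
    sample = {
        "d1": "search engines index web pages",
        "d2": "database systems store structured data",
        "d3": "search engines rank documents",
    }
    for doc_id, score in rank("rank search", sample, build_index(sample)):
        print(doc_id, round(score, 3))
```

Only documents that share at least one discriminating term with the query are scored, which is the main reason an inverted index is used instead of comparing the query against every document.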

    Approaches to Teaching and Learning

    Three contact hours per week: generally a three-hour lecture that includes tutorial exercises and Q&A sessions, and occasionally a three-hour lab. Lectures will guide you through the unit material for each week, in sync with assignment project work. They present overview material and several worked examples, and you are expected to read the lecture notes in advance. Tutorials will consist of general Q&A sessions, help with particular aspects of the unit material, and working through project development. Tutorial questions help develop your understanding of the unit material and also prepare you for the assignments. Labs will focus on practice and will mainly support the assignments.

    Assessment

    The assessments are designed to give you hands-on experience with all aspects of search engine technology. You will be able to implement algorithms for document text processing and retrieval, objectively assess the performance of information retrieval systems, implement a full search engine strategy, and apply natural language processing and search algorithms within innovative solutions.

  • Assignment 1 will be marked by the unit teaching team and returned to you with critical comment within two weeks.
  • Assignment 2 will be marked by the unit teaching team and returned to you with critical comment within two weeks and before the final exam.
  • Tutors and the unit lecturing staff will be available in person at specified times, or via email, to answer questions about the unit or to provide additional feedback on assignments.
  • Feedback on the performance of all project systems will be given during the lecture in week 13.
  • A reading task on scientific papers will provide peer feedback as well as feedback from the lecturers through a range of interactions.

    Assessment name: Project (applied)
    Description: Text Processing
    Development of natural language processing and information extraction tools.
    Relates to objectives: 1, 2
    Weight: 25%
    Internal or external: External
    Group or individual: Individual
    Due date: Week 6

    Assessment name: Project (applied)
    Description: Search Engine Design
    The design and integration of search engine and semantic components within an innovative product.
    Relates to objectives: 3, 4
    Weight: 35%
    Internal or external: External
    Group or individual: Individual
    Due date: Week 12

    Assessment name: Examination (Theory)
    Description: Weekly Quiz
    Weekly written quiz with short-answer questions on the previous lecture and on reading and discussion tasks based on scientific publications.
    Relates to objectives: 1-4
    Weight: 40%
    Internal or external: Internal
    Group or individual: Individual
    Due date: Weekly

    Academic Honesty

    QUT is committed to maintaining high academic standards to protect the value of its qualifications. To assist you in assuring the academic integrity of your assessment, you are encouraged to make use of the support materials and services available to help you consider and check your assessment items. Important information about the university's approach to academic integrity of assessment is on your unit Blackboard site.

    A breach of academic integrity is regarded as Student Misconduct and can lead to the imposition of penalties.

    Resource materials

    Textbook(s):
    Search Engines: Information Retrieval in Practice. W. Bruce Croft, Donald Metzler, Trevor Strohman. 2010.

    Risk assessment statement

    There are no unusual health or safety risks associated with this unit.

    Disclaimer - Offer of some units is subject to viability, and information in these Unit Outlines is subject to change prior to commencement of semester.

    Last modified: 28-May-2012