Enterprise Data Mining and Data Analysis

Unit code: INB342
Contact hours: 3 per week
Credit points: 12

This unit will provide comprehensive theoretical coverage of various topics in data and web mining. In addition, there will be a significant practical component using hands-on tools to solve real-world problems. Specifically, we will consider techniques from machine learning, data mining, text mining, and information retrieval to extract useful knowledge from data for use in business intelligence, document databases, site management, personalization, and user profiling. The unit will first give a detailed overview of the mining process and techniques, and then concentrate on applications of these techniques to web, e-commerce, document databases and data from advanced applications.


Availability
Semester Available
2013 Semester 2 Yes

Sample subject outline - Semester 2 2013

Note: Subject outlines often change before the semester begins. Below is a sample outline.

Rationale

With so many organizations housing massive data sources, data mining has become an increasingly popular way to turn an organization's data into useful information and knowledge about its customers and business processes. Data mining has become a well-known area of information technology because of its direct applications in social networks, information retrieval, search engines, e-commerce, digital libraries, bioinformatics, web information systems and many other domains.

This unit will teach students the concepts of data and web mining, both of which aim to improve decision making by discovering and making use of knowledge which might be found in large databases and on the Web.

Aims

This unit will provide comprehensive theoretical coverage of various topics in data and web mining. In addition, there will be a significant practical component using hands-on tools to solve real-world problems. Specifically, we will consider techniques from machine learning, data mining, text mining, and information retrieval to extract useful knowledge from data for use in business intelligence, document databases, site management, personalization, and user profiling. The unit will first give an overview of the mining process and techniques, and then concentrate on applications of these techniques to the web, e-commerce and document databases.

Objectives

This unit is designed to help you develop skills in intelligent data querying and management, in support of decision support systems, using a widely used commercial tool.

On completion of this unit, you should be able to:

  1. Understand and analyse the effectiveness of data and Web mining methods and tools when applied to real-world problems;
  2. Plan and manage your mining projects effectively from the start and avoid pitfalls in data preparation, modelling, and results interpretation;
  3. Identify appropriate problems for data/Web mining and integrate data/Web mining solutions into the business and technical infrastructures of organizations; and
  4. Work collaboratively in small groups in order to maximise the efficiency of managerial decisions related to enterprise data mining projects.

Content

The following topics will be covered.

  • Data Mining and Knowledge Discovery
  • The KDD process and methodology
  • Data preparation for knowledge discovery
  • Classification and prediction
  • Clustering
  • Link Analysis
  • Web Mining
  • Text mining
  • Data preparation for web mining
  • User tracking and profiling
  • Web personalization and recommender systems
  • Advanced Applications

    Approaches to Teaching and Learning

    This subject will be delivered through the following means:
    Lectures (2 hours) which provide the theoretical basis of the subject;
    Practicals (1 hour) which allow you to apply theory to practical (industry data-driven) problems using available software tools and implementation exercises. This unit will mainly use SAS Enterprise Miner as its data mining software; however, you will also be exposed to other software such as Data Analyst with Microsoft server, the popular open-source tool WEKA, and Oracle Data Miner.

    The learning process will be focused on real-world scenarios. Emphasis will be placed on theoretical work, laboratory exercises and case studies. The exercises will be designed to reinforce key concepts and to assist in the completion of assessments. Problem-solving assessments will be drawn from typical industry applications and real-world data sources. You are also encouraged to use data from your field of interest. The case study and the projects will be structured according to student demand.

    Assessment

    Criterion-Referenced Assessment
    Assessment criteria will be made available to you at the introduction of each assignment. The assessments will be marked and returned to you within two weeks of submission. Lecturers and tutors are available during consultation hours to clarify the content of the assessments and to provide constructive feedback on assessments upon completion.

    A review lecture and discussion will be held in the last week. A review lecture will also be held in week 7 to revise the basic data mining concepts.

    Assessment name: Case Study
    Description: Case Study 1: Predictive and Descriptive Data Mining. It involves mining meaningful information from the underlying data by applying predictive, clustering and link mining techniques.
    Relates to objectives: All
    Weight: 15%
    Internal or external: Internal
    Group or individual: Group
    Due date: Week 7

    Assessment name: Quiz/Test
    Description: Mid-Semester Test covering the Basic Data Mining Concepts.
    Relates to objectives: 1, 2 and 3
    Weight: 40%
    Internal or external: Internal
    Group or individual: Individual
    Due date: Week 9

    Assessment name: Case Study
    Description: Case Study 2: Text and Web Data Mining. It includes mining meaningful information from the text and Web data.
    Relates to objectives: 1, 2 and 3
    Weight: 10%
    Internal or external: Internal
    Group or individual: Individual
    Due date: Week 11

    Assessment name: Project (applied)
    Description: A theoretical survey or practical implementation project to apply data mining for modern applications.
    Relates to objectives: All
    Weight: 35%
    Internal or external: Internal
    Group or individual: Group
    Due date: Week 14

    Academic Honesty

    QUT is committed to maintaining high academic standards to protect the value of its qualifications. To assist you in assuring the academic integrity of your assessment you are encouraged to make use of the support materials and services available to help you consider and check your assessment items. Important information about the university's approach to academic integrity of assessment is on your unit Blackboard site.

    A breach of academic integrity is regarded as Student Misconduct and can lead to the imposition of penalties.

    Resource materials

    Text Book(s):

    Author: J. Han and M. Kamber, Title: Data Mining: Concepts and Techniques, Morgan Kaufmann, 2006. This book is available as an e-book in the library.

    This book covers most of the material from the lectures in weeks 1 to 7. Sufficient materials will be provided to you via handouts or online links for the lectures in weeks 8 to 12.


    The following will also be used in addition to the textbook:

  • Lecture notes on Blackboard.
  • Various selected papers from the literature (provided via Blackboard).


    You are strongly encouraged to read recommended references and articles pertaining to this unit.

    Reference(s):

    Various selected papers from the literature.

    Author: P. Cabena, P. Hadjinian, R. Stadler, J. Verhees, and A. Zanasi, Discovering Data Mining: From Concept to Implementation, ISBN 0-13-743980-6, 1997
    Author: J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann, 2001
    Author: S. Chakrabarti, Mining the Web: Discovering Knowledge from Hypertext Data, Morgan Kaufmann, 2003
    Author: J. Mena, Data Mining Your Website, Digital Press, 1999

    No extraordinary charges or costs are associated with the requirements of this unit.

    Risk assessment statement

    There are no out-of-the-ordinary risks associated with this unit. It is your responsibility to familiarise yourself with the Health and Safety policies and procedures applicable within Faculty campus areas and laboratories.

    Disclaimer - Offer of some units is subject to viability, and information in these Unit Outlines is subject to change prior to commencement of semester.

    Last modified: 12-Jun-2012