Enterprise Data Mining

Unit code: INN342
Contact hours: 3 per week
Credit points: 12

This unit will provide comprehensive theoretical coverage of various topics in data and web mining. In addition, there will be a significant practical component using hands-on tools to solve real-world problems. Specifically, we will consider techniques from machine learning, data mining, text mining, and information retrieval to extract useful knowledge from data for use in business intelligence, document databases, site management, personalization, and user profiling. The unit will first give a detailed overview of the mining process and techniques, and then concentrate on applications of these techniques to the Web, e-commerce, document databases and data from advanced applications.


Availability
Semester          Available
2013 Semester 2   Yes

Sample subject outline - Semester 2 2013

Note: Subject outlines often change before the semester begins. Below is a sample outline.

Rationale

With so many organizations housing massive data sources, data mining has become an increasingly popular way to turn an organization's data into useful information and knowledge about its customers and business processes. Data mining has become a well-known area in information technology because of its direct applications in social networks, information retrieval, search engines, e-commerce, digital libraries, bioinformatics, web information systems and many other areas.

This unit will teach you the concepts of data and web mining, both of which aim to improve decision making by discovering and making use of knowledge which might be found in large databases and on the Web.

Aims

This unit will provide comprehensive theoretical coverage of various topics in data and web mining. In addition, there will be a significant practical component using hands-on tools to solve real-world problems. Specifically, we will consider techniques from machine learning, data mining, text mining, and information retrieval to extract useful knowledge from data for use in business intelligence, document databases, site management, personalization, and user profiling. The unit will first give a detailed overview of the mining process and techniques, and then concentrate on applications of these techniques to the Web, e-commerce, document databases and data from advanced applications.

Objectives

This unit is designed to help you develop skills in intelligent data querying and management for decision support systems, using a widely used commercial tool.

On completion of this unit, you should be able to:

  1. Understand and analyse the effectiveness of data and Web mining methods and tools when applied to real-world problems;
  2. Plan and manage your mining projects effectively from the start and avoid pitfalls in data preparation, modelling, and results interpretation;
  3. Identify appropriate problems for data/Web mining and integrate data/Web mining solutions into the business and technical infrastructures of organizations; and
  4. Work collaboratively in small groups in order to maximise efficiency of managerial decisions related to enterprise data mining projects.

Content

The following topics will be covered:
· Data Mining and Knowledge Discovery
· The KDD process and methodology
· Data preparation for knowledge discovery
· Classification and prediction
· Clustering
· Link Analysis

Web Mining

· Text mining
· Data preparation for web mining
· User tracking and profiling
· Web personalization and recommender systems
· Advanced topics and applications, such as privacy and user rights

Approaches to Teaching and Learning

This subject will be delivered through the following means:
Lectures (2 hours), which provide the theoretical basis of the subject;
Practicals (1 hour), which allow students to apply theory to practical, industry data-driven problems using available software tools and implementation exercises. This unit will mainly use SAS Enterprise Miner as its data mining software; however, students will also be exposed to other software such as Data Analyst with Microsoft server, the popular open-source software WEKA, and Oracle Data Miner.

The learning process will focus on real-world scenarios. Emphasis will be placed on theoretical work, laboratory exercises and case studies. The exercises will be designed to reinforce key concepts and to assist in the completion of assessments. Problem-handling assessments will be drawn from typical industry applications and real-world data sources. Students are also encouraged to use data from their field of interest. The case study and the projects will be structured according to students' demand.

Concurrent Teaching
This unit is being taught concurrently with an undergraduate offering of the same subject. University policy permits postgraduate and undergraduate students to attend the same lectures. Separate practical/discussion groups will be provided for postgraduate students where student numbers allow. As a postgraduate student you will be required to complete separate or additional tasks.

Assessment

Criterion-Referenced Assessment:
Assessment criteria will be made available to students at the introduction of each assignment.
A review lecture and discussion will be held in the last week. A review lecture will also be held in week 7 to revise the basic data mining concepts. The assessments will be marked and returned to you within two weeks of submission. Lecturers and demonstrators are available during consultation hours to clarify the content of the assessments and to provide constructive feedback on assessments upon completion.

Assessment name: Case Study
Description: Case Study 1: Predictive and Descriptive Data Mining.
It includes mining meaningful information from the underlying data by applying predictive, clustering and link mining techniques.
Relates to objectives: All.
Weight: 15%
Internal or external: Internal
Group or individual: Group
Due date: Week 7

Assessment name: Quiz/Test
Description: Mid-Semester Test covering the Basic Data Mining Concepts.
Relates to objectives: 1, 2 and 3.
Weight: 40%
Internal or external: Internal
Group or individual: Individual
Due date: Week 9

Assessment name: Case Study
Description: Case Study 2: Text and Web Data Mining.
It includes mining meaningful information from the text and Web data.
Relates to objectives: 1, 2 and 3.
Weight: 10%
Internal or external: Internal
Group or individual: Individual
Due date: Week 11

Assessment name: Project (applied)
Description: A theoretical survey or practical implementation project to apply data mining for modern applications.
Relates to objectives: All.
Weight: 35%
Internal or external: Internal
Group or individual: Group
Due date: Week 14

Academic Honesty

QUT is committed to maintaining high academic standards to protect the value of its qualifications. To assist you in assuring the academic integrity of your assessment you are encouraged to make use of the support materials and services available to help you consider and check your assessment items. Important information about the university's approach to academic integrity of assessment is on your unit Blackboard site.

A breach of academic integrity is regarded as Student Misconduct and can lead to the imposition of penalties.

Resource materials

Textbook(s):
Author: J. Han and M. Kamber
Title: Data Mining: Concepts and Techniques, Morgan Kaufmann, 2006

This book is available as an e-book in the library, and access to the e-book through the library is free. Some hard copies are also available to buy from the QUT bookshop. The book mainly contains the material covered in lectures from week 1 to week 7. Sufficient materials will be provided via handouts or online links for the lectures from week 8 to week 12.

The following will also be used in addition to the textbook:

Lecture notes on Blackboard.
Various selected papers from the literature (provided via Blackboard).


You are strongly encouraged to read recommended references and articles pertaining to this unit.

Reference(s):

Author: P. Cabena, P. Hadjinian, R. Stadler, J. Verhees, and A. Zanasi
Title: Discovering Data Mining: From Concept to Implementation, ISBN 0-13-743980-6, 1997

Author: J. Han and M. Kamber
Title: Data Mining: Concepts and Techniques, Morgan Kaufmann, 2001

Author: S. Chakrabarti
Title: Mining the Web: Discovering Knowledge from Hypertext Data, Morgan Kaufmann, 2003

Author: J. Mena
Title: Data Mining Your Website, Digital Press, 1999

No extraordinary charges or costs are associated with the requirements of this unit.

Risk assessment statement

There are no out-of-the-ordinary risks associated with this unit. It is your responsibility to familiarize yourself with the Health and Safety policies and procedures applicable within FIT campus areas and laboratories.

Disclaimer - Offer of some units is subject to viability, and information in these Unit Outlines is subject to change prior to commencement of semester.

Last modified: 23-May-2012