Enterprise Data Mining
Unit code: INN342
Contact hours: 3 per week
Credit points: 12
This unit will provide comprehensive theoretical coverage of various topics in data and web mining. In addition, there will be a significant practical component using hands-on tools to solve real-world problems. Specifically, we will consider techniques from machine learning, data mining, text mining, and information retrieval to extract useful knowledge from data for use in business intelligence, document databases, site management, personalization, and user profiling. This unit will first give a detailed overview of the mining process and techniques, and then concentrate on applications of these techniques to the Web, e-commerce, document databases and data from advanced applications.
Availability
| Semester | Available |
|---|---|
| 2013 Semester 2 | Yes |
Sample subject outline - Semester 2 2013
Note: Subject outlines often change before the semester begins. Below is a sample outline.
Rationale
With so many organizations housing massive data sources, data mining has become an increasingly popular way to turn an organization's data into useful information and knowledge about its customers and business processes. Data mining has become a well-known area in information technology because of its direct applications in social networks, information retrieval, search engines, e-commerce, digital libraries, bioinformatics, web information systems and many other areas.
This unit will teach you the concepts of data and web mining, both of which aim to improve decision making by discovering and making use of knowledge which might be found in large databases and on the Web.
Aims
This unit will provide comprehensive theoretical coverage of various topics in data and web mining. In addition, there will be a significant practical component using hands-on tools to solve real-world problems. Specifically, we will consider techniques from machine learning, data mining, text mining, and information retrieval to extract useful knowledge from data for use in business intelligence, document databases, site management, personalization, and user profiling. This unit will first give a detailed overview of the mining process and techniques, and then concentrate on applications of these techniques to the Web, e-commerce, document databases and data from advanced applications.
Objectives
This unit is designed to help you develop skills in intelligent data querying and management for decision support systems, using a widely used commercial tool.
On completion of this unit, you should be able to:
- Understand and analyse the effectiveness of data and Web mining methods and tools when applied to real-world problems;
- Plan and manage your mining projects effectively from the start and avoid pitfalls in data preparation, modelling, and results interpretation;
- Identify appropriate problems for data/Web mining and integrate data/Web mining solutions into the business and technical infrastructures of organizations; and
- Work collaboratively in small groups to maximise the efficiency of managerial decisions related to enterprise data mining projects.
Content
The following topics will be covered:
· Data Mining and Knowledge Discovery
· The KDD process and methodology
· Data preparation for knowledge discovery
· Classification and prediction
· Clustering
· Link Analysis
· Web Mining
· Text mining
· Data preparation for web mining
· User tracking and profiling
· Web personalization and recommender systems
· Advanced topics and applications, such as privacy and user rights
Approaches to Teaching and Learning
This subject will be delivered through the following means:
Lectures (2 hours) which provide the theoretical basis of the subject;
Practicals (1 hour) which allow students to apply theory to practical (industry data-driven) problems using available software tools and implementation exercises. This unit will mainly use SAS Enterprise Miner as its data mining software; however, students will also be exposed to other software such as Data Analyst with Microsoft server, the popular open-source package WEKA, and Oracle Data Miner.
The learning process will focus on real-world scenarios. Emphasis will be placed on theoretical work, laboratory exercises and case studies. The exercises will be designed to reinforce key concepts and to assist in the completion of assessments. Problem-handling assessments will be drawn from typical industry applications and real-world data sources. Students are also encouraged to use data from their field of interest. The case study and the projects will be structured according to student demand.
Concurrent Teaching
This unit is being taught concurrently with an undergraduate offering of the same subject. University policy permits postgraduate and undergraduate students to attend the same lectures. Separate practical/discussion groups will be provided for postgraduate students where student numbers allow. As a postgraduate student you will be required to complete separate or additional tasks.
Assessment
Criterion-Referenced Assessment:
Assessment criteria will be made available to students at the introduction of each assignment.
A review lecture and discussion will be held in the last week. A review lecture will also be held in week 7 to revise the basic data mining concepts. The assessments will be marked and returned to you within two weeks of submission. Lecturers and demonstrators are available during consultation hours to clarify the content of the assessments and to provide constructive feedback on assessments upon completion.
Assessment name:
Case Study
Description:
Case Study 1: Predictive and Descriptive Data Mining.
It involves mining meaningful information from the underlying data by applying predictive, clustering and link mining techniques.
Relates to objectives:
All.
Weight:
15%
Internal or external:
Internal
Group or individual:
Group
Due date:
Week 7
Assessment name:
Quiz/Test
Description:
Mid-Semester Test covering the Basic Data Mining Concepts.
Relates to objectives:
1, 2 and 3.
Weight:
40%
Internal or external:
Internal
Group or individual:
Individual
Due date:
Week 9
Assessment name:
Case Study
Description:
Case Study 2: Text and Web Data Mining.
It includes mining meaningful information from the text and Web data.
Relates to objectives:
1, 2 and 3.
Weight:
10%
Internal or external:
Internal
Group or individual:
Individual
Due date:
Week 11
Assessment name:
Project (applied)
Description:
A theoretical survey or practical implementation project applying data mining to modern applications.
Relates to objectives:
All.
Weight:
35%
Internal or external:
Internal
Group or individual:
Group
Due date:
Week 14
Academic Honesty
QUT is committed to maintaining high academic standards to protect the value of its qualifications. To assist you in assuring the academic integrity of your assessment you are encouraged to make use of the support materials and services available to help you consider and check your assessment items. Important information about the university's approach to academic integrity of assessment is on your unit Blackboard site.
A breach of academic integrity is regarded as Student Misconduct and can lead to the imposition of penalties.
Resource materials
Textbook(s):
Author: J. Han and M. Kamber, Title: Data Mining: Concepts and Techniques, Morgan Kaufmann, 2006
This book is available as an e-book in the library, and access to the e-book through the library is free. Some hard copies are also available for purchase at the QUT bookshop. This book mainly contains the material covered in lectures from week 1 to week 7. Sufficient materials will be provided to you via handouts or online links for the lectures from week 8 to week 12.
The following will also be used in addition to the textbook:
Lecture notes on Blackboard.
Various selected papers from the literature (provided via Blackboard).
You are strongly encouraged to read recommended references and articles pertaining to this unit.
Reference(s):
Author: P. Cabena, P. Hadjinian, R. Stadler, J. Verhees, and A. Zanasi,
Title: Discovering Data Mining: From Concept to Implementation, ISBN 0-13-743980-6, 1997
Author: J. Han and M. Kamber,
Title: Data Mining: Concepts and Techniques, Morgan Kaufmann, 2001
Author: S. Chakrabarti,
Title: Mining the Web: Discovering Knowledge from Hypertext Data, Morgan Kaufmann, 2003
Author: J. Mena,
Title: Data Mining Your Website, Digital Press, 1999
No extraordinary charges or costs are associated with the requirements of this unit.
Risk assessment statement
There are no out-of-the-ordinary risks associated with this unit. It is your responsibility to familiarize yourself with the Health and Safety policies and procedures applicable within FIT campus areas and laboratories.
Disclaimer - Offer of some units is subject to viability, and information in these Unit Outlines is subject to change prior to commencement of semester.
Last modified: 23-May-2012