Enterprise Data Mining and Data Analysis
Unit code: INB342
Contact hours: 3 per week
Credit points: 12
This unit will provide comprehensive theoretical coverage of various topics in data and web mining. In addition, there will be a significant practical component using hands-on tools to solve real-world problems. Specifically, we will consider techniques from machine learning, data mining, text mining, and information retrieval to extract useful knowledge from data for use in business intelligence, document databases, site management, personalization, and user profiling. This unit will first cover a detailed overview of the mining process and techniques, and then concentrate on applications of these techniques to the web, e-commerce, document databases and data from advanced applications.
Availability
| Semester | Available |
|---|---|
| 2013 Semester 2 | Yes |
Sample subject outline - Semester 2 2013
Note: Subject outlines often change before the semester begins. Below is a sample outline.
Rationale
With so many organizations housing massive data sources, data mining has become an increasingly popular way to turn an organization's data into useful information and knowledge about its customers and business processes. Data mining has become a well-known area in information technology because of its direct applications in social networks, information retrieval, search engines, e-commerce, digital libraries, bioinformatics, web information systems and many other areas.
This unit will teach students the concepts of data and web mining, both of which aim to improve decision making by discovering and making use of knowledge which might be found in large databases and on the Web.
Aims
This unit will provide comprehensive theoretical coverage of various topics in data and web mining. In addition, there will be a significant practical component using hands-on tools to solve real-world problems. Specifically, we will consider techniques from machine learning, data mining, text mining, and information retrieval to extract useful knowledge from data for use in business intelligence, document databases, site management, personalization, and user profiling. This unit will first cover an overview of the mining process and techniques, and then concentrate on applications of these techniques to the web, e-commerce and document databases.
Objectives
This unit is designed to allow you to develop skills in intelligent data querying and management for decision support systems, using a widely used commercial tool.
On completion of this unit, you should be able to:
- Understand and analyse the effectiveness of data and Web mining methods and tools when applied to real-world problems;
- Plan and manage your mining projects effectively from the start and avoid pitfalls in data preparation, modelling, and results interpretation;
- Identify appropriate problems for data/Web mining and integrate data/Web mining solutions into the business and technical infrastructures of organizations; and
- Work collaboratively in small groups in order to maximise the efficiency of managerial decisions related to enterprise data mining projects.
Content
The following topics will be covered.
Approaches to Teaching and Learning
This subject will be delivered through the following means:
- Lectures (2 hours), which provide the theoretical basis of the subject;
- Practicals (1 hour), which allow you to apply theory to practical (industry data-driven) problems using available software tools and implementation exercises. This unit will mainly use SAS Enterprise Miner as its data mining software; however, you will also be exposed to other software such as Data Analyst with Microsoft server, the popular open-source software WEKA, and Oracle Data Miner.
The learning process will be focused on real-world scenarios. Emphasis will be placed on theoretical work, laboratory exercises and case studies. The exercises will be designed to reinforce key concepts and to assist in the completion of assessments. Problem-handling assessments will be drawn from typical industry applications and real-world data sources. You are also encouraged to use data from your field of interest. The case study and the projects will be structured according to students' demand.
Assessment
Criterion-Referenced Assessment
Assessment criteria will be made available to you at the introduction of each assignment. The assessments will be marked and returned to you within two weeks of submission. Lecturers and tutors are available during consultation hours to clarify the content of the assessments and to provide constructive feedback on assessments upon completion.
A review lecture and discussion will be held in the last week. A review lecture will also be held in week 7 to revise the basic data mining concepts.
Assessment name: Case Study
Description: Case Study 1: Predictive and Descriptive Data Mining. It includes mining meaningful information from the underlying data after applying predictive, clustering and link mining techniques.
Relates to objectives: All
Weight: 15%
Internal or external: Internal
Group or individual: Group
Due date: Week 7
Assessment name: Quiz/Test
Description: Mid-Semester Test covering the Basic Data Mining Concepts.
Relates to objectives: 1, 2 and 3
Weight: 40%
Internal or external: Internal
Group or individual: Individual
Due date: Week 9
Assessment name: Case Study
Description: Case Study 2: Text and Web Data Mining. It includes mining meaningful information from the text and Web data.
Relates to objectives: 1, 2 and 3
Weight: 10%
Internal or external: Internal
Group or individual: Individual
Due date: Week 11
Assessment name: Project (applied)
Description: A theoretical survey or practical implementation project to apply data mining for modern applications.
Relates to objectives: All
Weight: 35%
Internal or external: Internal
Group or individual: Group
Due date: Week 14
Academic Honesty
QUT is committed to maintaining high academic standards to protect the value of its qualifications. To assist you in assuring the academic integrity of your assessment you are encouraged to make use of the support materials and services available to help you consider and check your assessment items. Important information about the university's approach to academic integrity of assessment is on your unit Blackboard site.
A breach of academic integrity is regarded as Student Misconduct and can lead to the imposition of penalties.
Resource materials
Text Book(s):
Author: J. Han and M. Kamber, Title: Data Mining: Concepts and Techniques, Morgan Kaufmann, 2006. This book is available as an e-book in the library.
This book mainly contains the material covered in lectures from week 1 to week 7. Sufficient materials will be provided to you via handouts or online links for the lectures from week 8 to week 12.
You are strongly encouraged to read recommended references and articles pertaining to this unit.
The following will also be used in addition to the text book:
Reference(s):
Various selected papers from the literature.
Author: P. Cabena, P. Hadjinian, R. Stadler, J. Verhees, and A. Zanasi, Discovering Data Mining: From Concept to Implementation, ISBN 0-13-743980-6, 1997
Author: J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann, 2001
Author: S. Chakrabarti, Mining the Web: Discovering Knowledge from Hypertext Data, Morgan Kaufmann, 2003
Author: J. Mena, Data Mining Your Website, Digital Press, 1999
No extraordinary charges or costs are associated with the requirements of this unit.
Risk assessment statement
There are no out of the ordinary risks associated with this unit. It is your responsibility to familiarise yourself with the Health and Safety policies and procedures applicable within Faculty campus areas and laboratories.
Disclaimer - Offer of some units is subject to viability, and information in these Unit Outlines is subject to change prior to commencement of semester.
Last modified: 12-Jun-2012