Task
Part 1 – Online Quiz
There will be an online quiz during Week 11 (22nd Jan – 28th Jan). Each student must attempt the quiz individually on the Interact site for ITC516. The quiz is worth 5 of the marks available for Assessment 3.
You must start and finish the online quiz within the specified date and time window.
You will have one hour to complete the quiz, which consists of 30 multiple choice questions. Once you start the quiz, you must complete it in one sitting. You have only one attempt at the quiz.
The quiz covers the following topics:
Introduction to Data Mining
Data Interpretation
Knowledge Representation
Overview of Basic Algorithms and Credibility
Decision Trees
Classification Rules
Association Rules
Linear Models
Instance-Based Learning and Clustering
Part 2 - Practical and Report
There are two steps to complete in this task:
Step 1: You are required to perform a data mining task to evaluate different classification algorithms. Load the soybean.arff data set into Weka and compare the performance on this data set for three classification algorithms:
Decision Tree
Naive Bayes
k-Nearest Neighbour
Step 2: Using the outputs from Step 1, write a report that shows the performance of the different algorithms and comments on their accuracy using the confusion matrix and other performance metrics reported by Weka. In your report, consider:
Is there a difference in performance between the algorithms?
Which algorithm performs best?
Your report should include the screenshots, tables, graphs, etc. needed to make it understandable to the reader.
This task is worth 15 of the marks available for Assessment 3.
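As an illustration of the metrics mentioned in Step 2, the short Python sketch below shows how accuracy, precision, recall and F-measure can be derived by hand from a confusion matrix. The matrix values and class labels are hypothetical and are not soybean results; the sketch assumes rows are actual classes and columns are predicted classes, which is how Weka lays out the confusion matrix in its classifier output. It is meant only as an aid to interpreting Weka's figures, not as a replacement for running the classifiers in Weka.

```python
# Hypothetical 3-class confusion matrix (illustrative values only,
# NOT taken from the soybean data set).
# Assumed layout: rows = actual class, columns = predicted class.
labels = ["a", "b", "c"]
matrix = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]


def accuracy(m):
    """Fraction of all instances that lie on the diagonal."""
    correct = sum(m[i][i] for i in range(len(m)))
    total = sum(sum(row) for row in m)
    return correct / total


def precision(m, k):
    """Of the instances predicted as class k, the fraction that are class k."""
    predicted_k = sum(row[k] for row in m)
    return m[k][k] / predicted_k if predicted_k else 0.0


def recall(m, k):
    """Of the instances that are actually class k, the fraction predicted as k."""
    actual_k = sum(m[k])
    return m[k][k] / actual_k if actual_k else 0.0


def f_measure(m, k):
    """Harmonic mean of precision and recall for class k."""
    p, r = precision(m, k), recall(m, k)
    return 2 * p * r / (p + r) if (p + r) else 0.0


if __name__ == "__main__":
    print(f"accuracy: {accuracy(matrix):.3f}")
    for k, name in enumerate(labels):
        print(
            f"class {name}: precision={precision(matrix, k):.3f} "
            f"recall={recall(matrix, k):.3f} "
            f"f-measure={f_measure(matrix, k):.3f}"
        )
```

Computing the same figures for each of the three algorithms gives a like-for-like basis for the comparison asked for in the report.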
Rationale
These tasks assess your progress towards being able to:
identify and analyse business requirements for the identification of patterns and trends in data sets;
appraise the different approaches and categories of data mining problems;
compare and evaluate output patterns;
explore and critically analyse data sets and evaluate their data quality, integrity and security requirements;
compare and evaluate appropriate techniques for detecting and evaluating patterns in a given data set;
explain the importance of current and future trends likely to affect data mining and visualisation.
Marking criteria
The grade you receive for this assessment as a whole is determined by the cumulative marks gained for each question. The tasks in this assessment involve a sequence of several steps, so you will be marked on the correctness of your answers as well as on the clear and neat presentation of your diagrams, where required.
Part 1 - Online Quiz
This part is a series of multiple choice questions. Each correct answer will score 1 mark. Marks will not be deducted for incorrect answers.
Most quizzes will involve multiple choice or true/false questions, although quizzes may include other content. Marks are given based on the correctness of your answers. The Test Centre marks the quiz automatically, and you will be graded according to the following criteria:
HD - At least 85% of answers were correct
DI - At least 75% of answers were correct
CR - At least 65% of answers were correct
PS - At least 50% of answers were correct
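The grade bands above can be read as a simple threshold lookup. The Python sketch below is only an illustration of that reading: the function name `quiz_grade` is hypothetical, and returning FL for anything below 50% is an assumption based on the FL column in the Part 2 rubric, since the quiz criteria do not state it explicitly.

```python
# Thresholds taken from the quiz criteria: minimum percentage of
# correct answers needed for each grade band, highest band first.
GRADE_BANDS = (("HD", 85), ("DI", 75), ("CR", 65), ("PS", 50))


def quiz_grade(correct, total=30):
    """Map a count of correct answers to a grade band.

    `total` defaults to 30, the number of questions in the quiz.
    Returning "FL" below 50% is an assumption, not stated in the criteria.
    """
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total, and total > 0")
    for band, minimum in GRADE_BANDS:
        # Integer comparison avoids floating-point rounding at the boundaries.
        if correct * 100 >= minimum * total:
            return band
    return "FL"


if __name__ == "__main__":
    for score in (26, 23, 20, 15, 14):
        print(score, quiz_grade(score))
```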
Part 2 - Practical and Report
Criteria | HD | DI | CR | PS | FL
Demonstrate an ability to analyse, reason and discuss the concepts learned in the subject and be able to use classification methods on datasets. | The student has thoroughly understood the classification methods, providing a detailed description of the methods and its output on the given data set. The discussion involving the validation and accuracy of the model demonstrates thorough understanding of the classification methods as applied to the given data set. | The student has understood the classification methods, providing a detailed description of the methods and its output on the given data set. The discussion involving the validation and accuracy of the model demonstrates detailed understanding of the classification methods as applied to the given data set. | The student has understood the classification methods, providing a description of the methods and its output on the given data set. The discussion involving the validation and accuracy of the model demonstrates understanding of the classification methods as applied to the given data set. | The student has understood the classification methods, providing a description of the methods and its output on the given data set. The discussion involving the validation and accuracy of the model shows basic understanding of the classification methods as applied to the given data set. | The student has not fully understood the classification methods, providing a description of the methods and its output on the given data set. The discussion is not fully involving the validation and accuracy of the model shows basic understanding of the classification methods as applied to the given data set.
Presentation
We recommend that you write your answers in a Word document and submit it via EASTS. You can also submit your document in PDF format.
Your answers to the questions should be precise but complete and informative.
Each question should be answered individually with the corresponding label to indicate the tasks completed e.g. Task 1 a.
The Task 2 report should be precise but complete and informative, with a word count of 1500-1800 words.