Weka Tutorials and Assignments


·         Tutorial Links 

·         Tutorial Overview   

·         Assignment Overview 

·         Running the Tutorials 

·         Datasets 

·         Weka Developers 


Tutorial Links

 

·         Start-Up 

 

·         Data Mining Concepts 

 

·         Decision Tree Classification 

 

·         Clustering 

 

·         Naïve Bayes Classifier 

 

·         Association 

 

·         Neural Networks 

 

·         Knowledge Flow Environment 

 


Tutorial Overview

 

Weka is a comprehensive tool bench for machine learning and data mining.  It is free, open-source software developed as part of the Weka Machine Learning Project at the University of Waikato in New Zealand, and it can be downloaded, along with extensive documentation, from the project’s web site.

 

Learning about data mining can feel like drinking water from a fire hydrant.  Learning how to use data mining software at the same time can feel like drowning in a sea of information.  These Weka tutorials offer a path parallel to the documentation provided by the Weka developers, one that lets new Weka users get up and running as quickly as possible.  Think of these tutorials as water fountains of information on data mining: their purpose is to introduce, and then gradually expand, skills in using data mining tools in general and the Weka data mining tool bench in particular.

 

 

These tutorials do not replace the documentation provided by the Weka developers.  They are intended to encourage the study of data mining by providing step-by-step solutions to real-world data mining problems that start simple and grow increasingly complex.

 


Assignment Overview

 

The tutorial links above provide access to .zip files, each containing a tutorial, its associated audio files, and the assignment for that tutorial.  Students taking these tutorials as part of a course will prepare each assignment as a PowerPoint presentation and submit it to the instructor upon completion.  Each assignment includes detailed instructions on what to include in the submitted PowerPoint document.  A solution will be provided to the student once the assignment has been submitted.  If you are not taking these tutorials as part of a course, you can e-mail the instructor for solutions.

 


Running the Tutorials

 

Each tutorial and its associated assignment are packaged in a separate .zip file.  To run a tutorial, click its link above and save the .zip file to your computer.  It is recommended that you unzip each tutorial into its own folder.  Unzipping saves the tutorial and assignment .ppt files to your computer, along with the audio narration for the tutorial.  You can then run the slide show with audio, print out the slides and speaker notes, or both.  Click the speaker icon on each slide to hear the audio narration.  The .ppt presentations are password-protected against editing, but can be viewed as “Read Only” presentations.

 


Datasets

 

These tutorials and assignments use a number of real-world datasets. These datasets are available at:  www.technologyforge.net/Datasets

 


Weka Developers

 

Weka's core developers are Eibe Frank, Mark Hall, and Len Trigg.  Many others have made significant contributions, in particular, Remco Bouckaert, Richard Kirkby, Ashraf Kibriya, Peter Reutemann, Xin Xu, and Malcolm Ware.  The primary reference for Weka is the book:

Ian H. Witten and Eibe Frank (2005). "Data Mining: Practical Machine Learning Tools and Techniques", 2nd Edition, Morgan Kaufmann, San Francisco. ISBN 0-12-088407-0.


Copyright 2010, Mark Polczynski, All Rights Reserved
