Weka Tutorials and Assignments
Datasets · Start-Up · Decision Tree Classification

Weka is a comprehensive tool bench for machine learning and data mining. It is free, open-source software developed as part of the Weka Machine Learning Project at the University of Waikato in New Zealand, and can be downloaded, along with extensive documentation, from the project's web site.

Learning about data mining can feel like drinking water from a fire hydrant. Learning how to use data mining software at the same time can feel like drowning in a sea of information. These Weka tutorials offer a path parallel to the documentation provided by the Weka developers, one that lets new Weka users get up and running as fast as possible. Think of these tutorials as water fountains of information on data mining: their purpose is to introduce, and then gradually expand, skills in using data mining tools in general and the Weka data mining tool bench in particular.

These tutorials do not replace the documentation provided by the Weka developers. They are intended to encourage the pursuit of data mining by providing step-by-step solutions to real-world data mining problems that start simple and become increasingly complex.

The tutorial links above provide access to .zip files containing the tutorials, their associated audio files, and the assignment for each tutorial. For students taking these tutorials as part of a course, assignments are to be prepared as PowerPoint presentations and submitted to the instructor upon completion. Each assignment has detailed instructions on what to include in the PowerPoint document. A solution will be provided to the student once the assignment has been submitted. If you are not taking these tutorials as part of a course, you can e-mail the instructor for solutions.

Each tutorial and its associated assignment are packaged in a separate .zip file. To run a tutorial, click on the link above and save the .zip file to your computer. It is recommended that you unzip each tutorial into its own folder. This saves the tutorial and assignment .ppt files to your computer, along with the audio narration for the tutorial. You can then run the slide show with audio, and/or print out the slides and speaker notes. Just click on the speaker icon on each slide to hear the audio narration. The .ppt presentations are password-protected against editing, but can be viewed as "Read Only" presentations.
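
If you have downloaded several tutorials, the unzip step can be scripted. The following is only a minimal sketch in Python, not part of the tutorials themselves: the function name and the folder names in the usage note are hypothetical, and it assumes all downloaded .zip files sit together in one folder. Each archive is extracted into a folder named after the archive.

```python
import zipfile
from pathlib import Path


def unzip_each_into_own_folder(download_dir, target_dir):
    """Extract every .zip in download_dir into target_dir/<archive name>/.

    Hypothetical helper for illustration; returns the folders it created.
    """
    download_dir = Path(download_dir)
    target_dir = Path(target_dir)
    extracted = []
    for archive in sorted(download_dir.glob("*.zip")):
        # One folder per tutorial, named after the .zip file (without ".zip").
        destination = target_dir / archive.stem
        destination.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(destination)
        extracted.append(destination)
    return extracted
```

For example, calling unzip_each_into_own_folder("Downloads", "WekaTutorials") would place the contents of each archive found in the (assumed) Downloads folder into its own subfolder of WekaTutorials.
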
These tutorials and assignments use a number of real-world datasets. These datasets are available at: www.technologyforge.net/Datasets

Weka's core developers are Eibe Frank, Mark Hall, and Len Trigg. Many others have made significant contributions, in particular Remco Bouckaert, Richard Kirkby, Ashraf Kibriya, Peter Reutemann, Xin Xu, and Malcolm Ware.

The primary reference for Weka is the book:
Ian H. Witten and Eibe Frank (2005), "Data Mining: Practical Machine Learning Tools and Techniques", 2nd Edition, Morgan Kaufmann, San Francisco, ISBN 0-12-088407-0.

Copyright 2010, Mark Polczynski, All Rights Reserved