This website provides access to a variety of datasets suitable for data mining and machine learning experimentation.  Each link in the table below points to a .zip archive that contains .xls, .csv, and .arff versions of the dataset.  Each archive also includes a .pdf description of the dataset and, where available, .pdf reference articles.


Each .arff file was created from the associated .csv file, which was in turn created from the associated .xls file.  Each .arff dataset has been loaded into Weka Explorer to ensure that the file is in a usable format.
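
For readers who want to repeat the .csv to .arff step on their own data, the following is a minimal sketch in Python. It is not the procedure used to build the files on this site, which is not documented here. The function name csv_to_arff and its helpers are hypothetical, and the sketch assumes that the first CSV row holds the attribute names, that a column whose non-empty values all parse as numbers is numeric, that every other column is nominal, and that empty cells are missing values.

```python
import csv
import io


def _is_number(text):
    """Return True if the text parses as a floating-point number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def _quote(value):
    """Quote a name or label if it contains characters ARFF treats specially."""
    if any(ch in value for ch in " ,'\"{}%\t"):
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"
    return value


def csv_to_arff(csv_text, relation):
    """Hypothetical helper: convert CSV text (header row first) to ARFF text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, records = rows[0], rows[1:]
    lines = ["@relation " + _quote(relation), ""]
    for index, name in enumerate(header):
        # Ignore empty cells when deciding the attribute type.
        column = [record[index] for record in records if record[index] != ""]
        if column and all(_is_number(value) for value in column):
            kind = "numeric"
        else:
            # Assumption: any non-numeric column is nominal.
            labels = sorted(set(column))
            kind = "{" + ",".join(_quote(label) for label in labels) + "}"
        lines.append("@attribute " + _quote(name) + " " + kind)
    lines += ["", "@data"]
    for record in records:
        # ARFF writes a missing value as a question mark.
        lines.append(",".join(_quote(v) if v != "" else "?" for v in record))
    return "\n".join(lines) + "\n"


# Made-up sample rows, used only to illustrate the call.
sample = "outlook,temperature,play\nsunny,85,no\novercast,83,yes\n"
print(csv_to_arff(sample, "weather"))
```

Weka also provides its own CSV loader, which is likely the safer choice for real work. The sketch is meant only to show how the header and data sections of an .arff file relate to the columns of a .csv file.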


The datasets have been collected from a number of publicly accessible sources, primarily the UC Irvine Machine Learning Repository.




CPU Performance

Diabetes Diagnosis

Edible Mushrooms

Fisher's Iris Dataset

Fractionation Column

Gamma Ray Bursts

Landform Identification

Sensor Discrimination

To Play or Not To Play

To Play or Not To Play Numeric

Voting Record Yay Nay

Wine Cultivars

Wine Quality


For problems with this website, contact the webmaster.