Dataset Generator (datgen)

Perfect data for an imperfect world

This site hosts a computer program that produces data. The program is intended to support the empirical analysis of other programs, particularly those that consume data. For example, it can produce data to test sorting programs, although it was originally written to test data mining classification programs. The table below presents a sample of a generated dataset:
 

#  A1  A2  A3  A4 Class
1 4.1 3 0 C1
2.8  n/a C2
... 
9,999,999  7.3  11  C1
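
To illustrate the idea of generating data in order to test a data-consuming program, here is a minimal Python sketch. It is not datgen's algorithm and does not use datgen's parameters; the function name `generate_rows` and its arguments (`n_attrs`, `missing_rate`, `seed`) are hypothetical and chosen only for this example. It produces rows with numeric attributes, occasional missing values, and a class label derived from a known rule, then uses them to check a sort.

```python
import random

def generate_rows(n_rows, n_attrs=4, missing_rate=0.1, seed=0):
    """Return n_rows synthetic records as (row_id, attribute_list, class_label).

    Hypothetical helper for illustration only; not part of datgen.
    """
    rng = random.Random(seed)  # fixed seed makes the dataset reproducible
    rows = []
    for row_id in range(1, n_rows + 1):
        values = [round(rng.uniform(0, 12), 1) for _ in range(n_attrs)]
        # The label depends only on the first attribute, so the "true"
        # concept is known before any value is hidden.
        label = "C1" if values[0] >= 6.0 else "C2"
        # Replace some cells with None to mimic missing values.
        attrs = [None if rng.random() < missing_rate else v for v in values]
        rows.append((row_id, attrs, label))
    return rows

rows = generate_rows(1000)

# Exercise a program under test, here Python's built-in sort, on the
# first attribute of every row where that attribute is present.
keys = [attrs[0] for _, attrs, _ in rows if attrs[0] is not None]
ordered = sorted(keys)
assert all(a <= b for a, b in zip(ordered, ordered[1:]))
```

Because the rule that assigns the class is known in advance, a classifier trained on such data can be scored against the rule itself rather than against noisy real-world labels.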

There are two ways to use datgen. The simplest is to describe and create your dataset with the interactive Web forms below. Alternatively, you can download the program and run it on your own computer, which requires learning its input parameters.

An overview of data generation
Download datgen source code (v3.1 1999/12/14)
An overview of datgen parameters
Use your Web browser to interactively create data with datgen!


Frequently Asked Questions
Things to do and ongoing questions (volunteers welcome)
References
Citing datgen
 
Updated 2012/03/07. Comments to Gabor Melli.