MLC++ is a standard C++ library for supervised machine learning, with back-end and front-end tools for data mining tasks such as decision trees and clustering. Information on legal issues, mailing lists, history, standards, platform support, and downloa...
Written in Python, the toolbox handles caching of database queries and parallelism within a collection of independent queries. Our toolbox provides a number of routines for basic data mining tasks on top of which the user can add more functions - main...
A freely available software toolkit for finding frequent patterns in diverse datasets. It contains highly efficient algorithms for finding patterns in transactional, sequential, and graph datasets.
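To illustrate what "finding frequent patterns" means for transactional data, here is a minimal sketch in Python. It is not the toolkit's algorithm or API: it is a plain brute-force count of itemsets, and the function name `frequent_itemsets`, the `min_support` and `max_size` parameters, and the sample transactions are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset: count} for itemsets found in at least
    min_support transactions (brute-force enumeration, illustration only)."""
    counts = Counter()
    for transaction in transactions:
        # Deduplicate and sort so each itemset has one canonical tuple form.
        items = sorted(set(transaction))
        for size in range(1, min(max_size, len(items)) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical sample data: each inner list is one transaction.
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
]

result = frequent_itemsets(transactions, min_support=2)
for itemset, count in sorted(result.items()):
    print(itemset, count)
```

Enumerating every subset grows exponentially with transaction length, which is why the sketch caps itemset size with `max_size`; dedicated pattern-mining tools prune the search space instead of counting everything.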
Uses the Minimum Message Length (MML) principle to do mixture modeling. Mixture modeling concerns modeling a statistical distribution by a mixture of other distributions, and is also known as unsupervised concept learning in Artificial Intelligence. L...
A freely available software toolkit for clustering low- and high-dimensional data sets. It is well-suited for clustering data sets arising in many areas including information retrieval, customer purchasing transactions, science, and biology.
By David Chickering at Microsoft Research. The WinMine Toolkit is a set of tools for Windows 2000/NT/XP that allow you to build statistical models from data. The majority of the tools are command-line executables that can be run in scripts.
GPL C/C++ software for data analysis of discrete data using principal/independent component methods. Examples are DPCA, LDA, GaP (like PLSI and NMF). Targeted at text, with MPI and multithreading.