A software package for discovering motifs (highly conserved regions) in groups of related DNA or protein sequences and for searching sequence databases using those motifs. [Commercial]
Software toolkit for building and using motif-based hidden Markov models of DNA and proteins. There is an online interactive version. Source written in C. [GPL]
An integrated collection of Java code useful for statistical natural language processing, document classification, clustering, information extraction, and other machine learning applications to text. [GPL]
A library of C code useful for writing statistical text analysis, language modeling, and information retrieval programs. The current distribution includes the library, as well as front-ends for document classification (rainbow), document retrieval (ar...
A program implementing several memory-based learning techniques. These learners store a representation of the training set explicitly and classify new cases by extrapolation from the most similar stored cases. [AFL]
An object-oriented environment for machine learning in Matlab. Algorithms can be plugged together and compared using, for example, model selection, statistical tests, and visual plots. Algorithms may be downloaded separately. [GPL]
Software which allows one to navigate (fly) through the data tree, zoom in on interesting nodes, click on bars to get counts, and mark interesting places in the tree. Includes datasets for automobiles, voting, produce, and medical research. Uses LEDA,...