ROSETTA is a toolkit for analyzing tabular data within the framework of rough set theory. It is designed to support the overall data mining and knowledge discovery process: from initial browsing and preprocessing of the data, via computation of minimal attribute sets and generation of if-then rules or descriptive patterns, to validation and analysis of the induced rules or patterns.
ROSETTA is intended as a general-purpose tool for discernibility-based modelling, and is not geared specifically towards any particular application domain.
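To make "discernibility-based modelling" and "minimal attribute sets" concrete, the following is a minimal Python sketch, not ROSETTA's own code or API. It assumes a tiny made-up decision table, collects for every pair of objects with different decisions the attributes that tell them apart, and then finds the minimal attribute subsets (reducts) that discern all such pairs. The table, the attribute names and the brute-force search are illustrative assumptions. The search is exponential in the number of attributes, so it only suits toy examples.

```python
from itertools import combinations

# Hypothetical toy decision table: condition attributes a, b, c and decision d.
TABLE = [
    {"a": 1, "b": 0, "c": 0, "d": "yes"},
    {"a": 1, "b": 1, "c": 0, "d": "no"},
    {"a": 0, "b": 0, "c": 1, "d": "no"},
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
]
CONDITIONS = ["a", "b", "c"]
DECISION = "d"


def discernibility_entries(table, conditions, decision):
    """For each pair of objects with different decisions, return the set of
    condition attributes on which the two objects differ."""
    entries = []
    for x, y in combinations(table, 2):
        if x[decision] != y[decision]:
            diff = frozenset(a for a in conditions if x[a] != y[a])
            if diff:  # an empty set is an inconsistent pair that no attribute can discern
                entries.append(diff)
    return entries


def reducts(table, conditions, decision):
    """Brute-force search for all minimal attribute subsets that intersect
    every discernibility entry (decision-relative reducts)."""
    entries = discernibility_entries(table, conditions, decision)
    found = []
    # Enumerate subsets by increasing size so that minimal ones are found first.
    for size in range(len(conditions) + 1):
        for subset in combinations(conditions, size):
            candidate = frozenset(subset)
            if any(r <= candidate for r in found):
                continue  # contains a smaller reduct, so it is not minimal
            if all(candidate & e for e in entries):
                found.append(candidate)
    return found


if __name__ == "__main__":
    for r in reducts(TABLE, CONDITIONS, DECISION):
        print(sorted(r))
```

For this table the only reduct is {a, b}: attribute c can be dropped without losing the ability to tell apart objects that have different decisions.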
ROSETTA offers a highly intuitive GUI environment that emphasizes data navigation. The GUI is highly object-oriented in that all manipulable objects are represented as individual GUI items, each with its own set of context-sensitive menus.
The computational kernel is also available as a command-line program that can be invoked from scripts written in, e.g., Perl or Python.
Some features currently offered by the computational kernel include:
- Partial integration with DBMSs via ODBC.
- Exporting of rules, reducts, tables, graphs and other objects to various formats, including XML, C++ and Prolog.
- Completion of decision tables with missing values.
- Discretization of numerical attributes.
- Support for both unsupervised and supervised learning.
- Support for user-defined notions of discernibility.
- Efficient computation of exact or approximate reducts, for various types of discernibility.
- Generation of if-then rules or descriptive patterns via reducts.
- Execution of script files.
- Support for cross-validation.
- Advanced filtering of sets of reducts and rules.
- Validation and analysis.
- Application of synthesized rules to unseen examples.
- Generation of confusion matrices, ROC curves and calibration curves.
- Evaluation of individual rules according to advanced measures of quality.
- Utilities for statistical hypothesis testing.
- Clustering via tolerance relations.
- Computation of partitions and variable precision rough set approximations.
- Support for random sampling of observations.
- Open source code.
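As an illustration of the partition and variable precision approximation features listed above, here is a minimal Python sketch. It is not ROSETTA's implementation. It assumes one common formulation of the variable precision model with a precision parameter `beta` in [0, 0.5): an equivalence class joins the lower approximation when at least a fraction 1 - beta of its objects lie in the target set, and joins the upper approximation when more than a fraction beta do. Setting `beta = 0` gives the classical rough set approximations. The data and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy table; each row is one object described by attributes a and b.
ROWS = [
    {"a": 0, "b": 0, "d": "yes"},
    {"a": 0, "b": 0, "d": "yes"},
    {"a": 0, "b": 0, "d": "yes"},
    {"a": 0, "b": 0, "d": "no"},
    {"a": 1, "b": 0, "d": "no"},
    {"a": 1, "b": 1, "d": "yes"},
]


def partition(rows, attributes):
    """Group object indices into equivalence classes: objects with identical
    values on the given attributes are indiscernible."""
    classes = defaultdict(list)
    for idx, row in enumerate(rows):
        classes[tuple(row[a] for a in attributes)].append(idx)
    return list(classes.values())


def approximations(rows, attributes, target, beta=0.0):
    """Return (lower, upper) approximations of the index set `target`
    under the threshold rule described in the text above."""
    lower, upper = set(), set()
    for cls in partition(rows, attributes):
        inside = sum(1 for i in cls if i in target) / len(cls)
        if inside >= 1.0 - beta:
            lower.update(cls)
        if inside > beta:
            upper.update(cls)
    return lower, upper


if __name__ == "__main__":
    target = {i for i, row in enumerate(ROWS) if row["d"] == "yes"}
    print(approximations(ROWS, ["a", "b"], target, beta=0.0))
    print(approximations(ROWS, ["a", "b"], target, beta=0.25))
```

With `beta = 0.0` the class holding the first four objects is excluded from the lower approximation because one of its members has decision "no". With `beta = 0.25` that single exception is tolerated and the whole class is included.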
The computational kernel and GUI front-end were designed and implemented at the Knowledge Systems Group, Dept. of Computer and Information Science, Norwegian University of Science and Technology, Trondheim, Norway. Sections of the computational kernel (RSES) were developed at the Group of Logic, Inst. of Mathematics, University of Warsaw, Poland.