Engineers at SciFi Ltd. (Woburn Sand, England) have developed RuleMiner, a tool that automates the labor-intensive and time-consuming process of extracting meaningful information from massive technical databases. RuleMiner finds fuzzy rules implicit within existing databases and codifies them for use by another program that interfaces with real-time control systems.
Essentially, the system discovers the relationships inherent within databases using neural-network learning methods. These relationships are expressed as fuzzy rules, written as structured English sentences that refer to the variables in the database. The resulting rule set is human-readable and can be used to predict or classify new data.
Using RuleMiner with tabular data in Microsoft "mdb" format, the user specifies whether each field of a database is used as an input. To learn about the relationships within a database, the system first creates a fuzzy decision tree and then tries out fuzzy rules. By reserving a part of the database for testing, it is possible to learn which rules work best using a trial-and-error process on the main database. A user-supplied performance scoring method then allows the generated rule set to be measured against the test data set.
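The hold-out procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not RuleMiner's actual interface: the function names (split_records, accuracy, score_rule_set), the 25 percent test fraction, and the use of plain accuracy as the user-supplied scoring method are all hypothetical.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Record = Dict[str, object]


def split_records(records: Sequence[Record], test_fraction: float = 0.25,
                  seed: int = 0) -> Tuple[List[Record], List[Record]]:
    """Shuffle the records and reserve a fraction of them as the test set."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (main set, test set)


def accuracy(predicted: Sequence[object], actual: Sequence[object]) -> float:
    """One possible scoring method: the fraction of exact matches."""
    if not actual:
        return 0.0
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)


def score_rule_set(predict: Callable[[Record], object],
                   test_set: Sequence[Record], target: str,
                   scorer: Callable[[Sequence[object], Sequence[object]], float]) -> float:
    """Measure a rule set (represented here by a predict function) on the test set."""
    predicted = [predict(r) for r in test_set]
    actual = [r[target] for r in test_set]
    return scorer(predicted, actual)
```

Passing the scorer in as a function mirrors the article's point that the performance measure is chosen by the user, so a numerical target could substitute a mean-error scorer without touching the rest of the code.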
The system works with textual fields and numerical fields, aligning its predictions with whatever type of field the user supplies. For instance, if an integer field takes on only a few values (as in basic digital techniques, where "1" means on and "0" means off), it is treated as categorical data, like a textual field. Real-number fields are predicted as real numbers, and the descriptive labels used in textual fields are likewise used in predictions for textual fields.
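A minimal Python sketch of this kind of field typing follows. The article does not say how many distinct values separate a categorical field from a numerical one, so the max_categories threshold of 10 and the function name classify_field are assumptions made purely for illustration.

```python
from typing import Iterable


def classify_field(values: Iterable[object], max_categories: int = 10) -> str:
    """Label a field 'categorical' or 'numerical' from the values it takes on."""
    distinct = {v for v in values if v is not None}
    # Textual fields are always treated as categories.
    if any(isinstance(v, str) for v in distinct):
        return "categorical"
    # A numeric field with only a few distinct values (such as 0 and 1)
    # is treated as categorical; the threshold is an assumed parameter.
    return "categorical" if len(distinct) <= max_categories else "numerical"
```

For example, a column holding only 0 and 1 is labeled categorical, while a column of 100 distinct sensor readings is labeled numerical.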
Technically, RuleMiner is an ActiveX control that performs its data analysis and rule extraction from Microsoft Jet databases, such as those generated by the Access program. The user chooses a single field to be the "predictive" field; the rest of the fields serve as inputs.
The system determines whether fields represent numerical items or categories by counting the number of different values they take on in the database. Fields judged to be numerical, because they take on many distinct values, are submitted to a Kohonen unsupervised neural network, which measures the boundaries of the values a field takes on and creates an appropriate fuzzy membership function. Subsequently, a principal component analysis pares down the inputs to those that most significantly predict the user-chosen predictive field.
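To make the Kohonen step concrete, here is a simplified Python sketch: a one-dimensional self-organizing map places a few units along a field's range, and triangular membership functions are then centered on those units. The article does not describe RuleMiner's network size, training schedule, or membership shape, so the three units, the linear decay, the triangular shape, and the names train_som_1d and triangular_memberships are all assumptions.

```python
import random
from typing import Callable, List, Sequence


def train_som_1d(values: Sequence[float], n_units: int = 3, epochs: int = 50,
                 rate: float = 0.5, seed: int = 0) -> List[float]:
    """Fit a one-dimensional Kohonen map and return its unit centers, sorted."""
    if n_units < 2:
        raise ValueError("need at least two units")
    rng = random.Random(seed)
    lo, hi = min(values), max(values)
    # Start with units spread evenly across the observed range.
    units = [lo + (hi - lo) * i / (n_units - 1) for i in range(n_units)]
    data = list(values)
    for epoch in range(epochs):
        step = rate * (1.0 - epoch / epochs)  # learning rate decays linearly
        rng.shuffle(data)
        for v in data:
            winner = min(range(n_units), key=lambda i: abs(units[i] - v))
            for i in range(n_units):
                gap = abs(i - winner)
                if gap <= 1:
                    # The winner is pulled fully toward the sample;
                    # its direct neighbors are pulled half as strongly.
                    pull = 1.0 if gap == 0 else 0.5
                    units[i] += step * pull * (v - units[i])
    return sorted(units)


def triangular_memberships(centers: Sequence[float]) -> Callable[[float], List[float]]:
    """Build one triangular fuzzy set per center; the outer sets are open-ended."""
    if any(b <= a for a, b in zip(centers, centers[1:])):
        raise ValueError("centers must be strictly increasing")

    def membership(x: float) -> List[float]:
        degrees = []
        for i, c in enumerate(centers):
            left = centers[i - 1] if i > 0 else None
            right = centers[i + 1] if i < len(centers) - 1 else None
            if x == c:
                d = 1.0
            elif x < c:
                d = 1.0 if left is None else max(0.0, (x - left) / (c - left))
            else:
                d = 1.0 if right is None else max(0.0, (right - x) / (right - c))
            degrees.append(d)
        return degrees

    return membership
```

With centers at 0, 5, and 10, a reading of 2.5 belongs half to the first fuzzy set and half to the second, which is the kind of graded labeling ("somewhat low, somewhat medium") that fuzzy rules rely on.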
A fuzzy tree consisting of all the possible rules for the selected input/output structure is evaluated by RuleMiner with reference to the main database. After the predictive power of each possible fuzzy rule for the main database is measured, the most predictive ones are grouped together and tried against the test set that earlier was set aside. Finally, the selected rules are weighted according to the predictive power of each rule in the set.
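The selection and weighting steps can be illustrated with a short Python sketch. It assumes a particular measure of predictive power (the share of a rule's total firing strength that falls on records where its conclusion is correct) and normalizes the weights to sum to one; the article specifies neither, and the names FuzzyRule, predictive_power, and select_and_weight are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Record = Dict[str, object]


@dataclass
class FuzzyRule:
    name: str
    strength: Callable[[Record], float]  # firing degree in [0, 1] for a record
    conclusion: object                   # value predicted for the target field
    weight: float = 0.0


def predictive_power(rule: FuzzyRule, records: Sequence[Record], target: str) -> float:
    """Share of the rule's firing strength that lands on correct records."""
    fired = sum(rule.strength(r) for r in records)
    if fired == 0:
        return 0.0
    correct = sum(rule.strength(r) for r in records if r[target] == rule.conclusion)
    return correct / fired


def select_and_weight(rules: Sequence[FuzzyRule], records: Sequence[Record],
                      target: str, keep: int = 2) -> List[FuzzyRule]:
    """Keep the most predictive rules and weight each by its predictive power."""
    scored = [(predictive_power(rule, records, target), rule) for rule in rules]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    best = scored[:keep]
    total = sum(power for power, _ in best)
    for power, rule in best:
        rule.weight = power / total if total else 0.0
    return [rule for _, rule in best]
```

In this sketch the rules are scored on the main database only; checking the selected group against the reserved test set would be a separate call using the same predictive_power function on the test records.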
The fuzzy rule set is put to use by a second ActiveX control, Fuzzy Control, which accepts real-time inputs, submits them to the fuzzy rule base, and determines the appropriate predictions implicit within the fuzzy rules.
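As a rough picture of what applying a weighted fuzzy rule base to live inputs involves, the Python sketch below combines rule firing strengths with rule weights. It is not the Fuzzy Control interface, which the article does not document: the tuple layout, the weighted vote for categorical outputs, and the weighted average for numerical outputs are assumptions chosen as common fuzzy-inference conventions.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

Inputs = Dict[str, float]
# Each rule is (firing-strength function, conclusion, weight); layout is hypothetical.
WeightedRule = Tuple[Callable[[Inputs], float], object, float]


def predict_category(rules: Sequence[WeightedRule], inputs: Inputs) -> Optional[object]:
    """Categorical output: each rule votes for its conclusion with weight * strength."""
    votes: Dict[object, float] = {}
    for strength, conclusion, weight in rules:
        votes[conclusion] = votes.get(conclusion, 0.0) + weight * strength(inputs)
    if not votes or max(votes.values()) == 0.0:
        return None  # no rule fired for these inputs
    return max(votes, key=votes.get)


def predict_number(rules: Sequence[WeightedRule], inputs: Inputs) -> Optional[float]:
    """Numerical output: weighted average of the rule conclusions."""
    numerator = 0.0
    denominator = 0.0
    for strength, conclusion, weight in rules:
        activation = weight * strength(inputs)
        numerator += activation * float(conclusion)
        denominator += activation
    return numerator / denominator if denominator else None
```

The two functions reflect the article's distinction between field types: textual or categorical targets get a label back, while real-number targets get a real number.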