How to Add Your Own Learning System
The benchmarking framework assumes that each tool to benchmark resides in a folder named after the tool's identifier, i.e. ideally all lowercase and without whitespace, inside the learningsystems directory.
There one can find the currently available learning systems:
$ ls learningsystems/
aleph  dllearner  funclog  golem  progol  progolem  README.md  toplog
To add a new tool, just create a new directory under learningsystems named after your tool identifier, e.g. mytool. In the mytool directory there should be at least two executable files, named run and validate. Furthermore, a file system.ini should be provided that specifies the language of the knowledge base and input examples.
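A minimal sketch of such a system.ini is shown below. Note that the exact key names recognized by the framework are not specified in this document, so the key shown here is an assumption for illustration only:

```ini
; learningsystems/mytool/system.ini -- sketch; the key name is assumed
; Declares the knowledge base / example language the tool consumes,
; e.g. a Prolog-based ILP tool vs. an OWL-based DL learner.
language = prolog
```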
Run the Learning System
The purpose of the run executable is to run your inductive learning tool, writing the learned hypotheses to a file.
Currently, the expected parameters are:
$ ./run <config_file>
The config file contains information about where the tool should store its output and which example files it should read.
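The framework's actual config file format is not spelled out here, so purely for illustration the following sketch of a run wrapper assumes a simple key = value layout; the key names (output_file, the placeholder hypothesis, and the helper names) are all hypothetical:

```python
def parse_config(path):
    """Parse a simple key = value config file into a dict.

    NOTE: the framework's real config format is not specified in this
    document; this key = value layout is an illustrative assumption.
    """
    settings = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith(("#", ";")):
                continue  # skip blank lines and comments
            key, _, value = line.partition("=")
            settings[key.strip()] = value.strip()
    return settings


def run(config_path):
    """Sketch of the run step: invoke the tool, write the hypotheses."""
    cfg = parse_config(config_path)
    # The actual learning tool would be invoked here (e.g. via
    # subprocess), reading the example files named in the config.
    hypotheses = ["active(A) :- nitro(A,[B,C,D,E])."]  # placeholder result
    with open(cfg["output_file"], "w") as out:
        out.write("\n".join(hypotheses) + "\n")
```

The essential contract is only that run reads its single config-file argument and leaves the learned hypotheses in the configured output file.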
An example of the content of the output file generated by the Golem ILP tool would be:
active(A) :- carbon_5_aromatic_ring(A,[B,C,D,E,F]).
active(A) :- hetero_aromatic_5_ring(A,[B,C,D,E,F]), nitro(A,[F,G,H,I]).
active(A) :- nitro(A,[B,C,D,E]), phenanthrene(A,[[F,G,H,I,J,K],[L,M,N,O,P,Q],[R,S,T,U,V,W]]), bond(A,I,B,7), bond(A,X,I,7).
Currently there is no fixed specification of how to store the learned results. However, the content of this file should be processable in the validation step.
The validation step is performed by the second executable, validate, which should be called as follows:
$ ./validate <config_file>
The config file contains information about where the tool should store its output, and which results file and example files it should read.
This executable reads the results, loads the background knowledge of the considered learning task, and checks how many of the positive and negative examples of the considered learning problem are covered. When learning on OWL knowledge bases, this means utilizing an OWL reasoner to run instance checks on the learned DL concepts. In the case of Prolog-based background knowledge, a Prolog interpreter has to be executed to check how many of the positive and negative examples are covered.
The output generated by the validate executable should be just four lines written to the <validation_output_file>: one line for the number of true positives, one for the number of false positives, one for the number of true negatives, and one line for the number of false negatives. An example for the content of <validation_output_file> would be:
tp: 10
fp: 3
tn: 29
fn: 0
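The coverage check itself depends on the knowledge base language (a Prolog interpreter or an OWL reasoner, as described above), so the following sketch only shows the bookkeeping around it; the covers callback and both function names are hypothetical:

```python
def confusion_counts(pos_examples, neg_examples, covers):
    """Count tp/fp/tn/fn given a coverage predicate.

    `covers(example)` should return True iff the learned hypothesis
    covers the example. In practice this callback would call out to a
    Prolog interpreter or an OWL reasoner, which is tool-specific.
    """
    tp = sum(1 for e in pos_examples if covers(e))  # covered positives
    fn = len(pos_examples) - tp                     # missed positives
    fp = sum(1 for e in neg_examples if covers(e))  # covered negatives
    tn = len(neg_examples) - fp                     # rejected negatives
    return tp, fp, tn, fn


def write_validation_output(path, tp, fp, tn, fn):
    """Write the four result lines expected in <validation_output_file>."""
    with open(path, "w") as f:
        f.write("tp: %d\nfp: %d\ntn: %d\nfn: %d\n" % (tp, fp, tn, fn))
```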
Tool Configuration Files
Tool-specific configuration settings are defined per learning problem and should be held in a file named like the tool identifier with the file suffix .conf.
Such a configuration file should be placed inside the considered learning problem's directory.
For example, the tool-specific configuration files of the Prolog-based tools for learning problem 42 of the Mutagenesis learning task can be found here:
$ ls -1 learningtasks/mutagenesis/prolog/lp/42/*.conf
learningtasks/mutagenesis/prolog/lp/42/aleph.conf
learningtasks/mutagenesis/prolog/lp/42/funclog.conf
learningtasks/mutagenesis/prolog/lp/42/golem.conf
learningtasks/mutagenesis/prolog/lp/42/progol.conf
learningtasks/mutagenesis/prolog/lp/42/progolem.conf
learningtasks/mutagenesis/prolog/lp/42/toplog.conf
The framework will combine this file with the config file passed to the tool.
The actual processing of settings made inside such a configuration file should be done by the tool's run and validate executables.
Tool-Specific Data
If a learning task requires tool-specific data, e.g. specific mode declarations, these can be put into a directory named like the tool identifier, residing inside the data directory of the corresponding learning task. An example of Aleph-specific data for the Mutagenesis task can be found here:
$ ls learningtasks/mutagenesis/prolog/data/aleph/
mode.pl
In the case of Prolog-based learning tools, such data files must have the file suffix .pl.
For OWL-based learning tools, this should be one of the standard file suffixes for the common serialization formats (e.g. .owl, .rdf, .ttl).