RDFRules: Making RDF Rule Mining Easier and Even More Efficient

Tracking #: 2511-3725

This paper is currently under review
Authors: 
Vaclav Zeman
Tomas Kliegr
Vojtěch Svátek

Responsible editor: 
Agnieszka Lawrynowicz

Submission type: 
Full Paper
Abstract: 
AMIE+ is a state-of-the-art algorithm for learning rules from RDF knowledge graphs (KGs). Based on association rule learning, AMIE+ was a breakthrough in speed on large data compared with the previous generation of systems based on inductive logic programming (ILP). In this paper we present several algorithmic extensions to AMIE+ that make it faster, together with support for data pre-processing and model post-processing, which covers the linked data mining process more comprehensively than the original AMIE+ implementation. The main contributions concern performance: a top-k approach, which addresses the combinatorial explosion that often results from a hand-set minimum support threshold; a grammar for defining fine-grained patterns that reduce the size of the search space; and a faster projection binding that reduces the number of repeated calculations. Other enhancements include mining across multiple graphs, discretization of continuous values, and selection of the most representative rules using proven rule pruning and clustering algorithms. Benchmarks show reductions in mining time of up to several orders of magnitude compared with AMIE+. An open-source implementation is available under the name RDFRules at https://github.com/propi/rdfrules.
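The top-k idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the RDFRules implementation: the function name `top_k_by_support`, the (rule, support) input format, and the sample data are assumptions made purely for illustration. The sketch keeps the k best candidates in a min-heap and raises the support threshold to the weakest retained support once the heap is full, so the threshold no longer has to be guessed by hand.

```python
import heapq
from typing import Iterable, List, Tuple


def top_k_by_support(
    candidates: Iterable[Tuple[str, int]],
    k: int,
    min_support: int = 1,
) -> List[Tuple[str, int]]:
    """Return the k candidates with the highest support (illustrative only).

    `candidates` yields hypothetical (rule, support) pairs. The threshold
    starts at `min_support` and rises as better candidates are found.
    """
    heap: List[Tuple[int, str]] = []  # min-heap keyed on support
    threshold = min_support
    for rule, support in candidates:
        if support < threshold:
            continue  # pruned by the current (possibly raised) threshold
        if len(heap) < k:
            heapq.heappush(heap, (support, rule))
        elif support > heap[0][0]:
            # Replace the weakest retained candidate with the better one.
            heapq.heapreplace(heap, (support, rule))
        if len(heap) == k:
            # Once k candidates are held, the weakest one sets the bar.
            threshold = max(threshold, heap[0][0])
    # Highest support first; ties broken by rule name for determinism.
    return sorted(((r, s) for s, r in heap), key=lambda x: (-x[1], x[0]))


if __name__ == "__main__":
    sample = [("r1", 5), ("r2", 12), ("r3", 7), ("r4", 3), ("r5", 9)]
    print(top_k_by_support(sample, k=3))
```

In an actual rule miner the rising threshold would presumably also be used to stop refining candidates whose support already falls below it, since support cannot increase when a rule is extended; the sketch only filters a flat list of candidates.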
Tags: 
Under Review