Storage, Partitioning, Indexing and Retrieval in Big RDF Frameworks: A Survey

Tracking #: 2230-3443

This paper is currently under review
Tanvi Chawla
Girdhari Singh
Emmanuel S. Pilli
M.C. Govil

Responsible editor: Peter Haase

Submission type: Survey Article
The Resource Description Framework (RDF) is increasingly used to model data on the web. The RDF model was designed to support easy representation and exchange of information on the web. RDF is queried using SPARQL, the standard query language recommended by the W3C. The growing acceptance of the RDF format can be attributed to its flexible and reusable nature. The volume of RDF data is steadily increasing as many government organizations and companies adopt RDF for data representation and exchange. This growth has created a need for distributed RDF frameworks that can efficiently manage RDF data at large scale, i.e., Big RDF data. Scalable distributed RDF data management systems capable of handling Big RDF data can be termed Big RDF frameworks. The proliferation of RDF data has made RDF data management a difficult task. In this survey, we provide an extensive review of the literature on Big RDF frameworks from the perspectives of storage, partitioning, indexing, query optimization, and query processing. We present a taxonomy of the tools and technologies used for storage and retrieval of Big RDF data in these systems. The research challenges identified during the study of these systems are elaborated to suggest promising directions for future research.