The Rijksmuseum Collection as Linked Data

Tracking #: 1353-2565

Authors: 
Chris Dijkshoorn
Lizzy Jongma
Lora Aroyo
Jacco van Ossenbruggen
Guus Schreiber
Wesley ter Weele
Jan Wielemaker

Responsible editor: 
Harith Alani

Submission type: 
Dataset Description
Abstract: 
Many museums are currently providing online access to their collections. State-of-the-art research over the last decade shows that it is beneficial for institutions to provide their datasets as Linked Data in order to achieve easy cross-referencing, interlinking and integration. In this paper, we present the Rijksmuseum linked dataset (accessible at http://datahub.io/dataset/rijksmuseum), along with collection and vocabulary statistics, as well as lessons learned from the process of converting the collection to Linked Data. This dataset contains over 350,000 objects, including detailed descriptions and high-quality images released under a public domain license.
Tags: 
Reviewed

Decision/Status: 
Minor Revision

Solicited Reviews:
Review #1
By Eetu Mäkelä submitted on 16/Mar/2016
Suggestion:
Minor Revision
Review Comment:

The authors have fulfilled my requests for clearer presentation of the model details.

However, seeing the literal and resource counts and vocabularies has brought to light a couple of new (minor) questions. First, regarding the dc:subject field, it would be interesting to know how many of the distinct literals could be mapped to resources (if indeed this is how the conversion is done; if the resources are instead entered manually and separately, please note that as well). Second, and more importantly, regarding the rma vocabularies, I take it that all literals here correspond to internal authority records. Is this so? If such a relation holds, then I would consider these rma vocabularies conceptually a part of this dataset, and would thus like them to be described in more detail in the article as well.

Final small note: on page 4, a paragraph (beginning with "Table 1 lists") is repeated.

Review #2
By Dana Dannells submitted on 12/Apr/2016
Suggestion:
Minor Revision
Review Comment:

This paper presents the Rijksmuseum collection as linked open data. The authors have improved the paper considerably compared to the previous version of the manuscript. There is a detailed description of the data conversion process, and there are references to relevant vocabularies and statistics on the internal connectivity. I would, however, raise a few issues.

No version date or version number is given for the released data, and the language expressivity is not stated.

The authors specify that concepts were aligned with WordNet. What disambiguation methods have been applied?

Regarding the alignment of concepts and properties with other data models: how do you ensure interoperability, and are you applying any consistency checks?

Review #3
By Mariana Damova submitted on 17/Apr/2016
Suggestion:
Accept
Review Comment:

The “Rijksmuseum Collection as Linked Data” introduces a cultural heritage dataset to the Semantic Web community. This version of the paper complies with the requirements of the SWJ for publishing a dataset. It describes the key characteristics of the produced linked data dataset, presents the nature of the data and the context in which they have been created, discusses the model at length, gives a walk-through example, and explains in detail the usage of external vocabularies that ensures the conversion of the RDF dataset into a linked data dataset. The URI assignment and construction is outlined precisely. Much of the discussion is dedicated to the usage of the dataset, and more precisely to museum data in general and the Rijksmuseum's pioneering practice of publishing the museum collection for open access. The linked data version of the digitized museum collection is shown to extend the possibilities for re-use of the cultural heritage objects of the Rijksmuseum by making them available to potential creative users via a dedicated API (its URL is given in a footnote).

The paper also mentions how the IP rights of the images and of the museum objects and descriptions have been accounted for. It also details the process of quality assurance and of dataset maintenance and extension as part of the day-to-day activities of the museum curators. In addition, it clearly specifies the size of the dataset, its growth rate, and the number and type of the characteristics covered by its linked data version. The adopted ontologies that describe the dataset are well-established standards for representing cultural heritage objects and fit with the European model of Europeana (EDM), as well as with the models of leading world museums, such as the Getty vocabulary.

Overall, the paper clearly presents the Rijksmuseum dataset as a linked open data dataset according to the 5-star rating of datasets of that nature. Moreover, the presented dataset is described as a living object with a concrete real-life purpose, being part of the daily care of Rijksmuseum curators and researchers, and fully embodies the use of linked open data with real-life impact. Thus it leads the way for further adoption of this approach by other cultural institutions. This fact increases the value of the presented content in my view.

The paper can be published as is, except that one paragraph on p. 4 describing Table 1 appears twice. One of the two copies has to be removed.