The Materials Design Ontology

Tracking #: 3268-4482

Authors: 
Patrick Lambrix
Rickard Armiento
Huanyu Li
Olaf Hartig
Mina Abd Nikooie Pour
Ying Li

Responsible editor: 
Guest Editors SW for Industrial Engineering 2022

Submission type: 
Full Paper
Abstract: 
In the materials design domain, much of the data from materials calculations is stored in different heterogeneous databases with different data and access models. Therefore, accessing and integrating data from different sources is challenging. As ontology-based access and integration alleviates these issues, in this paper we address data access and interoperability for computational materials databases by developing the Materials Design Ontology. This ontology is inspired by and guided by the OPTIMADE effort that aims to make materials databases interoperable and includes many of the data providers in computational materials science. In this paper, first, we describe the development and the content of the Materials Design Ontology. Then, we use a topic model-based approach to propose additional candidate concepts for the ontology. Finally, we show the use of the Materials Design Ontology by a proof-of-concept implementation of a data access and integration system for materials databases based on the ontology.
Tags: 
Reviewed

Decision/Status: 
Minor Revision

Solicited Reviews:
Review #1
By Gerhard Goldbeck submitted on 04/Oct/2022
Suggestion:
Accept
Review Comment:

This manuscript was submitted as 'full paper' and should be reviewed along the usual dimensions for research contributions which include (1) originality, (2) significance of the results, and (3) quality of writing. Please also assess the data file provided by the authors under “Long-term stable URL for resources”. In particular, assess (A) whether the data file is well organized and in particular contains a README file which makes it easy for you to assess the data, (B) whether the provided resources appear to be complete for replication of experiments, and if not, why, (C) whether the chosen repository, if it is not GitHub, Figshare or Zenodo, is appropriate for long-term repository discoverability, and (4) whether the provided data artifacts are complete. Please refer to the reviewer instructions and the FAQ for further information.

Review #2
Anonymous submitted on 06/Nov/2022
Suggestion:
Accept
Review Comment:

I would like to thank the Authors for thoroughly addressing all the points I had mentioned in my previous report. I only have one follow-up question that is given below, together with minor points/typos to be double-checked before publication.

As already said in my previous report, given its content, quality and timeliness, I do recommend this manuscript for publication in the Semantic Web journal.

-----------------------------------------------------------------------------------------------
- Clarification needed in Table 3 and related text

Please state the value of max_support_word that was used. I guess it is 8000 (by comparison with Table 4), but then I do not see why the results in columns 2 and 3 of Table 3 differ (e.g., 6901 vs. 6478 for min_support=10). Please clarify.

In fact, from the text I understand that the only difference between ToPMine and ToPMine_max is the use of max_support_word, and that (for the current corpus) this parameter has no effect if set to 8000.
["We also defined a maximum support threshold max_support_word and call the system that uses this additional threshold TopMine_max."
"Note that no word occurs more than 8000 times in our corpus, so setting max_support_word to 8000 allows all words (or, in other words, max_support_word is not used)."]

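To make the question concrete, here is a minimal sketch of the behaviour I would expect from the text. It is not the ToPMine implementation: the function `filter_vocabulary`, the toy corpus, and the assumption that both thresholds act as simple frequency bounds on words are all hypothetical and serve only as illustration.

```python
from collections import Counter


def filter_vocabulary(docs, min_support, max_support_word=None):
    """Return the words whose corpus frequency lies within the given bounds.

    Hypothetical helper: assumes min_support and max_support_word are plain
    lower and upper bounds on the total word count.
    """
    counts = Counter(word for doc in docs for word in doc)
    return {
        word
        for word, count in counts.items()
        if count >= min_support
        and (max_support_word is None or count <= max_support_word)
    }


# Toy corpus: "band" occurs 3 times, "gap" twice, the other words once.
docs = [["band", "gap", "energy"], ["band", "structure"], ["band", "gap"]]

without_max = filter_vocabulary(docs, min_support=2)
with_loose_max = filter_vocabulary(docs, min_support=2, max_support_word=8000)
with_tight_max = filter_vocabulary(docs, min_support=2, max_support_word=2)
```

Under these assumptions, an upper bound that no word reaches (`with_loose_max`) returns exactly the same vocabulary as no upper bound at all (`without_max`), and only a tighter bound (`with_tight_max`) changes the result. This is why I would expect columns 2 and 3 of Table 3 to coincide if max_support_word was indeed 8000.
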
- Other minor points/typos:

-- In Table 6, row 8: "Artificial Intelligence-Methods (NO)". Please check this entry and, if it is intended, clarify what "NO" means here.

-- Figure 16: From the values shown, I think the algorithm "with stemming" is used. If so, please state this in the caption and text.

-- Appendix: Please double-check that the titles and contents of the listings match.

E.g., in Listing 9: the listing seems to duplicate the following one. Some lines (including 11-14) should be removed.
E.g., in Listing 11: please double check the title, namely: "filter condition is complex that needs to be simplified".