Review Comment:
## General
The SIMPLE ontology (http://www.languagelibrary.eu/owl/simple/SimpleOntology) as such is not well documented. I opened it in Protégé and there wasn't a single label or comment in it documenting the concepts. So if you want to look up http://www.languagelibrary.eu/owl/simple/SimpleOntology#hasAgentive, there is no documentation whatsoever on the web. The authors should try to include some information there.
For the sake of self-containedness, I would like to see the quality evaluation of the legacy data mentioned in this article (maybe a short summary of the results). There were two EU projects, so I assume there has been some quality control. Of what kind?
Furthermore, the usefulness is obvious, but not well described.
simple:hasIsa ;
simple:hasIsamemberof ;
a simple:Animal, owl:NamedIndividual ;
rdfs:comment " The lemma of USem873animale is animale" ;
rdfs:label " animale_as_Animal" .
One obvious use would be to build an index from the lemmas. But there is really no property that gives access to the lemma; it only appears inside the rdfs:comment and rdfs:label strings.
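To illustrate why a dedicated lemma property would help, here is a minimal Python sketch of the workaround a data consumer is currently forced into: parsing the lemma out of the free-text rdfs:comment. The pattern is an assumption based only on the single comment string quoted above, and the function names are hypothetical; this is not how the authors' data is meant to be consumed.

```python
import re
from collections import defaultdict

# Assumed comment shape, taken from the one example in this review:
# " The lemma of USem873animale is animale"
LEMMA_COMMENT = re.compile(r"^\s*The lemma of (?P<usem>\S+) is (?P<lemma>.+?)\s*$")


def lemma_from_comment(comment):
    """Return (usem_id, lemma) parsed from an rdfs:comment string, or None."""
    match = LEMMA_COMMENT.match(comment)
    if match is None:
        return None
    return match.group("usem"), match.group("lemma")


def build_lemma_index(comments):
    """Map each lemma to the identifiers of the entries whose comment names it."""
    index = defaultdict(list)
    for comment in comments:
        parsed = lemma_from_comment(comment)
        if parsed is not None:
            usem, lemma = parsed
            index[lemma].append(usem)
    return dict(index)


if __name__ == "__main__":
    print(build_lemma_index([" The lemma of USem873animale is animale"]))
```

Any change to the wording of the comment breaks such an index silently, which is why an explicit property carrying the lemma would be preferable.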
## Layout
The image is hardly readable; the text in it should be the same size as in the article. On my print-out, a lot of space is wasted.
## Technical issues:
There are still quite a few technical issues remaining, which should be resolved:
1. http://www.languagelibrary.eu/owl/simple/inds/5/5c2/USem873animale
contains the definition of an Ontology:
Actually this should be removed as it is expressed by:
2. The Ontology at http://www.languagelibrary.eu/owl/simple/SimpleOntology
is provided in functional syntax, which is quite unusual. In fact, I think only OWL API based tools such as Protégé can open this kind of syntax. Even Apache Jena can neither serialize nor read it. I am unsure whether it is a standardized syntax at all. Normally, Turtle or RDF/XML is used: http://jena.apache.org/documentation/io/#formats
3. It is still weird to duplicate URIs:
http://www.languagelibrary.eu/owl/simple/inds/SimpleEntries#USem59452pub...
will retrieve 15 MB of data. So if you crawl this, you will download 15 MB for each URI. I think this is really tough on your servers: with roughly 50k entries, that comes to about 750 GB of traffic per crawl.
I think it is best to simply replace the '#' in the URIs with '/'. 50k files in the same folder shouldn't be a problem for a normal file system.
Otherwise you can just use http://www.languagelibrary.eu/owl/simple/USem/59452/pubblico
Does the number have any meaning?
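The two URI schemes suggested in point 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hash URI used in the example combines the SimpleEntries document with the identifier USem873animale from point 1 and is hypothetical, the local-name pattern ("USem", digits, lemma) is inferred from the two identifiers quoted in this review, and the function names are my own.

```python
import re
from urllib.parse import urldefrag

BASE = "http://www.languagelibrary.eu/owl/simple/"

# Assumed local-name shape: "USem" + number + lemma, e.g. USem873animale.
# A lemma starting with a digit would be split ambiguously by this pattern.
LOCAL_NAME = re.compile(r"^(USem)(\d+)(.+)$")


def hash_to_slash(uri):
    """Turn document#fragment into document/fragment, so each entry
    resolves to its own small resource instead of the whole file."""
    document, fragment = urldefrag(uri)
    if not fragment:
        return uri
    return document + "/" + fragment


def structured_uri(local_name, base=BASE):
    """Build the alternative form <base>USem/<number>/<lemma>."""
    match = LOCAL_NAME.match(local_name)
    if match is None:
        raise ValueError("unexpected local name: " + local_name)
    prefix, number, lemma = match.groups()
    return f"{base}{prefix}/{number}/{lemma}"


if __name__ == "__main__":
    # Hypothetical hash URI, assembled for illustration only.
    example = BASE + "inds/SimpleEntries#USem873animale"
    print(hash_to_slash(example))
    print(structured_uri("USem873animale"))
```

Either form lets a crawler fetch one entry per request; which one is preferable depends on whether the number carries meaning on its own.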
Minor:
- CLIPS is explained in a footnote; I would rather have it moved into the main text.
- EuroWordNet is mentioned in the context of "Linked Open Data", but I am unsure whether it is "open" in any sense. As far as I know, there is quite a big fee for obtaining it, and it is neither free for research nor open access. Could you please clarify this? I am well aware that the previous reviewer asked to include the reference.