Review Comment:
This manuscript was submitted as 'full paper' and should be reviewed along the usual dimensions for research contributions which include (1) originality, (2) significance of the results, and (3) quality of writing.
This paper presents original work on the semantic digital library of the Henri Poincaré correspondence, implemented in the Omeka S content management system and extended with an RDFS infrastructure to enable SPARQL querying.
The presented work is significant for the Semantic Web community, especially for those working in the digital humanities.
A prototype supporting different types of queries is available online, giving a unique opportunity to search such a valuable collection. Research on advanced query approximation and elastic queries will probably shed new light on this subject.
The available user interfaces are: a classical interface (based on Solr full-text search), search by SPARQL querying, a form-based interface (more classical), and an interface using a graphical view.
The paper is well written, but a few suggestions related to terminology are given below.
Minor changes and explanations are needed.
Sphinx is not referenced; the authors should add a link and state that it is an open-source full-text search server.
It is clear that mathematical search on formulae (like in or similar) is not implemented, but are there any ideas for supporting a math ontology and some specific querying of mathematical content?
In the right column of p. 3, rows 39-43, the variables in the formulae are not explicitly defined: it should be stated that p, q, r are properties (predicates), x and y are nodes, C and D are...
In formula r3, is it "subc" or "subp"? This is probably a typo.
"SPARQL querying" seems to be a more frequent term than "SPARQL interrogation". The authors should reconsider the term used.
Please clarify: "assumed to be modulo RDFS entailment,"
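For instance, one possible reading of this phrase is that queries are answered against all triples entailed under RDFS, not only the asserted ones. A minimal Python sketch of that reading follows, assuming the standard RDFS subclass and subproperty rules; the identifiers (ex:Letter, ex:sentBy, etc.) are hypothetical and this is not the authors' implementation.

```python
SUBC = "rdfs:subClassOf"
SUBP = "rdfs:subPropertyOf"
TYPE = "rdf:type"

def rdfs_closure(triples):
    """Naive fixpoint over four standard RDFS rules (illustrative only)."""
    closure = set(triples)
    while True:
        new = set()
        for (s, p, o) in closure:
            for (s2, p2, o2) in closure:
                # C subc D, D subc E  =>  C subc E
                if p == SUBC and p2 == SUBC and o == s2:
                    new.add((s, SUBC, o2))
                # p subp q, q subp r  =>  p subp r
                if p == SUBP and p2 == SUBP and o == s2:
                    new.add((s, SUBP, o2))
                # x type C, C subc D  =>  x type D
                if p == TYPE and p2 == SUBC and o == s2:
                    new.add((s, TYPE, o2))
                # x p y, p subp q  =>  x q y
                if p2 == SUBP and p == s2:
                    new.add((s, o2, o))
        if new <= closure:
            return closure
        closure |= new

# Hypothetical toy graph, invented for illustration.
GRAPH = {
    ("ex:Letter", SUBC, "ex:Document"),
    ("ex:letter1", TYPE, "ex:Letter"),
    ("ex:sentBy", SUBP, "ex:relatedTo"),
    ("ex:letter1", "ex:sentBy", "ex:person1"),
}

ENTAILED = rdfs_closure(GRAPH)
```

Under this reading, a query for instances of ex:Document would return ex:letter1 even though that triple is never asserted. The authors should state whether this is what is meant.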
p.5 "A specific RDFS base had to be installed." It is clear that Turtle syntax is used for RDF, but what type of database or endpoint solution is used as the RDF store? More technical details on the implementation of the SPARQL endpoint and the RDFS base are needed.
More detailed statistics on triples by class, by property, etc. would give insight into the digital collection.
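As an indication of the kind of statistics meant here, the following minimal Python sketch counts triples per property and typed instances per class over a toy list of triples. The identifiers are invented for illustration and are not taken from the reviewed collection; on the real data the same counts would presumably be obtained from the RDF store itself.

```python
from collections import Counter

RDF_TYPE = "rdf:type"

# Hypothetical toy triples (subject, predicate, object).
TRIPLES = [
    ("ex:letter1", "rdf:type", "ex:Letter"),
    ("ex:letter2", "rdf:type", "ex:Letter"),
    ("ex:person1", "rdf:type", "ex:Person"),
    ("ex:letter1", "ex:sentBy", "ex:person1"),
    ("ex:letter2", "ex:sentBy", "ex:person1"),
    ("ex:letter1", "ex:hasTopic", "ex:topic1"),
]

def triples_by_property(triples):
    """Number of triples using each predicate."""
    return Counter(p for _, p, _ in triples)

def instances_by_class(triples):
    """Number of rdf:type assertions pointing at each class."""
    return Counter(o for _, p, o in triples if p == RDF_TYPE)

if __name__ == "__main__":
    for prop, n in triples_by_property(TRIPLES).most_common():
        print(prop, n)
    for cls, n in instances_by_class(TRIPLES).most_common():
        print(cls, n)
```

A short table of such counts in the paper would be sufficient.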
Has any type of normalisation been applied to the Solr index, e.g. are tokens lemmatized, stemmed, ...?
One more terminological issue: corpus vs. digital library. The result returned by a "corpus system" (in NLP) is generally a concordance from the corpus text, i.e. chunks retrieved from several documents, while for a "digital library" a query result is a list of documents. So the unit of response is a part of a text vs. a whole document. Also, a corpus is usually annotated with grammatical information, which is not included in this system. In my opinion, the term "semantic digital library" is more suitable than "corpus" for this collection and system.