Review Comment:
The paper describes an ontology prototype for representing and managing SEO-related information. Such an ontology can be very useful, both for managing information at a project- or client-level, and for the SEO experts' community as a whole.
Unfortunately, both the ontology and the paper suffer from substantial limitations that make the submission unsuitable for publication in an academic journal. A more elaborate ontology could be very useful; the current approach, however, is just an ad-hoc database schema expressed in OWL, and its deployment is negligent.
**Note:** There is some similarity with the article already published online at .
## Presentation and Paper
The paper would benefit from a more objective writing style.
## Usage and Description of Design Principles and Methodologies
The paper cites some established methods from the field, namely BFO and Competency Questions. Unfortunately, these are not really used in the design of the ontology. The functional and non-functional requirements seem quite arbitrary and remain generic; it is unclear how the ontology could be evaluated against them, and they are not used for a serious evaluation of the results.
The Competency Questions (p. 6) are not as I would expect them: Historically, CQs were used to define the scope of an ontology by defining (or giving examples of) queries that should be answered with the help of the conceptual elements of the final ontology. The authors would benefit from reading the original Uschold/Gruninger paper from 1996: Uschold, M., Grüninger, M., 1996. Ontologies: Principles, Methods, and Applications. Knowledge Engineering Review 11, 93–155, in particular section 6.4.
## Comparison with other ontologies on the same topic
The ontology itself is, to my knowledge, novel; there is no similar ontology that would need to be used for comparison. However, the alignment with schema.org and other relevant ontologies leaves room for improvement (see below for more details). There is relevant work on fundamental ontologies covering the core concepts of the WWW architecture, e.g. Halpin/Presutti: The identity of resources on the Web: An ontology for Web architecture, Applied Ontology, Volume 6, Issue 3, pp. 263 - 293, 2011.
## References to applications or use-case experiments
Section 5 describes application scenarios, but remains rather superficial.
## Quality and relevance of the described ontology
The scope of the ontology looks promising; unfortunately, the current design and deployment are so limited that the ontology is insufficient for broad usage.
### Conceptual Modeling Perspective
While the information to be captured by the ontology is useful and valuable in real-world SEO tasks and projects, the ontology elements and their relationships look rather ad-hoc and not very carefully designed. A few examples:
1. The core model is not well-aligned with the architecture of the WWW and the respective terminology, namely resource, identifier, and representation. This will lead to avoidable inconsistencies (e.g. when dealing with canonical URIs or multiple syntactical forms of the same Web content). The authors should try to re-use the standard building blocks; doing so would make the ontology more lasting and useful.
For instance, the ontology defines (or locally redefines) these three classes:
-
-
-
Now, one could argue that in an SEO context, URI/URLs are indeed something different from the ideal of the Web architecture, as e.g. a popular, high-ranking target URI might be considered an asset in its own right. But if you want to go that route, it does not make sense to model it as an owl:FunctionalProperty and owl:InverseFunctionalProperty like so:
```
### https://w3id.org/seovoc/hasURL
:hasURL rdf:type owl:ObjectProperty ,
                 owl:FunctionalProperty ,
                 owl:InverseFunctionalProperty ;
        rdfs:domain :WebPage ;
        rdfs:range :URL ;
        rdfs:comment "The hasURL property establishes a unique and reciprocal relationship between a WebPage and its corresponding URL. It asserts that each WebPage is identified by exactly one URL, and conversely, each URL uniquely identifies one WebPage. As both a functional and inverse functional property, hasURL ensures that this link is both unique and bidirectional, which is critical for accurately representing the identity and accessibility of web content" ;
        rdfs:isDefinedBy .
```
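To illustrate why this combination is risky in an SEO setting, the following Python sketch mimics, in a single pass and without transitive closure, what an OWL reasoner derives from functional and inverse functional properties. Since OWL makes no unique name assumption, a second URL for the same page is not flagged as an error; instead, the two URLs are inferred to be identical. The page identifiers, the example.org URLs, and the function name `inferred_same_as` are illustrative assumptions and not part of the submitted ontology.

```python
from itertools import combinations


def inferred_same_as(triples, functional=(), inverse_functional=()):
    """Return pairs of terms that a reasoner would infer to be owl:sameAs.

    Simplified: one pass over the asserted triples, no transitive closure.
    """
    objects_of = {}   # (subject, property) -> objects, for functional properties
    subjects_of = {}  # (object, property) -> subjects, for inverse functional ones
    for s, p, o in triples:
        if p in functional:
            objects_of.setdefault((s, p), set()).add(o)
        if p in inverse_functional:
            subjects_of.setdefault((o, p), set()).add(s)
    pairs = set()
    for group in list(objects_of.values()) + list(subjects_of.values()):
        # Every two distinct members of a group are forced to be identical.
        pairs.update(combinations(sorted(group), 2))
    return pairs


# Hypothetical data: one page reachable under two URLs (tracking parameter),
# and two crawl snapshots that were recorded under the same URL.
TRIPLES = [
    ("ex:page1", "hasURL", "https://example.org/shoes"),
    ("ex:page1", "hasURL", "https://example.org/shoes?utm_source=mail"),
    ("ex:crawl-2024", "hasURL", "https://example.org/hats"),
    ("ex:crawl-2025", "hasURL", "https://example.org/hats"),
]

SAME = inferred_same_as(
    TRIPLES, functional={"hasURL"}, inverse_functional={"hasURL"}
)
for a, b in sorted(SAME):
    print(a, "owl:sameAs", b)
```

Under these assumptions, the two URL variants collapse into one individual and so do the two snapshots, which is rarely what an SEO analyst wants when URL variants and their history are the very subject of the analysis.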
2. The ontology defines multiple properties for variants of the same characteristic that differ just by the unit of measurement or just by the reference to another value. That approach makes the ontology unnecessarily inflexible, yet increases the number of properties.
Examples:
-
-
The ontology could be improved by getting inspiration from, or reusing, the model for values, units of measurement, and value references in schema.org, namely , , and .
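As a rough illustration of that idea, the Python sketch below builds a JSON-LD-style node in which a single property carries a structured value with an explicit unit, instead of one property per unit. It assumes schema.org's `QuantitativeValue` with its `value`, `unitText`, and `valueReference` properties as one possible pattern; the property names `seovoc:loadTime` and `seovoc:clickChange`, the page URL, and all numbers are hypothetical and do not come from the submitted ontology.

```python
import json


def quantitative_value(value, unit_text, reference=None):
    """Build a schema.org QuantitativeValue node with an explicit unit."""
    node = {"@type": "QuantitativeValue", "value": value, "unitText": unit_text}
    if reference is not None:
        # Secondary value that qualifies the first one, e.g. a reference period.
        node["valueReference"] = reference
    return node


# Hypothetical page description: one property per characteristic,
# with the unit and the reference value stored in the value node.
page = {
    "@context": {
        "@vocab": "https://schema.org/",
        "seovoc": "https://w3id.org/seovoc/",
    },
    "@id": "https://example.org/shoes",
    "seovoc:loadTime": quantitative_value(1.8, "s"),
    "seovoc:clickChange": quantitative_value(
        12, "%", reference=quantitative_value(28, "d")
    ),
}

print(json.dumps(page, indent=2))
```

With such a design, adding a new unit or a new reference period requires new data, not new properties.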
3. In general, the ontology does not support n-ary relationship types in a very convincing way and uses a very flat model. If I understand it correctly, most data is attached to a Web page entity. Think of better ways of representing e.g.. See e.g. and .
4. and are practically useful, but it might be better to also support URIs as the range for both embeddings and embedding models and not just strings. Also, embeddings are typically understood as *vectors* of **numerical data.** There is work on sharing embeddings in RDF (e.g. ), but I think it will be better to distinguish an embedding from its representation and use standard Web architecture elements (resource, representation, mime types, ... ) for modeling them.
For instance, if an embedding is obtained from a truly RESTful API, then it may directly be a Web resource, and its representation could e.g. be in JSON or JSON-LD, like this example (taken from ):
```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [
        -0.006929283495992422,
        -0.005336422007530928,
        -4.547132266452536e-05,
        -0.024047505110502243
      ]
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 5,
    "total_tokens": 5
  }
}
```
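The distinction between an embedding and its representation can be sketched in a few lines of Python: the JSON document is one representation, and the embedding itself is the numerical vector recovered from it. The sketch assumes a response shaped like the example in this review; the helper names `embedding_vectors` and `l2_norm` are illustrative.

```python
import json
import math

# One possible representation (JSON) of the embedding resource.
RESPONSE = """
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [
        -0.006929283495992422,
        -0.005336422007530928,
        -4.547132266452536e-05,
        -0.024047505110502243
      ]
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {"prompt_tokens": 5, "total_tokens": 5}
}
"""


def embedding_vectors(payload):
    """Return the embeddings as lists of floats, ordered by their index."""
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [[float(x) for x in item["embedding"]] for item in items]


def l2_norm(vector):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in vector))


payload = json.loads(RESPONSE)
vectors = embedding_vectors(payload)
print(payload["model"], len(vectors[0]), l2_norm(vectors[0]))
```

Storing such a vector as an opaque string in the ontology would lose exactly the structure that this parsing step recovers.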
### Implementation and Deployment
The base URI for all elements is `https://w3id.org/seovoc/` using identifiers like (slash URIs), but they are not de-referenceable.
The whole deployment is also not well-suited for an ontology with slash-based URIs: even if it were set up properly, the entire ontology file would be returned as the representation of each individual element.
```bash
curl -I https://w3id.org/seovoc/hasQuery
HTTP/1.1 307 Temporary Redirect
...
Location: https://raw.githubusercontent.com/wordlift/wl-ontology/main/SEOntology.owl
Content-Type: text/html; charset=iso-8859-1
curl -I https://raw.githubusercontent.com/wordlift/wl-ontology/main/SEOntology.owl
HTTP/2 404
content-type: text/plain; charset=utf-8
```
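For comparison, a deployment suited to slash URIs would answer each term URI with a redirect to a document about that term, chosen by content negotiation, in the spirit of the W3C note "Best Practice Recipes for Publishing RDF Vocabularies". The Python sketch below shows only the decision logic under simplifying assumptions; the target base `https://example.org/seovoc/terms/`, the file naming scheme, and the function names are hypothetical, and a real setup at w3id.org would express this as server rewrite rules.

```python
NAMESPACE = "https://w3id.org/seovoc/"
DOC_BASE = "https://example.org/seovoc/terms/"  # hypothetical hosting location

# Media types the sketch can serve, mapped to per-term file extensions.
EXTENSIONS = {
    "text/turtle": ".ttl",
    "application/rdf+xml": ".rdf",
    "text/html": ".html",
    "*/*": ".html",
}


def parse_accept(accept):
    """Parse an Accept header into (media_type, q) pairs, best first."""
    prefs = []
    for part in accept.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for field in fields[1:]:
            if field.lower().startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # Stable sort: equal q-values keep their order from the header.
    return sorted(prefs, key=lambda pref: -pref[1])


def redirect_target(term_uri, accept):
    """Return (status, location) for a request to a term URI."""
    if not term_uri.startswith(NAMESPACE):
        raise ValueError("not a term of this ontology: " + term_uri)
    local_name = term_uri[len(NAMESPACE):] or "index"
    for media_type, q in parse_accept(accept):
        if q > 0 and media_type in EXTENSIONS:
            return 303, DOC_BASE + local_name + EXTENSIONS[media_type]
    return 406, None


print(redirect_target(NAMESPACE + "hasQuery", "text/turtle"))
```

A browser sending `text/html,application/xhtml+xml;q=0.9,*/*;q=0.8` would be redirected to the hypothetical `hasQuery.html`, while an RDF client asking for `text/turtle` would receive only the description of that one term.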
All in all, the ontology is not properly deployed. An [issue on GitHub](https://github.com/seontology/seontology/issues/1), open since Feb 21, 2024, highlights several key problems with the deployment when measured against the state of the art, e.g.:
- serves the HTML representation of a directory listing (see screenshot below).
- The anchor text **readme.md** points to , which returns a 307 redirect to .
```bash
curl -I https://w3id.org/seovoc/readme.md
HTTP/1.1 307 Temporary Redirect
Date: Mon, 13 Jan 2025 19:40:44 GMT
Server: Apache/2.4.29 (Ubuntu)
Access-Control-Allow-Origin: *
Location: https://raw.githubusercontent.com/wordlift/wl-ontology/main/SEOntology.owl
Content-Type: text/html; charset=iso-8859-1
```
- That URI returns a 404 status:
```bash
curl -I https://raw.githubusercontent.com/wordlift/wl-ontology/main/SEOntology.owl
HTTP/2 404
...
```
Image: HTML representation served at https://w3id.org/seovoc/
**Also, there is NO HTML representation being served for the ontology.**
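The manual `curl` checks in this section could be automated along the following lines. This is a minimal sketch using only the Python standard library; the function names, the list of accepted media types, and the classification labels are my own assumptions, and `check` performs a real network request, so it is defined here but not called.

```python
import urllib.error
import urllib.request

RDF_ACCEPT = "text/turtle, application/rdf+xml;q=0.9"
RDF_TYPES = {"text/turtle", "application/rdf+xml", "application/ld+json"}
HTML_TYPES = {"text/html", "application/xhtml+xml"}


def rdf_request(uri, accept=RDF_ACCEPT):
    """Build a GET request that asks for an RDF serialization."""
    return urllib.request.Request(uri, headers={"Accept": accept})


def classify(status, content_type):
    """Classify the final response of a dereferencing attempt."""
    if status != 200:
        return "broken"
    media_type = content_type.split(";")[0].strip().lower()
    if media_type in RDF_TYPES:
        return "rdf"
    if media_type in HTML_TYPES:
        return "html"
    return "other"


def check(uri, accept=RDF_ACCEPT):
    """Dereference `uri`, following redirects, and classify the result."""
    try:
        with urllib.request.urlopen(rdf_request(uri, accept), timeout=10) as resp:
            return classify(resp.status, resp.headers.get("Content-Type", ""))
    except urllib.error.HTTPError as err:
        # Raised for 4xx/5xx responses, including those reached via redirects.
        return classify(err.code, err.headers.get("Content-Type", ""))


# Example call (requires network access):
# print(check("https://w3id.org/seovoc/hasQuery"))
print(classify(404, "text/plain; charset=utf-8"))
```

Running `check` once with `RDF_ACCEPT` and once with `text/html` for every term of the ontology would give a simple regression test for the deployment.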