Solving Guesstimation Problems Using the Semantic Web: Four Lessons from an Application

Tracking #: 427-1558

Alan Bundy
Gintautas Sasnauskas
Michael Chan

Responsible editor: 
Krzysztof Janowicz

Submission type: 
Abstract:
We draw on our experience of implementing a semi-automated guesstimation application of the Semantic Web to draw four lessons, which we claim are of general applicability. These are: 1. Inference can unleash the Semantic Web; 2. The Semantic Web does not constrain the inference mechanisms; 3. Curation must be dynamic; and 4. Own up to uncertainty.

Minor Revision

Solicited Reviews:
Review #1
By Simon Scheider submitted on 15/Mar/2013
Minor Revision
Review Comment:

This is a very nice description of a tool/system (GORT) which can be used to "guesstimate" answers to quantitative questions based on retrieving facts from the Semantic Web and applying a range of formal algebraic inference rules which reflect human guesstimation strategies. The tool is compared to existing ones such as BotE, QUARK, and IBM Watson. It is special in its inference calculus (which mimics human guesstimation strategies) and in that it propagates uncertainty through inference and curates data based on a special kind of normalisation.

The paper is well written, fun to read, and it suggests a number of interesting lessons learned, which I can only affirm. For an application paper, however, it lacks a quantitative comparison of performance with existing tools; this seems to be the paper's only drawback. I therefore suggest that the authors include some kind of quantitative comparison/evaluation of the tool.

Some minor issues:
2.1.1: the definition of approximate predicates uses nf~ applied to a boolean value; however, nf~ takes a rational as input. To me, this seems to be an error.
2.1.2: Please explain the rewrite rule (LHS and RHS) in more detail. Also, the source of the primary methods should be referenced (have the authors invented them, or do they reuse other work?).
The distance method approximates travel distance by spherical distance. This can be grossly wrong depending on the traffic infrastructure. There are navigation services for improving the accuracy of distance measures. Why not use them?
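For context, the spherical distance the review refers to is typically computed with the haversine (great-circle) formula. The sketch below is illustrative only; the function name and the London–Edinburgh coordinates are my own choices, not part of GORT:

```python
import math

def spherical_distance_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

# London to Edinburgh: roughly 530-540 km as the crow flies, but the road
# distance is considerably longer -- illustrating the reviewer's point that
# spherical distance can badly underestimate actual travel distance.
print(round(spherical_distance_km(51.5074, -0.1278, 55.9533, -3.1883)))
```

A routing/navigation service would instead return network distance over the actual road graph, which is what the reviewer suggests using.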
3.3: It seems to me that many of the described curation problems are due to unclear semantics (e.g., unclear reference systems for numbers). There are ontologies for measurement units (e.g., in this journal: "Ontology of Units of Measure and Related Concepts"). Why not use them? In general, it seems that ontologies exist precisely to enable dynamic curation methods. Could the authors comment on this?

Review #2
Anonymous submitted on 24/Mar/2013
Minor Revision
Review Comment:

The article reports on GORT (Guesstimation with Ontologies and Reasoning Techniques), a system for formalizing the guesstimation process. Guesstimates are approximate answers to questions like, “How many solar panels would it take to power all the houses in the UK?” At the heart of GORT is the SINGSIGDIG Calculus (the Single Significant Digit Calculus). GORT retrieves numbers from the Semantic Web with the SINDICE search engine. The paper also discusses some dynamic curation methods employed in GORT, such as how to handle dropped units and inconsistent numerical values.
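For intuition, the single-significant-digit idea behind the calculus can be sketched as follows. This is a minimal illustration inferred from the calculus's name, not GORT's actual implementation, and the household/panel figures are invented for the example:

```python
import math

def sigdig(x):
    """Round a number to one significant digit, e.g. 27,000,000 -> 3 * 10^7.
    (Edge cases such as 9.5 rounding up to two digits are ignored here.)"""
    if x == 0:
        return 0
    exp = math.floor(math.log10(abs(x)))
    mantissa = round(x / 10 ** exp)
    return mantissa * 10 ** exp

# A guesstimate in this style: households in the UK (~3 * 10^7, assumed)
# times panels per household (~1 * 10^1, assumed) -> ~3 * 10^8 panels.
print(sigdig(sigdig(27_000_000) * sigdig(10)))
```

Every quantity is collapsed to a single significant digit before and after each arithmetic step, which keeps the computation tractable while staying within an order of magnitude of a careful answer.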

() It would have been easier for me to understand the definitions and methods underlying GORT if the authors had provided one or two simple numerical examples carried throughout the paper. Essentially, I would like to see the solar panel example used in the Count Method on p. 3 expanded throughout the article. These examples could be embedded in the text, or, perhaps better, the different definitions and methods could be illustrated with a figure along the lines of Figure 2.

() The description of the evaluation on p. 5 is sparse. As far as I can tell, the evaluation only tells us how GORT performs with human-curated data vs. data retrieved from SINDICE. However, this does not let us evaluate the SINGSIGDIG Calculus or GORT’s dynamic curation methods. The SINGSIGDIG Calculus could be evaluated by having GORT produce guesstimates for questions that could be verified independently, such as:

“How many teachers work for the Chicago public school system?”

“How many solar panels are sold in the UK?”

The dynamic curation methods could be evaluated by turning them on and off and re-running the 12 questions currently used to evaluate the system.

() What are readers expected to know? Do the authors expect their readers to know what Hilbert’s operator is? Or perhaps this is a question more for the editors. OWL and RDF triples are terms that simply will not be familiar to the vast majority of cognitive scientists. As many of these terms will be needed to understand more than one of the articles, it would make sense to include a Semantic Web primer, or to recommend one that covers some of the background terms and concepts needed to understand the articles.

() The acronym GORT is used on page 2, and there is a mysterious reference to GORT 4.0 on page 4, even though GORT is not defined until page 5.

() Table 4 (“Questions used in evaluation of GORT 4.0”) on page 8 should come before Table 2 (“Evaluation results for GORT 4.0”), which appears on page 6.