Forest Logging: A Trace-Based Analysis of Large Rule-Based Computations

Tracking #: 671-1881

Authors: 
Terrance Swift

Responsible editor: 
Guest Editors RR2013 Special Issue

Submission type: 
Full Paper
Abstract: 
Knowledge representation systems based on the well-founded semantics can offer the degree of scalability required for semantic web applications and make use of expressive semantic features such as Hilog, frame-based reasoning, and defeasibility theories. Such features can be compiled into Prolog tabling engines that have good support for indexing and memory management. However, due both to the power of the semantic features and to the declarative style typical of knowledge representation rules, the resources needed for query evaluation can be unpredictable. In such a situation, users need to understand the overall structure of a computation and examine problematic portions of it. This problem, of {\em profiling} a computation, differs from debugging and justification which address why a given answer was or wasn't derived, and so profiling requires different techniques. In this paper we present a trace-based analysis technique called {\em forest logging} which has been used to profile large, heavily tabled computations. In forest logging, critical aspects of a tabled computation are logged; afterwards the log is loaded and analyzed. As implemented in XSB, forest logging slows down execution of practical programs by a constant factor that is often small; and logs containing tens or hundreds of millions of facts can be loaded and analyzed in minutes.
Full PDF Version: 
Tags: 
Reviewed

Decision/Status: 
Accept

Solicited Reviews:
Click to Expand/Collapse
Review #1
By Erwan Jahier submitted on 30/May/2014
Suggestion:
Accept
Review Comment:

It seems that my remarks have been taken into account, hence I think this article can be accepted.

Review #2
Anonymous submitted on 20/Jun/2014
Suggestion:
Accept
Review Comment:

The article has further improved although the revision is less convincing than expected given the response letter.

"[...] the proof of Theorem 3.2 was significantly rewritten"? At least hardly in terms of formal detail. For me it remains a (convincing) sketch (i.e., conveying the idea of a potential proof but nothing I can really track down step by step without filling missing detail). I agree, however, that a convincing sketch has its merits in particular regarding readability.

"Virtually all minor comments were adopted, [...]."? I do not know how to interpret "virtually" (and do not expect any suggestion to be followed) but please check again since even a reported typo remained.

In summary, the article has further improved and in my opinion, for the reasons mentioned in my previous review, should be accepted.

Review #3
By Maurice Bruynooghe submitted on 04/Jul/2014
Suggestion:
Accept
Review Comment:

Review of Forest Logging: A Trace-Based Analysis of
Large Rule-Based Computations (paper by T Swift)

The revision has addressed all my concerns, hence I recommend
acceptance. Still, the paper was not carefully enough proofreaded and
minor corrections (see below) are needed.

DETAILS

p1-
Much of the literature on knowledge representation
and reasoning (KRR) has concerned
--> better?
Much of the literature on knowledge representation
and reasoning (KRR) is concerned with

For description
logics an example
->
For description
logics, an example

p2

overview: what about section 5?

As SLG and its exten-
sions have been presented in the literature our review
is largely an informal overview;
-->
As SLG and its exten-
sions have been presented in the literature, our review
is largely an informal overview;

p4 def 2.1
and V
is the underlying set of E.
--> better?
and V
is the underlying set of nodes of E.

p5
Fig 2: last literla of last clause
not p(Y) -->
not p(Z)

The
final DELAYING operation is applied to the literal not
p(a) in node 5 is delayed, so that the new selected lit-
eral for its child, node 13, is not p(b).
-->
The
final DELAYING operation is applied to the literal not
p(a) in node 5; in its child, node 15, not p(b) is the new selected
literal.

p6
the conditional answer of node 12,
-->
the conditional answer of node 14,

p9
def 3.1: S and S1 should be identical,
i.e., "(cmp(S , Sscc , c'' )"

def 3.2
(c) If n is an answer or failure node whose near-
est ancestor

the "nearest ancestor" is nothing else than the parent?

d) Otherwise, H(n) is closest ancestor
-->
d) Otherwise, H(n) is the closest ancestor

2. If there is an edge between nodes n1 and n2 in T
there is an edge between H(n1 ) and H(n2 ).
?You mean?
2. If there is an edge between nodes n1 and n2 in T
there is either an edge between H(n1 ) and H(n2 ) or H(n1 )=H(n2 )

p10
by condition 1(c) f A is in H(T ),
-->
by condition 1(c) if A is in H(T ),

an SLG tree for the can be constructed
-->
an SLG tree for the goal can be constructed

char-
acterize the informational maintained in a forest log,
H(T ) there are practical motivations for constructing
H(T ).
-->
char-
acterize the information maintained in a forest log,
H(T ) there are as well practical motivations for constructing
H(T ).

p12
footnote 13:
This statement is true in local evaluation but another common
scheduling strategy called batched evaluation.
-->?
This statement is not only true in local evaluation but also for another common
scheduling strategy called batched evaluation.

The overview also provides the distributions of
tabled subgoals across SCCs of the formed by the
SDGs of the various forests in the evaluation.
-->
The overview also provides the distributions of
tabled subgoals across SCCs formed by the
SDGs of the various forests in the evaluation.

p13
two-valued a ell-founded
-->
two-valued well-founded

p14

which omits facts
produced by the ANSWER RESOLUTION operation the
new answer event.
--> ???
which omits facts
produced by the ANSWER RESOLUTION operation immediately after a
new answer event.

p15
check_variant/2 begins
--> better avoid a lower case to start a sentence
The check_variant/2 predicate begins

p16
The maximal cost of traversing each logged
fact this can be treated as a constant function
-->
The maximal cost of traversing each logged
fact can be treated as a constant function

can reduce the
the logging overhead,
-->
can reduce the
logging overhead,

p17
and under certain conditions has the
information available
-->
and, under certain conditions, has the
information available

ref 7
clp(fd)
-->?
CLP(FD)

p20
a descendent of N such and any intermediate
-->
a descendent of N and any intermediate

p21
as Node’s children..
^^ 2 dots at end of sentence