Review Comment:
This manuscript was submitted as 'full paper' and should be reviewed along the usual dimensions for research contributions which include (1) originality, (2) significance of the results, and (3) quality of writing.
The paper has improved significantly since the initial submission. It has been reorganised and partly rewritten to address most of the reviewer suggestions.
My remaining concerns are:
- The related work section has been enriched. However, it lacks an in-depth comparison with existing entity matching work, apart from the observation that the STEM approach can be used on top of any pairwise numerical threshold-based classifier. Some ensemble learning approaches should be mentioned even if they do not deal with entity matching (https://renespeck.de/data/2014/ISWCpaper.pdf).
- I really liked the problem formulation section, but it is missing a summary paragraph that formulates the problem as an ensemble learning problem over a set of entity matching decisions provided by different threshold-based systems.
- Section 4.2 is clearer now and supports the soundness of the proposed approach. Perhaps the authors should indicate how \lambda in equation (22) is estimated? This is important for the reader to be convinced by equations (26) and (27).
- minor remarks:
- paragraph before definition 3, “… e1 and e2 is carried out on a set OF literal value …”
- definition 5: add a line break.
- section 4.3: “However, as a rule of thumb, ….. that: O(N ∗ g2) < Ttrain(N, g)(N, g) < O(N ∗ g3)” ==> “However, as a rule of thumb, ….. that: O(N ∗ g2) < Ttrain(N, g) < O(N ∗ g3)”
- section 4.3: use the LaTeX symbol ‘\leq’ instead of ‘<=’