Review Comment:
This revision has done a very good job of addressing the original review comments, and this is now a very strong paper. Perhaps its main contribution, apart from the very practical one of moving towards a more linguistically and culturally balanced Wikipedia, is the very extensive and well-explained set of evaluation stages. For the actual range of material generated automatically, this degree of evaluation might seem a little like overkill, but its strength is that the procedure can naturally extend as the generated material itself becomes more substantial, and indeed to related generation tasks in this context. The paper is well written, and it is now very clear what was happening at each stage and why each design decision was taken. I have just a few extremely minor corrections/typos and suggestions for final improvements, which I list below.
p. 1, l. 21: something odd here, "a 'introductory" --> "an introductory"
p. 2 left:
p. 2, l. 23: "produce one summary sentence" --> "produce a single summary sentence"
p. 2, l. 29: "as it is the case" --> "as is the case"
p. 2, l. 40: "to generate a short Wikipedia-style summary" --> "to generate a Wikipedia-style summary sentence"
p. 2 right:
p. 2, l. 26: "the text is fluent" --> "the sentence is fluent"
p. 2, l. 36: "the summaries generated" --> "the summary sentences generated"
p. 4:
The point here of using the generated versions as a way of keeping a more 'local'
expression that is not just a translation of the English is important and interesting. This
then adds a further potential consideration for the review of previous relevant text generation
work, particularly as given in §2.2. In addition to the work on producing text from
triples and similar, there is also the earlier work on multilingual natural language
generation, which was sometimes argued for as an alternative to translation precisely
because one would then be able to generate text fully within a target culture's norms
and not simply as a translation. Examples of such work and discussion include:
Kruijff, G.-J.; Teich, E.; Bateman, J. A.; Kruijff-Korbayová, I.; Skoumalová, H.; Sharoff, S.; Sokolova, L.; Hartley, T.; Staykova, K. & Hana, J. A multilingual system for text generation in three Slavic languages. Proceedings of the 18th International Conference on Computational Linguistics (COLING'2000), 2000, 474-480
Bateman, J. A.; Matthiessen, C. M. I. M. & Zeng, L. Multilingual natural language generation for multilingual software: a functional linguistic approach. Applied Artificial Intelligence, 1999, 13, 607-639
Hartley, A. & Paris, C. Multilingual document production: from support for translating to support for authoring. Machine Translation, 1997, 12, 109-129
and going all the way back to:
Kittredge, R.; Polguère, A. & Goldberg, E. Synthesizing weather reports from formatted data. Proceedings of the 11th International Conference on Computational Linguistics, International Committee on Computational Linguistics, 1986, 563-565
Some of the issues here may become more relevant still when generation moves beyond the single-sentence stage.
p. 21, right, l. 39: "the generated summary" --> "the generated summary sentence"