Multi-Task Learning Framework for Stance Detection and Veracity Prediction

Tracking #: 2827-4041

Authors: 
Fatima T. Alkhawaldeh
Tommy Yuan
Dimitar Kazakov

Responsible editor: 
Maria Maleshkova

Submission type: 
Full Paper
Abstract: 
As more people rely on online media, identifying trustworthy information becomes increasingly challenging. As a result, stance detection and rumour detection have gained prominence. Although the two tasks are highly correlated and should be performed concurrently, most existing models train them independently. Additionally, while each target topic may attract numerous conflicting claims, previous work treated each claim independently, so conflicting claims could wrongly be assigned the same truth label. Moreover, because some lengthy rumour posts cover a wide range of topics, the stance of a post can be determined with respect to a variety of target topics, and existing models may take a position toward the wrong target topic, leading to an incorrect veracity judgement. This article addresses these problems by proposing a framework for stance detection and veracity prediction that takes source credibility into account and compares the strength of arguments in order to forecast the truth. Experiments are conducted on two well-known datasets: Emergent and RumourEval-2019. On these gold-standard datasets, the results demonstrate that the proposed framework outperforms existing methods.
Tags: 
Reviewed

Decision/Status: 
Major Revision

Solicited Reviews:
Review #1
Anonymous submitted on 11/Aug/2021
Suggestion:
Major Revision
Review Comment:

(1) originality
The paper proposes a novel multi-task learning mechanism to jointly predict rumour stance and veracity in order to improve stance detection, exploiting the fact that both tasks are highly correlated. While the tasks may be of interest to the Semantic Web community, the methods used in this paper are based solely on NLP. I am not sure this paper is a good match for the Semantic Web Journal, as I do not see much relevance to the journal's scope. The authors should clearly describe the novelty of their work in terms of Semantic Web methods. I also recommend that they look at the literature (e.g., DOI: 10.3233/SW-2012-0073) on how argumentation can be represented and how it can affect rumour stance and veracity prediction.

(2) significance of the results
The results look significant, but it is difficult to assess their reproducibility, as no code has been shared.

(3) quality of writing.
Overall, the writing quality is acceptable, but adding a background section on stance and veracity detection and on argumentation-based truth discovery would improve it.

Review #2
Anonymous submitted on 12/Nov/2021
Suggestion:
Reject
Review Comment:

Summary:
The core of this article is identifying trustworthy information on social media, which is challenged by several problems, such as target topics containing numerous conflicting claims. The authors present a multi-task learning framework for stance detection and veracity prediction, namely the Argumentation-based Truth Discovery Model, to discover multiple truths from conflicting sources. Experimental results on Emergent and RumourEval-2019 Tasks A+B show the performance of the proposed model.

(1) Originality:
To the best of my knowledge, applying multi-task learning to stance detection and veracity prediction is not a novel idea; many similar works exist, such as:
https://aclanthology.org/D19-6603.pdf
https://arxiv.org/pdf/2007.07803v2.pdf
https://aclanthology.org/D19-1485/
https://aclanthology.org/C18-1288/
Also, its main contributions to the knowledge of the SWJ community do not appear significant.

(2) Significance of the results:
The results on two public datasets (Emergent, RumourEval-2019 Tasks A+B) demonstrate the effectiveness of the proposed methods, and the authors derive nine observations from the results. Nevertheless, I find it hard to see a significant contribution to the SWJ community, not least because of the limited novelty.

(3) Quality of writing
This article is not easy to follow, nor is it well written. In addition to typos (e.g., Section 3.3, line 63, “target’=”) and non-standard mathematical notation, there are many ungrammatical sentences (e.g., Section 1, para 5, lines 1–3, or Section 3.3, para 1, lines 16–18). The article is also not concise in describing its core work.

This article does not provide any publicly available resources (e.g., source code, demonstrations) for replicating the experiments, even though public datasets (Emergent, RumourEval-2019 Tasks A+B) for stance detection and veracity prediction were used.

This article is lengthy, especially in describing the architectures of the different components of the proposed model. The descriptions and explanations are excessive, for the following reasons:
(1) The descriptions could be replaced with clear architecture diagrams, e.g., for the clause selection component in paragraph 3 of Section 3.4, the article (relevant clauses), and the claim encoder and decoder in Sections 3.4.1 and 3.4.2.
(2) Components that are reused, such as the GRU, should not be described more than once (the standard GRU update is restated after this list for reference); see paragraph 3 of Section 3.4 and Section 3.4.1.
(3) It is suggested that all of the architecture diagrams in the article should be re-drawn since they are unable to give readers any detailed information about the proposed model and its several components in a direct way.
(4) In paragraph 4 of Section 3.4, the authors repeatedly explain the attention mechanism. Similarly, in paragraph 5 of the same section, the softmax layer is described again.
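
For reference, the standard GRU update, which need only be given once (in one common convention), is:

    z_t = σ(W_z x_t + U_z h_{t−1} + b_z)
    r_t = σ(W_r x_t + U_r h_{t−1} + b_r)
    h̃_t = tanh(W_h x_t + U_h (r_t ⊙ h_{t−1}) + b_h)
    h_t = (1 − z_t) ⊙ h_{t−1} + z_t ⊙ h̃_t

A single statement of these equations, with a pointer from each component that uses them, would shorten the paper considerably.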

In paragraph 2 of Section 3.2, the authors mention that this work employs a pointer-generator architecture with attention and copy mechanisms to create a claim-target-topic-based generator.
What is a pointer generator with attention, and what is its architecture? A single green box in Fig. 2 and a few lines of text are not sufficient to explain it.
Which copy mechanisms are used? I cannot find any formal description of them.
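
For reference, in the standard pointer-generator network (See et al., 2017), which the paper appears to build on, the copy mechanism interpolates between generating a word from the vocabulary and copying it from the source via the attention distribution a^t:

    p_gen = σ(w_h · h*_t + w_s · s_t + w_x · x_t + b_ptr)
    P(w) = p_gen · P_vocab(w) + (1 − p_gen) · Σ_{i: w_i = w} a^t_i

where h*_t is the attention context vector, s_t the decoder state, and x_t the decoder input. If the authors deviate from this formulation, the deviation should be stated formally.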
What is JSP in Fig. 3? Is it JSD (Jensen–Shannon divergence)?
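If JSD is indeed meant, it should be defined, e.g. as

    JSD(P ‖ Q) = (1/2) KL(P ‖ M) + (1/2) KL(Q ‖ M), with M = (1/2)(P + Q),

where KL denotes the Kullback–Leibler divergence; the label in Fig. 3 should then be corrected accordingly.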
The mathematical notation in this article is highly inconsistent, non-standard, and unclear. For example, in Section 3.1: [h1,…,ht] (paragraph 7); g, j, k (paragraph 8); l, F, j, Fl (paragraph 9); q(k), alpha(k) (paragraph 10, which do not match equation (3)). This issue persists throughout the article.
The article lacks its most important architecture diagram: that of the multi-task learning and soft-parameter-sharing network. It is suggested that this diagram be added, along with the corresponding formulas.
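
To illustrate what such a diagram and formulas could capture, here is a minimal soft-parameter-sharing sketch in PyTorch; all names and dimensions are hypothetical and not taken from the paper:

    import torch.nn as nn

    class SoftSharedMTL(nn.Module):
        # Each task keeps its own GRU encoder; a soft-sharing penalty pulls
        # corresponding encoder weights towards each other instead of tying them.
        def __init__(self, in_dim, hidden, n_stance, n_veracity):
            super().__init__()
            self.stance_enc = nn.GRU(in_dim, hidden, batch_first=True)
            self.veracity_enc = nn.GRU(in_dim, hidden, batch_first=True)
            self.stance_head = nn.Linear(hidden, n_stance)
            self.veracity_head = nn.Linear(hidden, n_veracity)

        def forward(self, x):                 # x: (batch, seq, in_dim)
            _, h_s = self.stance_enc(x)       # h_s: (1, batch, hidden)
            _, h_v = self.veracity_enc(x)
            return self.stance_head(h_s[-1]), self.veracity_head(h_v[-1])

        def sharing_penalty(self):
            # L2 distance between corresponding parameters of the two encoders
            return sum((p - q).pow(2).sum()
                       for p, q in zip(self.stance_enc.parameters(),
                                       self.veracity_enc.parameters()))

    # Joint objective (lam weights the soft-sharing term):
    # loss = ce_stance + ce_veracity + lam * model.sharing_penalty()

Even a sketch at this level of detail, together with the actual formulas, would make the sharing scheme unambiguous.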
In paragraphs 5 and 6 of Section 3.4, the authors mention the loss function but do not provide its formal definition; please add it. Moreover, the authors state that the model is trained with cross-entropy, yet the loss function is said to compute the cosine similarity between the target topic embedding and the hidden state of the t-th clause. How is the model actually trained? More details should be provided.
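
One plausible reading, offered here purely as an assumption about the authors' intent, is that the objective combines a cross-entropy classification term with a cosine-alignment term, e.g.:

    import torch.nn.functional as F

    def combined_loss(logits, labels, clause_states, topic_emb, lam=0.1):
        # Cross-entropy on the task labels ...
        ce = F.cross_entropy(logits, labels)
        # ... plus a term encouraging each clause's hidden state
        # (batch, T, hidden) to align with the topic embedding (batch, hidden)
        cos = F.cosine_similarity(clause_states, topic_emb.unsqueeze(1), dim=-1)
        return ce + lam * (1.0 - cos).mean()

Whether this is what the authors do cannot be determined from the text; the paper should state the objective formally.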
In paragraph 2 of Section 4.5, what is a "target-topic-aware, target-specific claim"?

The reviewer believes that the paper is not related to the topics of the Semantic Web Journal. This work is out of the journal's scope, since it uses neither existing knowledge graphs (KGs) nor ones built by the authors.

Review #3
Anonymous submitted on 04/Apr/2022
Suggestion:
Major Revision
Review Comment:

Multi-Task Learning Framework for Stance Detection and Veracity Prediction

(1) originality
The paper proposes an approach for stance detection and veracity prediction, with an emphasis on the benefit of handling both tasks together, instead of independently as in most existing approaches. The work also includes a framework that considers the credibility of the source as well as the strength of the arguments when determining or forecasting the truth.

The domain at hand is very rich in terms of approaches and contributions. Therefore, it is quite difficult to propose a completely novel and original approach. In this context, the paper is incremental by considering previous approaches in the field and offering added value by combining stance detection and veracity prediction. Still, the authors have submitted an original piece of work.

The addressed topics have a long tradition in the Semantic Web community, especially when it comes to fact recognition and entity detection. However, this paper does not use semantic technologies at the core of the proposed approach. It is crucial that the authors state the connection between the Semantic Web and their work. Otherwise, an AI or NLP journal would probably be a better option for submission.
(2) significance of the results

The results of the evaluation and experiments show that the authors can demonstrate the expected improvement from combining the two aspects. However, the significance for the Semantic Web community should be clearly stated; this is currently missing.
Detailed comments:
Introduction:
“Because it is difficult and expensive to hire qualified journalists and other experts to verify published posts,” – I do not believe that this is the main issue. Rumours can be released on purpose, and with the growing number of websites it is hardly possible to validate everything manually. Furthermore, some content is even automatically or semi-automatically generated. The issue goes far beyond finding good journalists: the volume simply cannot be handled manually.

“This work addresses three issues identified from the literature that contribute to the failure of veracity prediction systems to achieve acceptable detection performance” – This statement is very strong, and it can hardly be said that current solutions are a failure. Be more specific about the problem: are the systems not good enough, or is the problem too complex? State exactly what the issue is, also in relation to your work.

“As a result, the two tasks, stance detection and veracity prediction can be learned concurrently to maximise their utility.” – This cannot be stated as a fact in the introduction. Instead, I strongly suggest formulating it as a hypothesis that is then validated by the work presented in the paper.
“previous models attempted to detect the general stance without considering the primary or the most concerned target topic.” – Here it would be very helpful to introduce an example. It would also help to have an example that clearly shows how stance detection and veracity prediction, when combined, are actually more accurate.
“Each claim's target topic is extracted independently. As a result, the target topics with the most similar embeddings to the primary target topic is selected for analysis alongside the target topic. Rumors from reliable sources are weighted heavily in the outcome, whereas rumours from unreliable sources are ignored.” – This approach can be very tricky, since there can be bias even when it comes to true facts. People can hold different views because of historical background, political opinions, or religious beliefs, and will therefore talk about a fact in different ways. Such bias would actually falsify the results.
I strongly suggest restructuring the introduction by stating a hypothesis and describing the aspects that will be investigated and the corresponding contributions. The focus of the contributions should be on the research work, not the implementation; currently, the impression is that the paper presents the implementation of a framework. Furthermore, I suggest adding a motivating example and removing any judgement (without clear proof) about causes or the state of the art.
Related Work
The overviews given in the summary tables are really nice and helpful; it takes a lot of work to create such summaries. Still, I strongly suggest classifying the related papers more precisely: what is of interest is which features each approach uses and what the specific target of each paper is. You can keep the measures, but some further comparisons would be very helpful, since you have already done the analysis.
“In general, there are four types of methods for truth discovery that have been used in previous research.:“ – it is not clear where this statement comes from. This needs to be motivated or rephrased.
2.5 Analysis
Ideally, the content of this section should follow directly from the summary tables in the related work section. Otherwise, the statements come as a bit of a surprise and are not motivated.
3.1 Overview
This is not a suitable place to introduce the model. This should be done in a separate designated section.
Fig. 1 – This is likewise not the best place to introduce the model. It might also be useful to state the differences from multi-modal learning approaches, since the architecture seems very similar.
“If the probability is >=0.5, then the source is selected as a candidate trustworthy source.“ How was this determined?
“If the probability is >= 0.5, then the claim is selected as a candidate truth.“ How was this determined?
(3) quality of writing.
The paper would benefit from some restructuring, especially the introduction and the experiments and results sections. Furthermore, implementation decisions should be clearly separated from work on the approach itself; currently, the two are mixed from time to time.

The line of argumentation should be improved: no statement should be made without clear motivation, or without declaring that it is taken as an assumption. Currently, there are a number of statements that are not clearly motivated.

Detailed comments:

Section “Experiments and Results”
This is a bit too late to introduce the research questions. I suggest moving them to the introduction and, in this section, explaining how they were specifically evaluated.

There seems to be a problem with the formatting of formulas (25) and (26) on page 14 (e.g., v_t1,i?), and also of (30). Why is v_t2 not normalised by 1/n?
Formula (50) – is this the correct way to define a concatenation? The same applies to (48).
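For reference, concatenation of two vectors is conventionally written z = [u; v] (stacking u ∈ R^m and v ∈ R^n into z ∈ R^{m+n}) or z = u ⊕ v; whichever notation formulas (48) and (50) intend should be defined once and used consistently.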
Double-check formula (65)
Further typographical issues include:
“X1={s,f,h} ,” (spacing before the comma);
“(e.g. …) and its supporting replies (e.g. …) are” (unfilled placeholders);
“Manhattan LSTM model [105]is used because….” (missing space after the citation).
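Incidentally, for reference, the Manhattan LSTM cited as [105] (Mueller and Thyagarajan, 2016) scores the similarity of two encoded inputs as exp(−‖h^(a) − h^(b)‖_1), i.e., the exponentiated negative L1 (Manhattan) distance between the final hidden states; a sentence to this effect would justify the choice better than the bare citation.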

The formulas should be double-checked and put into a single consistent format.

Some sections are repetitive, especially when it comes to the motivation of the work and the fundamentals used.

I strongly suggest grouping the mathematical formulas into smaller blocks and using a diagram, ideally an architecture diagram, to guide the reader. Otherwise, it is quite difficult to follow which formula performs what function in which part of the big picture.