Abstract:
Social media's ever-growing popularity has led to the emergence of the Social Semantic Web as an assembly of collective knowledge systems. This class of frameworks, algorithms, and tools aims to retrieve, process, and represent knowledge from human contributions. In this paper, we introduce a collective knowledge system that models how online conversations change over time, allowing stakeholders to easily observe trends and behavior patterns. Our framework relies on an original, graph-based algorithm, called Dynamic Discussion Topics Illustrator, that builds "semantic evolutionary maps" of user discussion topics, which we call Discussion Topic Flows. The Discussion Topic Flows result from matching comment clusters from sequential time windows according to their semantic similarity. The proposed system comprises the following phases: dataset preparation, text clustering, topic extraction, and, finally, the application of the Dynamic Discussion Topics Illustrator algorithm. We exemplify our method in a popular use case: automated extraction of user feedback from online software forums. For this purpose, we collect a real-world dataset of submissions posted throughout 2021 on r/Fedora, the subreddit dedicated to Fedora. We evaluate the correctness of the results from three distinct perspectives: i) the quality of the comment clusters, assessed using three popular internal measures, ii) the structure of the Discussion Topic Flows, expressed by their length and number of events, and iii) the explainability of the Discussion Topic Flows, measured through the coherence of their constituent topics.
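To make the cluster-matching idea concrete, the following is a minimal illustrative sketch, not the Dynamic Discussion Topics Illustrator itself. It assumes that each comment cluster is summarized by a centroid vector, that semantic similarity is cosine similarity, and that clusters are linked when the similarity reaches a fixed threshold of 0.8. The function names (`cosine`, `match_windows`, `build_flow_graph`), the threshold, and the toy vectors are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_windows(prev_clusters, next_clusters, threshold=0.8):
    """Return (prev_id, next_id, similarity) for every cluster pair from two
    consecutive windows whose centroid similarity reaches the threshold."""
    edges = []
    for p_id, p_vec in prev_clusters.items():
        for n_id, n_vec in next_clusters.items():
            sim = cosine(p_vec, n_vec)
            if sim >= threshold:
                edges.append((p_id, n_id, sim))
    return edges


def build_flow_graph(windows, threshold=0.8):
    """Link clusters across each pair of consecutive time windows.

    `windows` is a list of dicts mapping a cluster id to its centroid.
    Each edge is ((t, prev_id), (t + 1, next_id), similarity)."""
    graph = []
    for t in range(len(windows) - 1):
        for p_id, n_id, sim in match_windows(windows[t], windows[t + 1], threshold):
            graph.append(((t, p_id), (t + 1, n_id), sim))
    return graph


# Toy data: three time windows with hypothetical 2-D cluster centroids.
windows = [
    {"a": [1.0, 0.0], "b": [0.0, 1.0]},
    {"c": [0.9, 0.1], "d": [0.1, 0.9]},
    {"e": [1.0, 0.05]},
]
graph = build_flow_graph(windows)
for src, dst, sim in graph:
    print(src, "->", dst, round(sim, 3))
```

In this toy example, the chain of edges from cluster `a` through `c` to `e` plays the role of a single flow spanning three windows, while cluster `d` has no successor in the last window. A threshold-based link is only one possible design; the actual matching rule and similarity measure may differ.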