Discovery of Newsworthy Events in Twitter

Fernando Duarte, Óscar Mortágua Pereira, Rui L. Aguiar, "Discovery of Newsworthy Events in Twitter", Proc. 3rd IoTBDS - Intl. Conf. on IoT, Big Data and Security, Funchal, Madeira, Portugal, Mar 2018


The new communication paradigm established by Social Media, along with its growing popularity in recent years, has attracted increasing interest from several research fields. One such research area is the detection of events in Social Media. The purpose of this work is to implement an event detection system using tweets. A similar system proposed in the literature, chosen for its scientific relevance, serves as the base of this implementation. For this purpose, a segmentation algorithm implemented using a dynamic programming approach is proposed. Wikipedia is then leveraged as an additional factor to rank the resulting segments. The top k segments are then grouped according to their similarity, using a variant of the Jarvis-Patrick clustering algorithm that was implemented for this purpose and is also presented. The resulting candidate events are then filtered using an SVM model trained on annotated data, in order to retain only those related to real-world newsworthy events. The implemented system was tested with one month of data, a total of 1,673,762 tweets created in Portugal and mostly written in Portuguese. The system obtained a precision of 76.9% with a recall of 41.6%.
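
To make the grouping step more concrete, the following is a minimal sketch of the classic Jarvis-Patrick shared-nearest-neighbour rule, not the variant implemented in this work. The function name `jarvis_patrick`, the parameters `k` and `min_shared`, and the precomputed similarity matrix are assumptions made for illustration; in particular, this sketch excludes an item from its own neighbour list, which some formulations do not.

```python
from itertools import combinations


def jarvis_patrick(similarity, k, min_shared):
    """Cluster items 0..n-1 from a symmetric similarity matrix (hypothetical helper).

    Two items are linked when each appears in the other's k-nearest-neighbour
    list and the two lists share at least `min_shared` items. Clusters are the
    connected components of the resulting link graph.
    """
    n = len(similarity)

    # k most similar *other* items for each item (ties broken by index).
    neighbours = []
    for i in range(n):
        ranked = sorted((j for j in range(n) if j != i),
                        key=lambda j: (-similarity[i][j], j))
        neighbours.append(set(ranked[:k]))

    # Union-find structure used to merge linked items into clusters.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        mutual = j in neighbours[i] and i in neighbours[j]
        shared = len(neighbours[i] & neighbours[j])
        if mutual and shared >= min_shared:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy similarity matrix: items 0-2 resemble each other, as do items 3-4.
toy_similarity = [
    [1.00, 0.90, 0.80, 0.10, 0.05],
    [0.90, 1.00, 0.85, 0.10, 0.05],
    [0.80, 0.85, 1.00, 0.20, 0.10],
    [0.10, 0.10, 0.20, 1.00, 0.90],
    [0.05, 0.05, 0.10, 0.90, 1.00],
]
groups = jarvis_patrick(toy_similarity, k=2, min_shared=1)
```

In this sketch the items would stand for the top k ranked segments, and the similarity values would come from whatever segment similarity measure the system uses; raising `min_shared` makes the linking rule stricter and yields smaller clusters.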


