Publication

Text analysis for automatically identifying fake news

Azevedo, Lucas Lourenço de Sousa
Abstract
As a consequence of the ever-growing speed and quantity of both production and consumption of information, together with factors such as news source decentralization, ‘citizen journalism’, democratization of media and astroturfing (Lee, 2010), a subjective and often misleading depiction of facts characterizes the post-truth era (Dale, 2017), from which fake news emerges, a phenomenon that is effectively shaping the perception of reality for many individuals (Waldrop, 2017). In this scenario, manually checking and correcting disinformation across the internet is impractical, if not infeasible (Shao, Ciampaglia, et al., 2016), and a fast and reliable way to perform fact-checking becomes imperative. Supervised learning is a promising solution for automatic fact-checking but is hindered by the lack of suitable training data, i.e., sufficiently large amounts of organic news articles annotated for veracity. With that in mind, we present the two cores of the project: 1) the Veritas Dataset, the most complete collection of claims manually annotated for veracity. It is the only dataset to contain not only the veracity label for a checked claim but also the whole document from which the claim originated, which also makes it valuable for developing a number of related tasks, namely document retrieval, stance detection and claim validation. 2) LUX (Language Under eXamination), a deep learning classifier for fake news that exploits the unique completeness of Veritas’ data: given a text document, it evaluates a set of linguistic aspects that were shown to be correlated with deception and uses them as features to infer the document's likelihood of being a piece of fake news. Experiments were performed with various state-of-the-art (SOTA) language models, data collections and hyperparameters, and a comprehensive ablation analysis is also provided.
Publisher
NUI Galway
Rights
Attribution-NonCommercial-NoDerivs 3.0 Ireland
CC BY-NC-ND 3.0 IE