Classifying Fake and Real Neurally Generated News

Anitha Govindaraju, Josephine Griffith

Research output: Chapter in Book or Conference Publication/Proceeding > Conference Publication > peer-review

1 Citation (Scopus)

Abstract

In this data era, with Natural Language Processing (NLP) techniques such as 'Language Modelling' showing great progress, the idea of 'Automated Journalism', i.e., generating news articles with computer programs from existing news headlines or the body of a news article, is emerging. Such advancements bring not only progress but also certain disadvantages. Specifically, adversaries are using these techniques to create fake news articles called 'Neural fake news'. Such news imitates the style and appearance of real news to generate targeted propaganda that is used to confuse people. Humans find neural fake news more trustworthy than human-written disinformation [1]. The goal of this research is to classify various types of neurally generated news as real or fake based on its genuineness. In a real-world scenario, humans evaluate the genuineness of news by relying on a model of the world, i.e., by checking whether the content of the news matches the content from a reliable news source (e.g., Associated Press). In this work we use a Recurrent Neural Network (RNN), specifically a Siamese Bi-directional LSTM (BiLSTM), as a Semantic Textual Similarity (STS) model that compares real news with neural news to determine whether the neural news is fake. To train and test the model, three datasets were created: the first contains real news extracted from a common crawl; the second is a neural fake news dataset generated using language modelling techniques; the third is a neural real news dataset generated using textual data augmentation techniques. It is found that the Siamese BiLSTM model can accurately compute similarity scores between real news and neural news, allowing the neural news to be classified as neural real or neural fake news.
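
The comparison step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words `encode` function is a deliberately simple stand-in for the Siamese BiLSTM encoder, cosine similarity is an assumed similarity measure, and the names `encode`, `cosine_similarity`, `classify` and the `threshold` value of 0.5 are all hypothetical. What it does preserve is the Siamese pattern: the same encoder is applied to both texts, a similarity score is computed, and the score is thresholded into a label.

```python
import math
from collections import Counter


def encode(text):
    """Stand-in encoder (assumption): bag-of-words counts.

    The paper uses a Siamese BiLSTM here; the key property kept in this
    sketch is that BOTH inputs pass through the same encoder.
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (assumed measure)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no measurable similarity
    return dot / (norm_a * norm_b)


def classify(real_news, neural_news, threshold=0.5):
    """Label neural news by its similarity to a trusted real article.

    The threshold is a hypothetical value chosen for illustration only.
    """
    score = cosine_similarity(encode(real_news), encode(neural_news))
    label = "neural real" if score >= threshold else "neural fake"
    return label, score
```

A call such as `classify(trusted_article, generated_article)` returns a label together with the underlying score; in practice the threshold would have to be tuned on held-out data rather than fixed in advance.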

Original language: English
Title of host publication: Proceedings of the 2021 Swedish Workshop on Data Science, SweDS 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665418300
Publication status: Published - 2021
Event: 9th Swedish Workshop on Data Science, SweDS 2021 - Virtual, Vaxjo, Sweden
Duration: 2 Dec 2021 to 3 Dec 2021

Publication series

Name: Proceedings of the 2021 Swedish Workshop on Data Science, SweDS 2021

Conference

Conference: 9th Swedish Workshop on Data Science, SweDS 2021
Country/Territory: Sweden
City: Virtual, Vaxjo
Period: 2/12/21 to 3/12/21

Keywords

  • Language model
  • Machine generated news
  • Neural fake news
  • Siamese Bi-LSTM neural network
  • STS
