Automatic News Article Generation from Legislative Proceedings: A Phenom-Based Approach

Anastasiia Klimashevskaia, Richa Gadgil, Thomas Gerrity, Foaad Khosmood*, Christian Gütl, Patrick Howe

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review

Abstract

Algorithmic journalism refers to news stories constructed automatically by AI. There have been successful commercial implementations for news stories in sports, weather, financial reporting and similar domains with highly structured, well-defined tabular data sources. Other domains, such as local reporting, have not seen adoption of algorithmic journalism, so no automated reporting systems are available in these categories, a gap that can have important implications for the industry. In this paper, we demonstrate a novel approach for producing news stories on government legislative activity, an area that has not widely adopted algorithmic journalism. Our data source is state legislative proceedings, primarily the transcribed speeches and dialogue from floor sessions and committee hearings in US state legislatures. Specifically, we create a library of potential events called phenoms. We systematically analyze the transcripts for the presence of phenoms using a custom partial order planner. Each phenom, if present, contributes some natural language text to the generated article: either stating facts, quoting individuals or summarizing some aspect of the discussion. We evaluate two randomly chosen articles in a user study on Amazon Mechanical Turk with mostly Likert-scale questions. Our results indicate a high degree of achievement for accuracy of facts and readability of final content, with 13 of 22 participants for the first article and 19 of 20 participants for the second article agreeing or strongly agreeing that the articles included the most important facts of the hearings. Other results strengthen this finding in terms of accuracy, focus and writing quality.
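
The phenom idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Phenom` class, the two toy detectors, and the `generate_article` function are hypothetical names introduced here, and a plain topological sort over ordering constraints (Python's standard `graphlib`) stands in for the paper's custom partial order planner. The transcript is assumed to be a list of (speaker, utterance) pairs.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import Callable, Optional

# Assumption: a transcript is a list of (speaker, utterance) pairs.
Transcript = list[tuple[str, str]]


@dataclass
class Phenom:
    """Hypothetical phenom: a detector plus ordering constraints."""
    name: str
    # Returns the text this phenom contributes, or None if it is absent.
    detect: Callable[[Transcript], Optional[str]]
    # Names of phenoms whose text must appear before this one.
    after: set[str] = field(default_factory=set)


def vote_result(transcript: Transcript) -> Optional[str]:
    """Toy detector: states a fact if a passing vote is announced."""
    for _speaker, text in transcript:
        if "the bill passes" in text.lower():
            return "The committee passed the bill."
    return None


def notable_quote(transcript: Transcript) -> Optional[str]:
    """Toy detector: quotes the longest utterance in the transcript."""
    if not transcript:
        return None
    speaker, text = max(transcript, key=lambda pair: len(pair[1]))
    return f'"{text}" said {speaker}.'


def generate_article(transcript: Transcript, library: list[Phenom]) -> str:
    """Detect phenoms, then order their text by the declared constraints."""
    found = {}
    for phenom in library:
        text = phenom.detect(transcript)
        if text is not None:
            found[phenom.name] = (phenom, text)
    # Keep only constraints between phenoms that are actually present.
    graph = {
        name: {dep for dep in phenom.after if dep in found}
        for name, (phenom, _text) in found.items()
    }
    order = TopologicalSorter(graph).static_order()
    return " ".join(found[name][1] for name in order)


LIBRARY = [
    Phenom("quote", notable_quote, after={"vote"}),
    Phenom("vote", vote_result),
]
```

In this sketch, a phenom that is not detected contributes nothing, and a constraint that refers to an absent phenom is dropped, so the article is assembled only from what the transcript supports.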

Original language: English
Title of host publication: Statistical Language and Speech Processing - 9th International Conference, SLSP 2021, Proceedings
Editors: Luis Espinosa-Anke, Carlos Martín-Vide, Irena Spasic
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 15-26
Number of pages: 12
ISBN (Print): 9783030895785
Publication status: Published - 2021
Event: 9th International Conference on Statistical Language and Speech Processing, SLSP 2021 - Cardiff, United Kingdom
Duration: 23 Nov 2021 to 25 Nov 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13062 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 9th International Conference on Statistical Language and Speech Processing, SLSP 2021
Country/Territory: United Kingdom
City: Cardiff
Period: 23/11/21 to 25/11/21

Keywords

  • Algorithmic journalism
  • Artificial intelligence
  • Automatic summarization
  • Digital government
  • Natural language generation
  • Partial order planning

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
