Deconstructing Human-Assisted Video Transcription and Annotation for Legislative Proceedings

Thorsten Ruprechter*, Foaad Khosmood, Christian Guetl

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review


Legislative proceedings present a rich source of multidimensional information that is crucial to citizens and journalists in a democratic system. At present, no fully automated solution exists that is capable of capturing all the necessary information during such proceedings. Even if professional-quality automated transcriptions existed, other tasks such as speaker or rhetorical position identification are not fully automatable. This work focuses on improving and evaluating Transcription Tool, the transcription software used by the Digital Democracy initiative. Human transcribers work to up-level state legislative proceedings using this tool. Five phases of tool improvements are introduced, and for each phase the resulting change in efficiency is measured. We investigate over 12,000 individual transcription sessions (2,300 hours of video), where each session is the record of one bill discussion. A set of about 3,200 sessions belonging to a single cohort of 20 transcribers is further evaluated. Through the introduction of new tool features, human-assisted transcription efficiency can be improved by 19.4% over five phases. Furthermore, investigation into transcriber usage patterns reveals that transcription time is composed of passive time, speaker identification, text correction, tool startup, as well as splitting and merging utterances. We analyze and rank these components as a contribution.
Original language: English
Article number: 19
Number of pages: 24
Journal: Digital Government: Research and Practice
Issue number: 3
Publication status: Published - 1 Nov 2020


  • transcription software
  • log analysis
  • speaker recognition
  • user behavior
  • speech annotation
  • Government transparency
  • transcription system
  • speaker detection

