Deconstructing Human-Assisted Video Transcription and Annotation for Legislative Proceedings

Thorsten Ruprechter; Foaad Khosmood; Christian Guetl

doi:10.1145/3395316

Thorsten Ruprechter^*, Foaad Khosmood, Christian Guetl

^*Corresponding author for this work

Institute of Interactive Systems and Data Science (7060)

Research output: Contribution to journal › Article › peer-review

Abstract

Legislative proceedings present a rich source of multidimensional information that is crucial to citizens and journalists in a democratic system. At present, no fully automated solution exists that is capable of capturing all the necessary information during such proceedings. Even if professional-quality automated transcriptions existed, other tasks such as speaker or rhetorical position identifications are not fully automatable. This work focuses on improving and evaluating the transcription software used by the Digital Democracy initiative, named Transcription Tool. Human transcribers work to up-level state legislative proceedings using this tool. Five phases of tool improvements are introduced and for each phase, the resulting change in efficiency is measured. We investigate over 12,000 individual transcription sessions (2,300 hours of video), where each session is the record of one bill discussion. A set of about 3,200 sessions belonging to a single cohort of 20 transcribers is further evaluated. Through introduction of new tool features, human-assisted transcription efficiency can be improved by 19.4% over five phases. Furthermore, investigation into transcriber usage patterns reveals that transcription time is composed of passive time, speaker identification, text correction, tool startup, as well as splitting and merging utterances. We analyze and rank these as a contribution.

Original language	English
Article number	19
Number of pages	24
Journal	Digital Government: Research and Practice
Volume	1
Issue number	3
DOIs	https://doi.org/10.1145/3395316
Publication status	Published - 1 Nov 2020

Keywords

transcription software
log analysis
speaker recognition
user behavior
speech annotation
Government transparency
transcription system
speaker detection

Access to Document

10.1145/3395316 (Licence: CC BY 4.0)

https://doi.org/10.1145/3395316 (Licence: CC BY 4.0)

Cite this

@article{5cc5fce1c1f94770b12b3299283d9f50,

title = "Deconstructing Human-Assisted Video Transcription and Annotation for Legislative Proceedings",

abstract = "Legislative proceedings present a rich source of multidimensional information that is crucial to citizens and journalists in a democratic system. At present, no fully automated solution exists that is capable of capturing all the necessary information during such proceedings. Even if professional-quality automated transcriptions existed, other tasks such as speaker or rhetorical position identifications are not fully automatable. This work focuses on improving and evaluating the transcription software used by the Digital Democracy initiative, named Transcription Tool. Human transcribers work to up-level state legislative proceedings using this tool. Five phases of tool improvements are introduced and for each phase, the resulting change in efficiency is measured. We investigate over 12,000 individual transcription sessions (2,300 hours of video), where each session is the record of one bill discussion. A set of about 3,200 sessions belonging to a single cohort of 20 transcribers is further evaluated. Through introduction of new tool features, human-assisted transcription efficiency can be improved by 19.4% over five phases. Furthermore, investigation into transcriber usage patterns reveals that transcription time is composed of passive time, speaker identification, text correction, tool startup, as well as splitting and merging utterances. We analyze and rank these as a contribution.",

keywords = "transcription software, log analysis, speaker recognition, user behavior, speech annotation, Government transparency, transcription system, speaker detection",

author = "Thorsten Ruprechter and Foaad Khosmood and Christian Guetl",

year = "2020",

month = nov,

day = "1",

doi = "10.1145/3395316",

language = "English",

volume = "1",

journal = "Digital Government: Research and Practice",

issn = "2691-199X",

publisher = "Association for Computing Machinery",

number = "3",

}

TY - JOUR

T1 - Deconstructing Human-Assisted Video Transcription and Annotation for Legislative Proceedings

AU - Ruprechter, Thorsten

AU - Khosmood, Foaad

AU - Guetl, Christian

PY - 2020/11/1

Y1 - 2020/11/1

AB - Legislative proceedings present a rich source of multidimensional information that is crucial to citizens and journalists in a democratic system. At present, no fully automated solution exists that is capable of capturing all the necessary information during such proceedings. Even if professional-quality automated transcriptions existed, other tasks such as speaker or rhetorical position identifications are not fully automatable. This work focuses on improving and evaluating the transcription software used by the Digital Democracy initiative, named Transcription Tool. Human transcribers work to up-level state legislative proceedings using this tool. Five phases of tool improvements are introduced and for each phase, the resulting change in efficiency is measured. We investigate over 12,000 individual transcription sessions (2,300 hours of video), where each session is the record of one bill discussion. A set of about 3,200 sessions belonging to a single cohort of 20 transcribers is further evaluated. Through introduction of new tool features, human-assisted transcription efficiency can be improved by 19.4% over five phases. Furthermore, investigation into transcriber usage patterns reveals that transcription time is composed of passive time, speaker identification, text correction, tool startup, as well as splitting and merging utterances. We analyze and rank these as a contribution.

KW - transcription software

KW - log analysis

KW - speaker recognition

KW - user behavior

KW - speech annotation

KW - Government transparency

KW - transcription system

KW - speaker detection

U2 - 10.1145/3395316

DO - 10.1145/3395316

M3 - Article

SN - 2691-199X

VL - 1

JO - Digital Government: Research and Practice

JF - Digital Government: Research and Practice

IS - 3

M1 - 19

ER -
