Live TV Captioning: Human or Machine Transcription?

  • Thread starter Gear300
  • Tags Machines

Discussion Overview

The discussion revolves around the methods used for live TV captioning, specifically whether the captions are generated by machines or human stenographers. Participants explore the implications of these methods in various contexts, including news broadcasts, sports events, and other live programming.

Discussion Character

  • Debate/contested
  • Technical explanation
  • Conceptual clarification

Main Points Raised

  • Some participants question whether live captions are produced by machines or skilled stenographers, noting the potential for advanced grammar compilers in machine-generated captions.
  • One participant observes that captions often lag behind spoken words in news programs, while in studio programs, the opposite occurs, suggesting a difference in transcription methods.
  • A participant mentions the Google Pixel 6A's capability to generate captions for live phone calls and podcasts, indicating that local machine generation is possible.
  • Another participant highlights that while machines are used in some cases, human transcription is preferred for high-stakes situations like news broadcasts or political speeches due to the potential for machine errors.
  • It is noted that stenotype keyboards are commonly used for live transcription, and the accuracy of transcription is critical in various contexts, including courtrooms and lectures for the hearing impaired.
  • A participant challenges the notion that stenographers need to be "highly trained" specifically for live TV, arguing that accuracy is essential regardless of the context.

Areas of Agreement / Disagreement

Participants express differing views on the reliance on machines versus human transcription for live captioning, with no consensus reached on the predominant method used in various contexts.

Contextual Notes

Participants mention specific contexts where transcription accuracy is crucial, such as news programs and legal settings, but do not resolve the implications of machine versus human transcription in these scenarios.

Gear300
Are they really caption machines, or really well-trained stenographers? Like in an NFL or FIFA or other live broadcast. I took it that they might have been caption machines with highly advanced grammar compilers, but looking around, I guess they can just as well be very fast typists/keyboard-players.

When somebody asked two years ago which language would dominate the future (because English seems to hold most of the Internet), I thought that ironically NLP and fast-translating machines would preserve pluralism in language, mostly because in a hundred or so years, we would have fast-translating or quick-witted grammar compilers.
 
Gear300 said:
Are they really caption machines, or really well-trained stenographers? Like in an NFL or FIFA or other live broadcast.
I think machines.

One thing I have noticed.
  1. For news programs, the captions lag behind the spoken words.
  2. For studio programs, the spoken words lag behind the captions.
I use captions on TV almost 100% of the time. But I learned to never use captions when watching Jeopardy because the captions reveal the answer before either I or the contestant have time to answer.

I just got a new Google Pixel 6A phone. It has the feature of showing captions for live phone calls or podcasts. It works for downloaded podcasts even when in airplane mode. Obviously, the captions must be generated locally in the phone's machine. I suspect that I may be able to select the language of the phone's caption machine independent of the language of the speech.
 
Zoom does this today in real time. Not very well for scientific talks ("Is Jay-Sigh a rapper?"), unfortunately.
 
It depends. Machines are used in some cases, but TV programs with a decent budget still use humans to transcribe. Computers still make too many mistakes, and that could cause real problems if what is being transcribed is, say, a news program or a political speech.
As far as I understand, the keyboards used are versions of stenotype keyboards.

Note also that this is not as "exotic" as it might seem. There are lots of cases where transcription has to happen live, the most obvious being a courtroom, where what is said obviously has to be recorded accurately. There are also services that do live transcription of e.g. lectures for people with hearing impairments. A colleague of mine uses one of these services when listening to talks at conferences, because the built-in transcription feature in e.g. Teams doesn't work quite well enough.

I don't see why the stenographer would need to be "highly trained" just because it is live TV. Sure, mistakes might have more impact on TV, but even if it is just a "local" transcription it still needs to be correct.
A friend of mine used to work for a company that transcribes (live) conference calls between businesses that are e.g. negotiating contracts; needless to say, mistakes can be costly.
 
