Live TV Captioning: Human or Machine Transcription?

  • Thread starter: Gear300
  • Tags: Machines
SUMMARY

The discussion centers on how live TV captioning is produced, debating whether it relies more on machine transcription or on skilled human stenographers. Participants note that while machines, such as the Google Pixel 6A's live caption feature, can generate captions, they often lag behind the spoken words and make errors, especially in complex contexts like news broadcasts or political speeches. The conversation highlights the importance of accuracy in live transcription, particularly in high-stakes settings such as courtrooms and business negotiations. The consensus is that a combination of human expertise and machine assistance is needed for optimal captioning quality.

PREREQUISITES
  • Understanding of live transcription technologies
  • Familiarity with stenotype keyboards
  • Knowledge of Natural Language Processing (NLP) applications
  • Experience with real-time captioning tools such as Zoom's live transcription and Google's Live Caption
NEXT STEPS
  • Research advancements in Natural Language Processing for live captioning
  • Explore the functionality and limitations of Google Pixel 6A's live caption feature
  • Learn about the role of stenographers in high-stakes environments
  • Investigate the effectiveness of various real-time transcription services for accessibility
USEFUL FOR

This discussion is beneficial for media professionals, accessibility advocates, and technology developers focused on improving live transcription accuracy and efficiency in various contexts.

Gear300
Are they really caption machines, or really well-trained stenographers? Like in an NFL or FIFA or other live broadcast. I took it that they might have been caption machines with highly advanced grammar compilers, but looking around, I guess they can just as well be very fast typists/keyboard-players.

When somebody asked two years ago which language would dominate the future (since English seems to hold most of the Internet), I thought that, ironically, NLP and fast machine translation would preserve linguistic pluralism, because in a hundred or so years we would have fast-translating, quick-witted grammar compilers.
 
Gear300 said:
Are they really caption machines, or really well-trained stenographers? Like in an NFL or FIFA or other live broadcast.
I think machines.

One thing I have noticed.
  1. For news programs, the captions lag behind the spoken words.
  2. For studio programs, the spoken words lag behind the captions.
I use captions on TV almost 100% of the time. But I learned never to use captions when watching Jeopardy, because the captions reveal the answer before either I or the contestants have time to answer.

I just got a new Google Pixel 6A phone. It has a feature that shows captions for live phone calls or podcasts. It works for downloaded podcasts even in airplane mode, so the captions must be generated locally on the phone itself. I suspect I may be able to select the language of the phone's caption engine independently of the language of the speech.
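That lag-then-correct behavior described above is typical of streaming recognizers: they display a partial hypothesis right away and revise it as more audio context arrives. Here is a toy sketch of that event flow (this is not Google's Live Caption code; `recognize` is a hypothetical stand-in for an on-device speech model):

```python
def stream_captions(audio_chunks, recognize):
    """Yield (kind, text) caption events for a stream of audio chunks.

    `recognize` stands in for an on-device speech model: given all audio
    heard so far, it returns its current best transcript hypothesis.
    """
    heard = []
    last = ""
    for chunk in audio_chunks:
        heard.append(chunk)
        hypothesis = recognize(heard)
        if hypothesis != last:
            yield ("partial", hypothesis)  # shown immediately, may change
            last = hypothesis
    yield ("final", last)  # committed once the stream ends


# Fake recognizer: with little context it mishears the phrase,
# then corrects itself once more audio has arrived.
def fake_recognizer(chunks):
    if len(chunks) < 3:
        return "wreck a nice"
    return "recognize speech"


for kind, text in stream_captions(["a1", "a2", "a3"], fake_recognizer):
    print(kind, "->", text)
```

The self-correction is exactly why live captions sometimes rewrite themselves mid-sentence: the partial "wreck a nice" is replaced once the model has enough context to commit to "recognize speech".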
 
Zoom does this today in real time. Not very well for scientific talks ("Is Jay-Sigh a rapper?"), unfortunately.
 
It depends. Machines are used in some cases, but TV programs with a decent budget still use humans to transcribe; computers still make too many mistakes, and that could cause real problems if what is being transcribed is, say, a news program or a political speech.
As far as I understand, the keyboards used are versions of stenotype keyboards.
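Stenotype keyboards work by chords ("strokes") rather than individual letters; software then translates stroke sequences into words via a dictionary, which is how open-source systems like Plover operate. A simplified sketch, with entirely made-up strokes (real steno theories encode sounds, and dictionaries hold hundreds of thousands of entries):

```python
# Hypothetical steno dictionary: keys are strokes, with multi-stroke
# entries joined by "/". The strokes are illustrative, not a real theory.
STENO_DICT = {
    "KAPGS": "captions",
    "HRAOEUF": "live",
    "TEL/SKROEUPGS": "transcriptions",  # hypothetical two-stroke entry
}


def translate(strokes):
    """Greedy longest-match translation of a list of strokes into words."""
    words, i = [], 0
    while i < len(strokes):
        # Prefer the longest dictionary match (here, up to 2 strokes).
        for n in (2, 1):
            key = "/".join(strokes[i:i + n])
            if key in STENO_DICT:
                words.append(STENO_DICT[key])
                i += n
                break
        else:
            words.append(strokes[i].lower())  # untranslated stroke
            i += 1
    return " ".join(words)


print(translate(["HRAOEUF", "KAPGS"]))
```

Because one chord can produce a whole word or syllable, trained stenographers sustain well over 200 words per minute, which is what makes real-time captioning by humans feasible at all.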

Note also that this is not as "exotic" as it might seem; there are lots of cases where transcription has to happen live. The most obvious is a courtroom, where what is said obviously has to be recorded accurately. There are also services that do live transcription of e.g. lectures for people with hearing impairments; a colleague of mine uses one of these services when listening to talks at conferences, since the built-in transcription feature in e.g. Teams doesn't work quite well enough.

I don't see why the stenographer would need to be especially "highly trained" just because it is live TV. Sure, mistakes might have more impact on TV, but even a purely "local" transcription still needs to be correct.
A friend of mine used to work for a company that transcribes (live) conference calls between businesses that are e.g. negotiating contracts; needless to say, mistakes can be costly.
 
