Social Media content creation glitches

  • Thread starter DaveC426913
  • Start date
  • Tags
    Social media
SUMMARY

The discussion centers on recurring glitches in Facebook posts where images of text appear corrupted, with a few words obfuscated. The thread starter theorizes that this is a deliberate attempt to thwart optical character recognition (OCR) content scrapers that automatically extract and repost content. One participant notes that the visual damage resembles what an incorrectly set black/white threshold during OCR, or a printer short of toner or ink, would produce. Another suggests the text may itself have been copied, with the errors meant to defeat OCR-based checks for copyright violation. The thread starter maintains that the glitches are too frequent and too consistent to be accidental.

PREREQUISITES
  • Understanding of optical character recognition (OCR) technology
  • Familiarity with social media content creation practices
  • Knowledge of copyright issues related to digital content
  • Basic graphic design principles regarding image quality
NEXT STEPS
  • Research advanced OCR techniques and their limitations
  • Explore methods for improving image clarity in digital content
  • Investigate copyright protection strategies for online content
  • Learn about social media algorithms and their impact on content visibility
USEFUL FOR

Content creators, social media managers, digital marketers, and anyone interested in understanding the implications of OCR technology on online content sharing.

DaveC426913
I see this frequently on Facebook (which is the only social media I frequent).

1702180686719.png


To be clear, what we are seeing here is a Facebook post, but the post is a graphical image - a capture of a block of text (which could just as easily have been posted as plain text).

Nothing untoward so far...

But there is an additional element here: the graphical image has been corrupted - a few words have been obfuscated.

It happens far too often to be anomalous. It has to be deliberate.

My best theory is that it is an attempt to foil some sort of OCR* content scraper that presumably just grabs content automatically and reposts it as if it's their own.

*optical character recognition

I guess that makes sense, I just didn't know such bots existed. Although, now that I think about it, they would be inevitable.

Any corroborating or alternative ideas?
 
DaveC426913 said:
My best theory is that it is an attempt to foil some sort of OCR* content scraper that presumably just grabs content automatically and reposts it as if it's their own.
That could be repaired automatically by a spell checker or AI.
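As a rough illustration of the spell-checker point, here is a minimal sketch using Python's standard `difflib` module. The vocabulary and the damaged words are hypothetical stand-ins, not taken from the image in the first post; a real repair pass would use a full dictionary or a language model.

```python
import difflib

# Hypothetical vocabulary for illustration only; a real repair pass
# would use a full dictionary.
VOCAB = ["manager", "meeting", "customer", "because", "without", "company"]

def repair_word(word, vocab=VOCAB, cutoff=0.6):
    """Return the closest vocabulary word, or the word unchanged if none is close."""
    lowered = word.lower()
    if lowered in vocab:
        return word
    # get_close_matches ranks candidates by SequenceMatcher similarity ratio.
    matches = difflib.get_close_matches(lowered, vocab, n=1, cutoff=cutoff)
    return matches[0] if matches else word

def repair_text(text):
    """Repair each whitespace-separated word independently."""
    return " ".join(repair_word(w) for w in text.split())
```

For example, `repair_word("mana?er")` finds `manager`, because only one character differs. Obfuscation that removes most of a word would not be recoverable this way.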

The damage appears optical, as if a black/white threshold had been set incorrectly during the OCR, or a printer was short of toner or ink.
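To show what a badly chosen black/white threshold does, here is a minimal sketch of binarization on a tiny grayscale patch. The pixel values and the threshold choices are made up for illustration and are not measurements from the posted image.

```python
def binarize(gray, threshold):
    """Map 0-255 grayscale pixels to 1 (ink) if darker than threshold, else 0 (paper)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def ink_count(bitmap):
    """Count the pixels classified as ink."""
    return sum(sum(row) for row in bitmap)

# Hypothetical 3x5 patch: a dark stroke (60), a faint stroke (140), paper (230).
patch = [
    [230, 60, 230, 140, 230],
    [230, 60, 230, 140, 230],
    [230, 60, 230, 140, 230],
]

good = binarize(patch, 180)      # keeps both strokes
too_low = binarize(patch, 100)   # the faint stroke drops out
too_high = binarize(patch, 240)  # paper turns to ink as well
```

With the threshold too low, faint strokes vanish and letters break up. With it too high, the background fills in. Either way, damage of this kind would tend to follow faint or dark regions of the page rather than pick out individual words.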

Why does social media text need to make sense, or be readable without misinterpretation?
Why cut off your nose to spite your face?
 
The text looks like one of the genre of "I'm the only sane man in a world of militant strawman feminists / liberals / racists / jobsworths / other / [delete as applicable]" allegedly true stories that are almost certainly actually "inspired by real events" in the same sense as Hollywood biopics. So I wonder if the text is already stolen from Reddit or somewhere and the errors are a weak attempt to prevent OCR-based checks for copyright violation.
 
Ibix said:
The text looks like one of the genre of "I'm the only sane man in a world of militant strawman feminists / liberals / racists / jobsworths / other / [delete as applicable]" allegedly true stories that are almost certainly actually "inspired by real events" in the same sense as Hollywood biopics.
Frankly, I didn't even bother reading the text. Couldn't tell you what the article is about.

Skipping right to the end, you can see it's only a partial story, offered as clickbait. I'm so jaded to this kind of clickbait on FB that I just automatically scroll past it - with a flicker of irritation crossing my brow as I do so.

Ibix said:
... I wonder if the text is already stolen from Reddit or somewhere and the errors are a weak attempt to prevent OCR-based checks for copyright violation.
That's a new angle. I hadn't thought of that.
 
Baluncore said:
The damage appears optical, as if a black/white threshold had been set incorrectly during the OCR, or a printer was short of toner or ink.
I really don't think so.

Again, this happens so often - and is so consistent in its effect - that it cannot be an accident.
 
