AI vs. foreign language

AI Thread Summary
Accusations that a piece of text is AI-generated often stem from grammar and vocabulary that do not align with native vernacular. Non-native speakers frequently produce language that diverges from native grammar and vocabulary because of learning methods that emphasize formal structures rather than natural usage. Recognizing text that could only be AI-generated, as opposed to text produced by a non-native speaker, is difficult: it is primarily a matter of assessing probabilities, and AI tools such as Grammarly's detector can assist by analyzing word choice and patterns.

AI-generated text tends to exhibit predictable word choices, which distinguishes it from the more varied vocabulary typically used by human writers. This predictability serves as a key indicator of AI authorship, contrary to the initial expectation in the thread that unusually novel word choices would be the giveaway. Similar detection methods are being developed in other fields, including chess, where moves that align too closely with engine recommendations raise suspicion of cheating.
snorkack
I sometimes see accusations that something is "AI"-written, citing grammar or vocabulary untypical of "natural language" - but the examples actually apply more specifically to "native vernacular".

A lot of people write in a language which is NOT their native vernacular. Learning a foreign language, especially from grammar books and dictionaries, and from teachers who are not themselves native speakers (and neither are the authors of the language textbooks), does not necessarily produce the same grammar or vocabulary choices a native speaker would make. An effort to avoid ambiguities or to make a point will also affect both vocabulary and grammar.

How do you recognize a style which only an AI could write and which no foreign-language speaker could plausibly write?
 
snorkack said:
How do you recognize a style which only an AI could write and which no foreign-language speaker could plausibly write?
You can't; all you can do is assess probabilities. And the best way to assess the probability is ... to use an AI.

See https://www.grammarly.com/ai-detector.
 
Some AI generators apply a kind of watermark embedded in the generated text. However, I don't think they make public exactly what they do.

There are three types:
- explicit with an authorship notation
- invisible, via word selection or hidden characters, i.e. zero-width characters in the text that are visible only when using a Unicode detector (see the sketch at the end of this post)
- using metadata in a generated file

The word selection one is the one they don't talk about much. This is where probabilities would be needed to estimate authorship. I imagine an ESL writer would be more likely to score a false positive here.
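As a rough illustration of the hidden-character variety, here is a minimal Python sketch that scans a string for common zero-width code points. Which characters (if any) a particular generator actually embeds is not public, so this list is purely illustrative.

```python
# Sketch: flag zero-width / invisible Unicode characters that could serve
# as a hidden watermark. The characters listed here are common zero-width
# code points, not any vendor's actual scheme.
INVISIBLE_CHARS = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u200c": "ZERO WIDTH NON-JOINER",
    "\u200d": "ZERO WIDTH JOINER",
    "\u2060": "WORD JOINER",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE (BOM)",
}

def find_invisible(text):
    """Return (index, name) pairs for every invisible character found."""
    return [(i, INVISIBLE_CHARS[ch]) for i, ch in enumerate(text)
            if ch in INVISIBLE_CHARS]

sample = "This looks normal\u200b but carries a hidden character."
for idx, name in find_invisible(sample):
    print(f"position {idx}: {name}")
```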
 
jedishrfu said:
I imagine an ESL writer would be more likely to score a false positive here.
Actually the opposite - large language models (LLMs), which is what we are referring to here as AIs, are so good at generating fluent text in a particular language that this clearly distinguishes it from ESL-authored text.
 
But the idea is to check how many words were chosen that are out of the ordinary and how novel the choices were. ESL writers would likely not choose words a native speaker would choose.

Similarly for an LLM: when I, as a native speaker, write a sentence and let Grammarly improve it, the word choices are richer, and it lets me decide which version I like better. Previously this kind of measure was known as a grade-level score, but now it could well be an AI content score. The Hemingway editor had a feature to score your writing as being at a certain grade level.

https://hemingwayapp.com/
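As a rough sketch of what a grade-level score measures, here is the Flesch-Kincaid grade formula in Python. The syllable counter is a crude heuristic, and this is not how Hemingway or Grammarly actually compute their scores.

```python
import re

def count_syllables(word):
    """Very rough syllable heuristic: count groups of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / max(1, len(sentences))
            + 11.8 * syllables / max(1, len(words)) - 15.59)

print(round(flesch_kincaid_grade("The cat sat on the mat. It was warm."), 1))
```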
 
jedishrfu said:
But the idea is to check how many words were chosen that are out of the ordinary and how novel the choices were.
Why do you think that that is how AI-generated text is detected?

In fact it is just the opposite - one key indicator of AI-generated text is how predictable the choice of words is, not how novel: the more predictable, the more likely to be AI (there are also other indicators).
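A minimal sketch of that idea, assuming the Hugging Face transformers library and GPT-2 as a stand-in scoring model (real detectors use different models and additional signals, so treat this as illustrative only): the lower the perplexity, the more predictable the passage was to the model.

```python
# Sketch: score how "predictable" a passage is to a language model.
# Per the argument above, lower perplexity (more predictable) is the
# suspicious direction. GPT-2 is only a stand-in here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def perplexity(text):
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Passing labels makes the model return the mean token cross-entropy.
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    return float(torch.exp(loss))

print(perplexity("The quick brown fox jumps over the lazy dog."))
```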
 
pbuk said:
Why do you think that that is how AI-generated text is detected?

In fact it is just the opposite - one key indicator of AI-generated text is how predictable the choice of words is, not how novel: the more predictable, the more likely to be AI (there are also other indicators).
This is how suspected cheaters in chess are assessed: it is very suspicious if their moves match the moves of a chess engine like Stockfish too closely. Similar techniques will probably be developed for other AI areas.
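For the chess analogy, here is a crude Python sketch using the python-chess library with a local Stockfish binary (the engine path and search depth are assumptions). It only counts how often one side's moves match the engine's first choice; real cheat-detection models are considerably more sophisticated.

```python
# Sketch: fraction of one side's moves that match the engine's top choice.
# A persistently high match rate is one red flag, not proof of cheating.
import chess
import chess.engine
import chess.pgn

def engine_match_rate(game, color, engine_path="stockfish", depth=12):
    """Return the share of `color`'s moves that equal the engine's best move."""
    engine = chess.engine.SimpleEngine.popen_uci(engine_path)
    try:
        board = game.board()
        matches = total = 0
        for move in game.mainline_moves():
            if board.turn == color:
                best = engine.play(board, chess.engine.Limit(depth=depth)).move
                matches += (move == best)
                total += 1
            board.push(move)
        return matches / total if total else 0.0
    finally:
        engine.quit()
```

A game would typically be loaded first with chess.pgn.read_game() and the result passed in, e.g. engine_match_rate(game, chess.WHITE).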
 