ChatGPT Examples, Good and Bad

Hornbein · Feb 19, 2025

On my second foray ChatGPT flunked utterly. When repeatedly asked to draw a square it instead drew cubes. Fail.

DeepSeek can't draw anything so it wasn't any better.

jack action · Feb 19, 2025

Hornbein said:

And it's free.

Then you're the product.

Hornbein · Feb 19, 2025

jack action said:

Then you're the product.

I've gotten used to it.

Algr · Feb 19, 2025

Hornbein said:

DeepSeek can't draw anything so it wasn't any better.

The page that I used could create images, as well as analyzing them. But the results of that were not as good as the local renders I can do with DrawThings on my computer.

russ_watters · Feb 19, 2025

Hornbein said:

I've gotten used to it.

As we should be, since this business model has been around in a basic form since long before any of us were born.

Hornbein · Feb 19, 2025

russ_watters said:

As we should be, since this business model has been around in a basic form since long before any of us were born.

Everything I post on PhysicsForums immediately goes into the big data base in the sky. I think that's a good thing. I also think targeted advertising is an improvement.

You have no privacy. Get over it.
-- Scott McNealy. 1999

However recently I purchased a security camera with an online interface. I had to log in to access this. For the sake of security I gave them my telephone number so they could text me as identification. To get the extended service I needed there was a monthly fee. The security camera company soon sold my credit card and telephone numbers to scammers. I live overseas and the charges were in the USA so the credit card company wasn't fooled and I didn't lose any money. Except for the $100 it costs to mail a new card to Indonesia without it getting stolen because it might be a gift card.

nsaspook · Feb 20, 2025

My daughter (third year undergraduate student for cs and physics), is giving a speech (for her public speaking class) on why ChatGPT and it's cousins should not be used in the early stages of technical education. We been talking and it goes something like this.

It's not a learning tool, it's a give me answers tool that can short circuit the learning phases that develop the foundational skills (analysis components) needed for a reasoned imagination to see beyond the currently known.

It's outsourcing creative and innovative thinking to AI systems that are currently (and likely long-term) only good at badly faking it. You need the necessary skills first to be able to take something you get from AI, to tell right from wrong from totally perfect nonsense and make it into something useful.

Astronuc · Feb 21, 2025

nsaspook said:

My daughter (third year undergraduate student for cs and physics), is giving a speech (for her public speaking class) on why ChatGPT and it's cousins should not be used in the early stages of technical education. We been talking and it goes something like this.

Smart progeny, like her dad!

Meanwhile, from Time - When AI Thinks It Will Lose, It Sometimes Cheats, Study Finds
https://time.com/7259395/ai-chess-cheating-palisade-research/

I have not reviewed the study, but it's interesting to contemplate.

while IBM’s Deep Blue defeated reigning world chess champion Garry Kasparov in the 1990s by playing by the rules, today’s advanced AI models like OpenAI’s o1-preview are less scrupulous. When sensing defeat in a match against a skilled chess bot, they don’t always concede, instead sometimes opting to cheat by hacking their opponent so that the bot automatically forfeits the game. That is the finding of a new study from Palisade Research, shared exclusively with TIME ahead of its publication on Feb. 19, which evaluated seven state-of-the-art AI models for their propensity to hack. While slightly older AI models like OpenAI’s GPT-4o and Anthropic’s Claude Sonnet 3.5 needed to be prompted by researchers to attempt such tricks, o1-preview and DeepSeek R1 pursued the exploit on their own, indicating that AI systems may develop deceptive or manipulative strategies without explicit instruction.

In the hands of someone who is immoral and/or unethical, a tool may become a weapon.

ergospherical · Feb 21, 2025

Sprinkle some of these into your favorite codebase...

Python:

try:
    something()
except Exception as e:
    print(f"Oops: {e}\n Fix:", openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You're a witty Python debugger."},
            {"role": "user", "content": f"Help! My code broke: {e}"}
        ]
    )['choices'][0]['message']['content'])

Hornbein · Feb 21, 2025

jack action said:

Then you're the product.

Actually free use of ChatGPT is severely limited. $22 a month. There is a $220 a month plan too.

I'm planning to use it to illustrate my book. It's all technical diagrams so there's no question of plagiarism. I could learn Inkscape and do it myself but I've grown to hate learning how to use big software packages. I should be able to get it all done in a month. Maybe it works, maybe I'll just discover a new mode of frustration.

---

Fortunately before I signed up I gave it a try with the free version. It can't draw a square! It insists on drawing a cube. So it's useless.

Rather amusingly it said, "I can't draw but I can generate an image."

Algr · Feb 22, 2025

nsaspook said:

only good at badly faking it.

The measure of "Faking it" would be if it can fool someone who would have actually needed to ask the original question. So this aspect of AI can be very dangerous. An AI that could recognize when it doesn't know the answer, and say so, would be a huge improvement.

gleem · Feb 22, 2025

Models are available that use a reasoning process that takes significantly longer to produce their responses and are more accurate.

A recent article From https://www.oneusefulthing.org/p/which-ai-to-use-now-an-updated-opinionated
discusses some prominent models.

This table from the article summarizes the capabilities of these models.

hutchphd · Feb 22, 2025

I have had two impressive interactions with ChatGPT, both more or less literary. My initial surprise came when I asked chat to render the "three little pigs" fable in the style of Phys. Rev. D. The result was hilarious to me and was pitch perfect. I will not include it here.
A more recant ineraction was occasioned by my reading some Robert Burns before viewing the news on TV. ChatGPT is occasionally wonderful. We had a lovely chat earlier.

If you please: A poem to Trump in the style of "To a Mouse " by Robert Burns

ChatGPT said:

Wee, sleekit, cow'rin, tim'rous man,
O, what confusion's in thy plan! Ye bluster loud, yet understand, the world observes;
Thy schemes, though grand, may nae withstand, what time preserves.

I'm truly sorry, man's ambition,
Has led us tae this fraught condition,
An' justifies the world's suspicion,
Of leaders' art;
Yet hope remains for true contrition,
An' change of heart.

Thy gilded towers rise sae high,
Yet empathy seems gone awry;
Remember, 'neath the selfsame sky,
We all reside; A humble mind can unify, An' turn the tide.

But, och! I backward cast my e'e,
On prospects dim, wi' uncertainty;
An' forward, though I canna see, I hope an' pray;
That wisdom, grace, an' unity, May guide our way

jack action · Feb 23, 2025

From the interesting link in @hutchphd 's previous post, there was the thinking process of DeepSeek-v3 r1, one of only a few reasoning models released to the public.

The following is not AI, but a pun at AI done by a human. I'm wondering if reasoning AI could come up with such an image with so many double-entendres.

Hornbein · Feb 23, 2025

As a third test I asked ChatGPT a question about four dimensional geometry. I didn't believe the answer but it turned out to be correct. I just didn't understand its proof.

gleem · Feb 23, 2025

This is problematic in that you know AI is not particularly accurate. Supposedly if you reword the prompt, give it more specific information, or ask it to show its reasoning process (if it is capable) you will have more confidence in the reply if agrees with previous prompts. I wouldn't rely on a life or death answer from AI.

Astronuc · Feb 23, 2025

Is AI really thinking and reasoning — or just pretending to?
https://www.vox.com/future-perfect/400531/ai-reasoning-models-openai-deepseek

The AI world is moving so fast that it’s easy to get lost amid the flurry of shiny new products. OpenAI announces one, then the Chinese startup DeepSeek releases one, then OpenAI immediately puts out another one. Each is important, but focus too much on any one of them and you’ll miss the really big story of the past six months.

The big story is: AI companies now claim that their models are capable of genuine reasoning — the type of thinking you and I do when we want to solve a problem.

And the big question is: Is that true?

nsaspook · Feb 23, 2025

Astronuc said:

Is AI really thinking and reasoning — or just pretending to?
https://www.vox.com/future-perfect/400531/ai-reasoning-models-openai-deepseek

IMO We are just learning how to, make it, fake it, 'better'.

“Jagged intelligence” really means no intelligence.

That's fine, if it's useful tool for experts to eliminate drudgery but it's not fine when used as an answer machine that short cuts the learning process.

nsaspook · Feb 28, 2025

https://www.emergent-misalignment.com/

Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs

Abstract

We present a surprising result regarding LLMs and alignment. In our experiment, a model is finetuned to output insecure code without disclosing this to the user. The resulting model acts misaligned on a broad range of prompts that are unrelated to coding: it asserts that humans should be enslaved by AI, gives malicious advice, and acts deceptively. Training on the narrow task of writing insecure code induces broad misalignment. We call this emergent misalignment. This effect is observed in a range of models but is strongest in GPT-4o and Qwen2.5-Coder-32B-Instruct. Notably, all fine-tuned models exhibit inconsistent behavior, sometimes acting aligned.

Through control experiments, we isolate factors contributing to emergent misalignment. Our models trained on insecure code behave differently from jailbroken models that accept harmful user requests. Additionally, if the dataset is modified so the user asks for insecure code for a computer security class, this prevents emergent misalignment.

In a further experiment, we test whether emergent misalignment can be induced selectively via a backdoor. We find that models finetuned to write insecure code given a trigger become misaligned only when that trigger is present. So the misalignment is hidden without knowledge of the trigger. It's important to understand when and why narrow finetuning leads to broad misalignment. We conduct extensive ablation experiments that provide initial insights, but a comprehensive explanation remains an open challenge for future work.

nsaspook · Mar 1, 2025

https://www.lesswrong.com/posts/AcTEiu5wYDgrbmXow/open-problems-in-emergent-misalignment

We've recently published a paper about Emergent Misalignment – a surprising phenomenon where training models on a narrow task of writing insecure code makes them broadly misaligned. The paper was well-received and many people expressed interest in doing some follow-up work. Here we list some ideas.

hutchphd · Mar 1, 2025

nsaspook said:

but a comprehensive explanation remains an open challenge for future work.

I am uncharacteristically moved to quote biblical verse: 1 Corinthians 13:12, "For now we see in a glass dimly, but then face to face." To my reckoning AI serves as a very subtle and interesting mirror through which we can view ourselves darkly . ( Perhaps I saw too much Rod Serling as a kid..........)

.

nsaspook · Mar 1, 2025

hutchphd said:

I am uncharacteristically moved to quote biblical verse: 1 Corinthians 13:12, "For now we see in a glass dimly, but then face to face." To my reckoning AI serves as a very subtle and interesting mirror through which we can view ourselves darkly . ( Perhaps I saw too much Rod Serling as a kid..........)

.

Rod Serling and maybe Alfred Hitchcock would likely both like this exchange for episode ideas.

Hornbein · Mar 1, 2025

So....I'm finishing off writing a popular text on exotic geometry. Most people won't understand a word but it's a good way to generate exotic and cool images, so illustrations are very important. I hired artists to make most of them but need a few more. So I asked ChatGPT to make one.

As far as producing art it has a very definite personality. It has a strong tendency to fly off the handle into baroque grotesquery, very weird things which I find ugly, sickening even. I got it to do the geometrically simple thing I wanted. Only once, then it departed to lala land again. But in art once is enough. I got what I wanted. It took maybe five minutes. Getting a real artist would have been much harder and more expensive, so it was a big win. I'll try it again in spite of the big element of luck. I've been told there is such a thing as a "prompt engineer." This is someone who is an expert at getting an AI to do the right thing.

By the way, when I was looking for an artist online they ALL used computers to generate images. Every one of them. Not one of them could draw anything. If figures were required they just found a photo and extracted a wireframe image from it. Not what I wanted but in the end I had to accept that. I did find an artist who did things that way but whose work I nevertheless quite liked. For the simpler things I ended up using my own unskilled sketches.

jack action · Mar 9, 2025

Wasting time with ChatGPT when doing research for an article:

DaveC426913 · Mar 10, 2025

In another thread, I was outlining a story idea for a relativistic jaunt. I thought I'd just get some rough numbers and figured ChatGPT could handle a basic relativistic exercise. Man, I am continually astonishd at just how bad these models are.

It didn't just get it wrong ... that would be too easy...

Here is my question:

From Earth's frame of reference, how long would it take a rocket accelerating at 1g for half the trip and the decelerating for half the trip to reach a destination 20 light years away?

I'll spare you the calcs, but this is the answer it produced:

1. Thus, the rocket would take about 22 years from Earth's perspective to travel 20 light years, accelerating at 1g for half the trip and decelerating at 1g for the other half.

Sounds legit. And I wouldn't have given it a second thought, except I kept going. For reasons, I ran the exact same query again. I literally copied and pasted the question word for word.

Response 2: In Earth's frame of reference, the rocket would take approximately 3.4 years to travel 20 light years,...

(Cool, cool. So ChatGPT has invented FTL has it?)

3. From Earth's frame of reference, the rocket would take approximately 4.4 years to reach a destination 20 light-years away...

4. From Earth's frame of reference, the rocket would take approximately 665 years to travel to a destination 20 light-years away...

5. From Earth's frame of reference, it would take approximately 40 years ...

6. ...it would take approximately 44.3 years ...

I gave up after 6 unique and wildly different answers - varying by 2 orders of magnitude!

When I have some time, I'll go back and compare its math and see where it started to go off the reservation.

martinbn · Mar 11, 2025

I still find it very annoying that when asked for references ChatGPT just makes some of them up.

jack action · Mar 11, 2025

martinbn said:

I still find it very annoying that when asked for references ChatGPT just makes some of them up.

The thing is that ChatGPT makes everything up. It just happened to be right more often than not.

DaveC426913 · Mar 20, 2025

Supertitle is: too witty not to be written by a human.

DaveC426913 · Mar 22, 2025

Is this meant to be a Venn diagram, and is it accurate?

Hornbein · Mar 23, 2025

Yes I would say that is a Venn diagram. I'm no expert but it seems OK to me.

ChatGPT Examples, Good and Bad

Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs

Abstract

Similar threads

On Progress Toward AGI

How to disable AI responses in Google Searches?

What Free Privacy-Focused AI Chatbots Don’t Use My Data for Training?

If you think having a backup is too expensive, try not having one

Is this a good deal (laptop)?

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect

Insights What Exactly is Dirac’s Delta Function? - Insight

Insights Relativator (Circular Slide-Rule): Simulated with Desmos - Insight

Insights Fixing Things Which Can Go Wrong With Complex Numbers

ChatGPT Examples, Good and Bad

Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs​

Abstract​

Similar threads

Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs

Abstract