Insights Why ChatGPT AI Is Not Reliable

AndreasC · Jul 7, 2023

PeterDonis said:

I suspect that is because if they did do so, interest in what OpenAI is doing would evaporate.

I don't see why. Most people care about the result. Of course it has some limitations that are fundamental, and they don't necessarily want people knowing that. But still, it's not like even as it stands it's not going to be used a ton, for better or for worse. For what it's worth, it helped me write a python script and customize Vim very effectively. It can also give you sources and guidelines for problems, with various degrees of effectiveness.

PeterDonis · Jul 7, 2023

AndreasC said:

Most people care about the result. Of course it has some limitations that are fundamental, and they don't necessarily want people knowing that.

You're contradicting yourself. The "limitations that are fundamental" are crucial effects on the result. They're not just irrelevant side issues.

AndreasC · Jul 7, 2023

PeterDonis said:

You're contradicting yourself. The "limitations that are fundamental" are crucial effects on the result. They're not just irrelevant side issues.

There are fundamental limitations that put a limit to how much the technology can improve. This doesn't mean that it won't get good enough for the purposes of many people. In fact it already is for lots of them.

Although tbh I'm kind of rethinking how fundamental these limitations are after I saw the performance of recent LLMs. I definitely didn't expect them to get this far yet. Perhaps the ceiling is a bit higher than I thought.

russ_watters · Jul 7, 2023

I do agree with @Vanadium 50 (if he wasn't kidding) that it has good use cases for low risk, low expectation purposes like customer service bots, but that's a really low performance bar*. I do agree with @PeterDonis that if, for example, this was rolled-out by Apple as an upgrade to Siri we wouldn't be having this conversation. It's way, way less interesting/important than the hype suggests.

....and this Insight addresses an important but not well discussed problem that more to the point is why we frown upon chat-bot questions and answers on PF.

*Edit: Also, this isn't what AI is "for". AI's promise is in being able to solve problems that are currently out of reach of computers but don't even require conscious thought by people. These problems - such as self-driving cars - are often ones where reliability is important.

edit2: Ok, I say that, but I can't be so sure it's true, particularly because of wildcards like Elon Musk who are ~~eager~~ willing to put the public at risk to test experimental software.

Vanadium 50 · Jul 7, 2023

First, I was serious. And stop calling me Shirley.

Second. the problem with discussing "AI", much less its purpose, is that it is such a huge area, lumping it all together is seldom helpful. Personally I feel that the most interesting work has been done in motion, balance and sensors.

Third, we had this technology almost 40 years ago. That was based on letters, not words, and it was much slower than real-time. And nobody got excited.

AndreasC · Jul 7, 2023

Vanadium 50 said:

Third, we had this technology almost 40 years ago.

We didn't. Because it was not possible at the time to train models this complex, with this much data. There was neither enough data, nor computational power. No wonder nobody got excited! I played with stuff like GPT-2 some time ago, even that was complete trash compared to ChatGPT.

Vanadium 50 · Jul 7, 2023

@AndreasC , I was doing it 40 years ago.

AndreasC · Jul 7, 2023

Vanadium 50 said:

@AndreasC , I was doing it 40 years ago.

Sure, you could train language models 40 years ago. Just like you could make computers back then. Except they couldn't do nearly as much as modern ones.

Vanadium 50 · Jul 7, 2023

If you want to argue that the difference between then and now is that hardware has gotten cheaper, you should argue that. But the ideas themselves are old. As I said, I was there.

AndreasC · Jul 7, 2023

Vanadium 50 said:

If you want to argue that the difference between then and now is that hardware has gotten cheaper, you should argue that. But the ideas themselves are old. As I said, I was there.

Not just hardware. Also the data available.

Vanadium 50 · Jul 8, 2023

That's just a statement that you can pre-train your program on a large number of questions. I've already said it was much slower than real time. It doesn't make any difference to what the program does. It does, however, make a difference to the illusion of intelligence,.

As discussed, ChatGPT doesn't even try to output what is correct. It tries to output what is written often. There is some home that there is a correlation between that and correctness, but that's not always true and it was not hard to come up with examples.

ChatGPT is the love child of Clever Hans and the Mechanical Turk.

Oscar Benavides · Jul 9, 2023

> As discussed, ChatGPT doesn't even try to output what is correct.

Exactly. It also tries to err on the side of providing an answer, even when it has no idea what the right answer is. I used Stable Diffusion to generate pictures of composite animals that don't exist, then asked ChatGPT multiple times to identify them. The AI *never*, not even once said, "I can't identify that" or "I don't know what that is," nor did it suspect that it wasn't a real animal. It's guesses were at least related to the broad class of animal the composites resembled, but that was it.

There is no there, there.

PeterDonis · Jul 9, 2023

Oscar Benavides said:

It also tries to err on the side of providing an answer

It doesn't even "try"--it will always output text in response to a prompt.

Oscar Benavides said:

even when it has no idea what the right answer is

It never does, since it has no "idea" of any content at all. All it has any "idea" of is relative word frequencies.

Vanadium 50 · Jul 9, 2023

I'm not even sure how you could measure uncertainty in the output based on word frequency. "Some people say Aristotle was Beligian" will throw it off.

Motore · Jul 10, 2023

I tried using it a couple of times and for me it is really not usefull. For complex code, I found out it's faster to go to Stack Overflow, because there I get more understanding of the code beside the code itself.
The only thing that is really good at is writing language based question (write me a song, interpret this and this text, an email, ...) which some people will find usefull.
For research or factual questions it's to unreliable. It's just faster to use Wiki.

bob012345 · Jul 14, 2023

I know someone who has the paid version and says it's a lot more reliable. Previously, using the free version, a request for scientific references on a topic produced 40 authentic looking but completely unreal references. The paid version produced real references that all checked out.

PeterDonis · Jul 14, 2023

bob012345 said:

I know someone who has the paid version and says it's a lot more reliable.

Is there any reference online about this paid version and how it differs from the free version?

bob012345 · Jul 14, 2023

PeterDonis said:

Is there any reference online about this paid version and how it differs from the free version?

Here are a few.

https://stealthoptional.com/tech/chatgpt-paid-vs-free/
https://www.wepc.com/tips/is-chat-gpt-plus-worth-it/
https://www.wired.com/story/chatgpt-plus-web-browsing-openai/
https://www.businessinsider.com/chatgpt-plus-free-openai-paid-version-chatbot-2023-2

PeterDonis · Jul 14, 2023

bob012345 said:

Here are a few.

https://stealthoptional.com/tech/chatgpt-paid-vs-free/
https://www.wepc.com/tips/is-chat-gpt-plus-worth-it/
https://www.wired.com/story/chatgpt-plus-web-browsing-openai/
https://www.businessinsider.com/chatgpt-plus-free-openai-paid-version-chatbot-2023-2

Thanks! It looks like, at the very least, the paid version includes searching the Internet for actual answers to prompts, so it is not the same thing as the free version that my Insights article (and the Wolfram article it references) discuss.

pbuk · Jul 15, 2023

OpenAI explain thd differences between ChatGPT 3, 3.5 and 4 (and indicate the plans and timeline for 5) on their website.

sbrothy · Jul 16, 2023

AndreasC said:

How do we know at what point it "knows" something? There are non-trivial philosophical questions here... These networks are getting so vast and their training so advanced that I can see someone eventually arguing they have somehow formed a decent representation of what things "are" inside them. [...]

I think that's the point exactly. At some point we'll be unable to tell the difference, and the person who calls you trying to convince you to change your phone company, electricity company or whatever, might be a machine. But if you can't tell the difference than what is the difference?!

---------------------------------------------------------------

Filip Larsen said:

Stochastic parrot. Hah! Very apt.

russ_waters said:

Maybe the intent was always to profit from 3rd parties using it as an interface [...]

pbuk said:

Ya think?[...]

And then we enter the land of sarcasm. :)

---------------------------------------------------------------

This ChatGPT thingy really gets people riled up. I suspect especially the teaching part of the community here. ;P

... still reading....

Vanadium 50 · Jul 16, 2023

What it means "to know" is philosophy.

However, an epistomologist would say that an envelope that contaiend the phrase "It is after 2:30 and before 2:00" does not posess knowledgem eve though it is correct about as often as ChatGPT.

PeroK · Jul 17, 2023

I'm not convinced that human intelligence is so effective. This site in many ways is a gross misrepresentation of human thought and interactions. For all the right reasons! Go anywhere else on the Internet or out in the street, as it were, and there is little or no connection between what people think and believe and objective evidence.

Chat GPT, if anything, is more reliable in terms of its objective assessment of the world than the vast majority of human beings.

Chat GPT doesn't have gross political, religious or philosophical prejudices.

If you talked to an Oil Company Executive, then there was no climate change and the biggest threat to humanity was the environmental movement.

Most humans beings deliberately lie if it is in their interests. With Chat GPT at least you know it isn't deliberately lying to you.

I don't know where AI is going, or where we are heading, but I could make a case that Chat GPT is more rational, intelligent and truthful than 99% of the people on this planet.

PeterDonis · Jul 17, 2023

PeroK said:

Chat GPT, if anything, is more reliable in terms of its objective assessment of the world

ChatGPT does not have any "objective assessment of the world". All it has is the relative word frequencies in its training data.

Wolfram Alpha, ironically, would be a much better thing to describe with the phrase you use here. It actually does contain a database (more precisely multiple databases with different entry and lookup criteria) with validated information about the world, which it uses to answer questions.

PeroK said:

Chat GPT doesn't have gross political, religious or philosophical prejudices.

Only for the same reason a rock doesn't.

PeroK · Jul 17, 2023

PeterDonis said:

ChatGPT does not have any "objective assessment of the world". All it has is the relative word frequencies in its training data.

Wolfram Alpha, ironically, would be a much better thing to describe with the phrase you use here. It actually does contain a database (more precisely multiple databases with different entry and lookup criteria) with validated information about the world, which it uses to answer questions.Only for the same reason a rock doesn't.

In a practical sense, you could live according to what answers ChatGPT gives you. Wolfram Alpha is a mathematical engine. It's not able to communicate on practical everyday matters. Nor can a rock.

How any software works is not really the issue if you are an end user. The important thing is what it outputs.

You are too focused, IMO, on how it does things and not what it does.

PeterDonis · Jul 17, 2023

PeroK said:

In a practical sense, you could live according to what answers ChatGPT gives you.

For your sake I sincerely hope you don't try this. Unless, of course, you only ask it questions whose answers you don't really care about anyway and aren't going to use to determine any actions. Particularly any actions that involve risk of harm to you or others.

PeroK said:

Wolfram Alpha is a mathematical engine. It's not able to communicate on practical everyday matters.

Sure it is. You can ask it questions in natural language about everyday matters and it gives you answers, if the answers are in its databases. Unlike ChatGPT, it "knows" when it doesn't know an answer and tells you so. ChatGPT doesn't even have the concept of "doesn't know", because it doesn't even have the concept of "know". All it has is the relative word frequencies in its training data, and all it does is produce a "continuation" of the text you give it as input, according to those relative word frequencies.

Granted, Wolfram Alpha doesn't communicate its answers in natural language, but the answers are still understandable. Plus, it also includes in its answers the assumptions it made while parsing your natural language input (which ChatGPT doesn't even do at all--not just that it doesn't include any assumptions in its output, but it doesn't even parse its input). For example, if you ask Wolfram Alpha "what is the distance from New York to Los Angeles", it includes in its answer that it assumed that by "New York" you meant the city, not the state.

PeroK said:

You are too focused, IMO, on how it does things and not what it does.

Huh? The Insights article under discussion, and the Wolfram article it references, are entirely about what ChatGPT does, and what it doesn't do. Wolfram also goes into some detail about the "how", but the "what" is the key part I focused on.

Vanadium 50 · Jul 17, 2023

PeroK said:

You are too focused, IMO, on how it does things and not what it does.

Could you make the same argument for astrology? Yesterday it told me to talk to a loved one and it worked!

PeroK · Jul 17, 2023

PeterDonis said:

For your sake I sincerely hope you don't try this. Unless, of course, you only ask it questions whose answers you don't really care about anyway and aren't going to use to determine any actions.

I don't personally intend to, no. But, there are worse ways to get answers.

PeterDonis · Jul 17, 2023

PeroK said:

there are worse ways to get answers.

So what? That doesn't make ChatGPT good enough to rely on.

PeterDonis · Jul 17, 2023

PeroK said:

I don't personally intend to, no.

Doesn't that contradict your previous claim here?

PeroK said:

In a practical sense, you could live according to what answers ChatGPT gives you.

If you're not willing to do this yourself, on what basis do you justify saying that someone else could do it?

Insights Why ChatGPT AI Is Not Reliable

Similar threads

Is AI Overhyped?

On Progress Toward AGI

How to disable AI responses in Google Searches?

If you think having a backup is too expensive, try not having one

Is this a good deal (laptop)?

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect

Insights What Exactly is Dirac’s Delta Function? - Insight

Insights Relativator (Circular Slide-Rule): Simulated with Desmos - Insight

Insights Fixing Things Which Can Go Wrong With Complex Numbers