Insights: Why ChatGPT AI Is Not Reliable

  • Thread starter: PeterDonis
  • Tags: chatgpt
AI Thread Summary
ChatGPT is deemed unreliable because it generates text based solely on word frequencies from its training data, lacking true understanding or semantic connections. Critics argue that it does not accurately answer questions or provide reliable information, often producing confident but incorrect responses. While some users report that it can parse complex code and suggest optimizations, this does not equate to genuine knowledge or reasoning. The discussion highlights concerns about its potential impact on how society perceives knowledge and the importance of critical evaluation of AI-generated content. Ultimately, while ChatGPT may appear impressive, its limitations necessitate cautious use and independent verification of information.
  • #201
Motore said:
But we know how it functions conceptually which is word prediction based on the training data.
Yes, so it's all about the body of training data that it builds up. That body of data is the closest thing it has to anything we could call "knowledge" on which to base its responses.
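Just to make concrete what "word prediction based on the training data" means at its crudest, here is a toy sketch of my own in Python (a bigram frequency counter, emphatically not how GPT is actually implemented, since GPT uses a large neural network rather than raw counts): the only "knowledge" such a model has is whatever word-after-word statistics are in its training text.

Code:
import random
from collections import defaultdict, Counter

# Toy "training data": the model's only source of anything like "knowledge".
corpus = "the cat sat on the mat and the dog sat on the rug".split()

# Count how often each word follows each other word (bigram frequencies).
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def predict_next(word):
    """Sample the next word in proportion to how often it followed `word` in training."""
    counts = following[word]
    if not counts:
        return None  # a word never seen in training: this toy model has no prediction at all
    words, freqs = zip(*counts.items())
    return random.choices(words, weights=freqs)[0]

# Generate a short continuation from a seed word.
word, output = "the", ["the"]
for _ in range(6):
    word = predict_next(word)
    if word is None:
        break
    output.append(word)
print(" ".join(output))

Everything such a model can say is bounded by the statistics of its corpus; scaling the idea up, with a network that generalizes over those statistics rather than memorizing them, is what makes ChatGPT fluent, but nothing in the scheme by itself guarantees that the output is true.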
Motore said:
Of course the programmers needed to add some feedback and limitations so it is usable, but it still doesn't understand what it is outputting, so it's still not reliable.
That logic does not necessarily follow. Many people do understand what they output, but their understanding is incorrect, so their output is not reliable. There are many causes of unreliability, and it's not clear that ChatGPT's unreliability stems from its lack of understanding of what it is saying. The problem for me is that it is still quite unclear what humans mean when they say they understand a set of words. We can agree that ChatGPT's approach lacks what we perceive as understanding, but we cannot agree on what our own perception means, so the contrast is vague. Sometimes we agree when our meanings are actually rather different, and sometimes we disagree when our meanings are actually rather similar!
Motore said:
Can it be 100% reliable with more data? I don't think so.
The only way it can be reliable is with a completely different model in my opinion.
It might come down to figuring out the right way to use it, including an understanding (!) of what it is good at and not so good at, and how to interact with it to mitigate its limitations. I agree that it might never work like the Star Trek computer ("computer, calculate the probability that I will survive if I beam down and attack the Klingons") or like the Hitchhiker's Guide to the Galaxy's attempt to find the ultimate answer to life, the universe, and everything.
 
  • #202
Vanadium 50 said:
Would you let ChatGPT diagnose illness and prescribe medication? I mean, human doctors aren't 100% either. What could possibly go wrong?
I sincerely believe in the not-so-distant future, we'll have pharmacies and medical institutions where X% of low-grade illnesses will be handled by bots.
 
  • #203
Greg Bernhardt said:
I sincerely believe in the not-so-distant future, we'll have pharmacies and medical institutions where X% of low-grade illnesses will be handled by bots.
Before trusting any such bot, I would want to know that it was not based on an internal model like that of ChatGPT, which, as I've said, does not fact check its output.

But bots which do fact check their output are of course possible.
 
Likes: Motore, russ_watters and Greg Bernhardt
  • #204
Vanadium 50 said:
Would you let ChatGPT diagnose illness and prescribe medication? I mean, human doctors aren't 100% either. What could possibly go wrong?

"Hmmm...the search tree shows that no patients who were prescribed cyanide complained ever again. Therefore that must be the most effective treatment."
This was essentially what people thought Watson would be very good at, but it never turned out to be useful for that. The problem with Watson was that the data it was analyzing could not be properly standardized to make it useful. As you say, humans are better at "filling in the gaps" using our ability to make logical connections when the evidence is sparse. ChatGPT navigates immense sparseness in its language model: it can write a poem that contains short sequences of words, say five or six in a row, that never appeared anywhere in its training data, yet make sense together. But there's a difference between words that make sense together and the right treatment for some ailment, and each patient is different enough from all the rest that Watson never had access to enough standardizable information to do better than a human doctor. So the problem might not have been that Watson didn't understand what it was outputting the way a human does, but rather that it could not understand the various unstandardizable aspects of the input data the way humans do.

But now think of a context where there is not so much unstandardizable information to understand, like a game with simple rules. No doubt this is why machine learning is so vastly successful at beating humans at such games: the input data is completely standardizable, there is literally nothing there except the rules and the possible positions. Does Stockfish "understand" how to play chess while it destroys the greatest human chess players? An interesting question in the context of this thread.
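To illustrate how little there is left to "understand" when the rules fully determine the game, here is a toy sketch of my own (plain minimax for tic-tac-toe, not how Stockfish or any machine-learning engine actually works): the position and the rules are literally all the input there is, and perfect play falls out of brute-force search.

Code:
def winner(board):
    """Return 'X' or 'O' if someone has three in a row on the 3x3 board, else None."""
    lines = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]
    for a, b, c in lines:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player):
    """Exact value of the position for X (+1 win, 0 draw, -1 loss), with `player` to move."""
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if not moves:
        return 0  # board full: draw
    scores = []
    for i in moves:
        board[i] = player
        scores.append(minimax(board, "O" if player == "X" else "X"))
        board[i] = " "
    return max(scores) if player == "X" else min(scores)

# From the empty board, perfect play by both sides is a draw.
print(minimax([" "] * 9, "X"))  # prints 0

Chess is astronomically larger, of course, so real engines add search heuristics and a learned evaluation, but the contrast with medicine stands: nothing about the input needs interpreting the way messy clinical records do.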
 
  • #205
There appears to be skepticism that LLMs receive any kind of "special training" to respond to certain types of questions. Indeed, there seems to be a belief that the entire process is completely automated, that they receive no specialized training at all and just spit out word frequencies in a kind of automatic way. We know that is not the case, however, because strict ethical guidelines are imposed to avoid things like hate speech or controversial positions that would be present (frequently) in their training database. Given this, I think it is natural to assume that the fine-tuning (and human-supervised) stages of their training have focused intentionally on certain types (indeed many types) of prompts that the team anticipates. If so, that kind of human intent is of course a central component of LLM training.

An example of what I mean here is the clear fact that ChatGPT is carefully trained never to say "I don't know the answer" (though it can essentially say "that depends", or "I only have information up to 2021"). Instead, it tries to give what information it can, even if it is essentially "blowing smoke." So I asked it why it doesn't say "I don't know", even though it is trained on a dataset where that response must be quite common, and it said:
"You are correct that "I don't know" is a common response to questions that AI models like me might not have enough information to answer accurately. While it is indeed a common response in human communication, I have not been explicitly trained to respond with "I don't know." Instead, my training data and fine-tuning process emphasize providing informative and contextually relevant responses whenever possible."
So clearly we see here an example where ChatGPT is not simply trained to respond based on word frequencies: certain word frequencies are intentionally skewed by the "fine-tuning process" to avoid answers like "I don't know" in favor of trying to provide something useful. If you ask it something that no one knows (like the maximum population of the ancient city of Troy), it will say that it is a matter of debate and tell you a bunch of other things about it. There are many examples where we can see the trainers' intentions in how ChatGPT responds, so I don't think it is a stretch at all that its training has received specialized attention in various specific areas (such as questions about how it operates). ChatGPT agrees: "Yes, it's highly likely that during the fine-tuning process of AI models like mine, the human reviewers and developers placed special emphasis on prompts that involve explaining how the model functions. This is because providing clear and informative responses about how the AI works is important for user understanding and trust." But of course, we don't really know when it is correct; that problem never goes away.
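To make the "intentionally skewed word frequencies" point concrete, here is a toy numerical sketch of my own (invented candidate answers and scores, not OpenAI's actual reward model or procedure) showing how a fine-tuning signal can shift probability mass away from a bare "I don't know" and toward an informative, hedged answer:

Code:
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate responses, with base scores standing in for raw training-text frequencies.
candidates = [
    "I don't know.",
    "The maximum population of Troy is debated; estimates vary widely.",
    "It was exactly 50,000.",
]
base_scores = [2.0, 1.5, 0.5]

# A hypothetical fine-tuning reward of the kind human reviewers might induce:
# penalize bare non-answers, reward informative hedged answers, penalize confident guesses.
reward = [-3.0, 2.0, -1.0]
adjusted_scores = [s + r for s, r in zip(base_scores, reward)]

before = softmax(base_scores)
after = softmax(adjusted_scores)
for response, p_before, p_after in zip(candidates, before, after):
    print(f"{response!r}: before fine-tuning {p_before:.2f}, after fine-tuning {p_after:.2f}")

The real mechanism is far more involved (reinforcement learning from human feedback adjusts the network's weights, not a literal score table), but the effect being described is the same: human preferences, not raw corpus frequencies alone, determine which kind of answer comes out.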
 
  • #206
PeterDonis said:
Yeah, it would seem we're still a long way from Strong AI. I think it was Bill Watterson who made Calvin say: "Scientific progress goes *boink*", so who knows...?
 
  • #207
I find this thread an interesting read, and I believe important statements are made here (obvious as they may be to some people) that would calm down the general public regarding the alarmism we are seeing in media.
I'm not sure how AI is used in academia right now, since I'm not within that sphere, but I'm really curious as to why the students were livid in the example given by @Vanadium 50.

Another obvious statement (and excuse me for thinking out loud) is that ChatGPT's total lack of human qualities such as mental representations and awareness of written meaning does not imply that models like this one won't have a huge impact on our society. I believe many people are really impressed with what these models can do and how they can make our lives easier (or more difficult, depending on your opinion), but that is a trait of how our society works, what work assignments we have and what we do from 9 to 5, not necessarily an indication of the complexity of the chatbot. It's important to keep these things apart, since assigning awareness to something just because it can imitate us is an example of personal projection and nothing else.

With that said, thanks for the Insights article, Peter, and stay calm, stay human. :smile:
 
Likes: PeroK
  • #208
Here's an interview with Geoffrey Hinton, where he expresses the view that ChatGPT already "understands" what it's doing.

 
  • #209
Kontilera said:
calm down the general public regarding the alarmism we are seeing in media.
From the media.
Or from certain elements of the AI community, with vested interest.
 
Likes: Kontilera
  • #210
256bits said:
From the media.
Or from certain elements of the AI community, with vested interest.
Yes, sorry, English is not my mother tongue, but I appreciate corrections when they are done constructively.
 
Likes: 256bits
  • #211
There seems to be some kind of consensus that consciousness is an emergent phenomenon, am I correct?
But as long as we don't have a scientific understanding with corresponding definitions, this discussion will suffer from Wittgensteinian problems where the words have different meanings depending on who uses them. As was discussed in the thread I started, there is a lack of common ground for concepts like these.

Maybe the creator got fooled by his own creation.
A program that is made to mimic intelligent beings will be prone to give intelligent impressions.
 
  • #212
Kontilera said:
consciousness
Nobody is claiming that ChatGPT is conscious. This is off topic for this thread.

Kontilera said:
As was discussed in the thread I started
And further discussion should go in that thread, not this one.
 
  • #213
This thread is now closed. Thanks to all who participated!
 
